A framework for feature selection through boosting

Abstract
As the dimensionality of datasets used in predictive modelling continues to grow, feature selection becomes increasingly important. Datasets with complex feature interactions and high levels of redundancy still pose a challenge to existing feature selection methods. We propose a novel feature selection framework that relies on boosting, i.e. sample re-weighting, to select sets of informative features for classification problems. The method builds on the feature rankings derived from fast and scalable tree-boosting models such as XGBoost. We compare the proposed method with standard feature selection algorithms on nine benchmark datasets and show that it achieves higher accuracy with fewer features on most of the tested datasets, and that the selected features have lower redundancy.
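The core idea described in the abstract — using boosting's sample re-weighting to accumulate per-feature importance scores — can be sketched in a simplified, purely illustrative form. The sketch below substitutes AdaBoost-style decision stumps for the paper's XGBoost-derived rankings; the function names and the scoring rule are assumptions for illustration, not the authors' actual algorithm.

```python
import math

def fit_stump(X, y, w):
    """Pick the (feature, threshold, polarity) decision stump with the
    lowest weighted error on samples X and labels y in {-1, +1}."""
    d = len(X[0])
    best = (0, 0.0, 1, float("inf"))  # (feature, threshold, polarity, error)
    for j in range(d):
        for t in sorted(set(x[j] for x in X)):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol if xi[j] > t else -pol) != yi)
                if err < best[3]:
                    best = (j, t, pol, err)
    return best

def boosted_feature_ranking(X, y, rounds=10):
    """AdaBoost-style loop: each round fits a stump on re-weighted samples
    and credits that stump's feature with the stump's vote weight alpha,
    yielding an importance score per feature."""
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n
    scores = [0.0] * d
    for _ in range(rounds):
        j, t, pol, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard the log below
        alpha = 0.5 * math.log((1 - err) / err)
        scores[j] += alpha
        preds = [(pol if x[j] > t else -pol) for x in X]
        # Up-weight the samples this stump misclassified, then renormalise.
        w = [wi * math.exp(-alpha * yi * pi) for wi, yi, pi in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return scores
```

A selection step would then keep the top-k features by score; in the paper's setting, the stump-based scores would be replaced by rankings from a tree-boosting model such as XGBoost.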
| Field | Value |
|---|---|
| Original language | English |
| Article number | 115895 |
| Journal | Expert Systems with Applications |
| Volume | 187 |
| Early online date | 16 Sept 2021 |
| DOIs | |
| Publication status | Published - Jan 2022 |
Keywords
- Feature selection
- Boosting
- Ensemble learning
- XGBoost