
The impact of ensemble diversity on learning big data in dynamic environments

14 pages. Published: October 25, 2019

Abstract

For many classification tasks, data is collected over an extended period of time, and the predictive model learns incrementally, adapting to changes in the underlying data distribution when necessary. Margin distribution is considered an important factor in optimizing generalization performance. Two major concerns that nonstationary learning poses for any algorithm are the rate of adaptation to new concepts and the volume of the data. When ensembles of classifiers are used to learn in nonstationary environments with drifting concepts, diversity becomes of paramount significance in optimizing the rate of adaptation to new concepts. In this paper, we investigate the impact of ensemble diversity on the rate of adaptation to new concepts in nonstationary learning. The rate of adaptation is analyzed by exploiting the correspondence between voting margins and the double fault measure, a popular diversity measure strongly linked to the margin. We use the Adaptive Classifier Ensemble Boost algorithm (AceBoost) to generate diverse base classifiers and optimize the margin distribution, exploiting different amounts of diversity to produce an optimal ensemble capable of handling different kinds of drift. The experimental results confirm that AceBoost outperforms other state-of-the-art algorithms that exploit ensemble diversity to handle concept drift.
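
The two quantities the abstract relates, voting margins and the double fault measure, can be illustrated with a minimal Python sketch. It is not AceBoost and is not taken from the paper. It assumes the common textbook definitions: the double fault of a classifier pair is the fraction of samples both misclassify, and the voting margin of a sample is the share of votes for the true class minus the largest share for any other class. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def double_fault(preds_a, preds_b, labels):
    """Fraction of samples that both classifiers misclassify (lower = more diverse)."""
    both_wrong = sum(
        1 for a, b, y in zip(preds_a, preds_b, labels) if a != y and b != y
    )
    return both_wrong / len(labels)


def mean_double_fault(all_preds, labels):
    """Average the pairwise double fault over every pair of ensemble members."""
    pairs = list(combinations(all_preds, 2))
    return sum(double_fault(a, b, labels) for a, b in pairs) / len(pairs)


def voting_margins(all_preds, labels):
    """Per-sample margin in [-1, 1]: (votes for true class - best wrong class) / T."""
    n_classifiers = len(all_preds)
    margins = []
    for i, y in enumerate(labels):
        votes = Counter(preds[i] for preds in all_preds)
        correct = votes.get(y, 0)
        best_wrong = max(
            (count for label, count in votes.items() if label != y), default=0
        )
        margins.append((correct - best_wrong) / n_classifiers)
    return margins
```

Under these definitions, a sample that two members both misclassify raises their double fault and removes two votes from the true class, which lowers that sample's voting margin. A negative margin means the majority vote gets the sample wrong.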

Keyphrases: concept drift, diversity, ensemble, margin, support vector machine

In: Kennedy Njenga (editor). Proceedings of 4th International Conference on the Internet, Cyber Security and Information Systems 2019, vol 12, pages 227--240

Links:
BibTeX entry
@inproceedings{ICICIS2019:impact_of_ensemble_diversity,
  author    = {Tinofirei Museba},
  title     = {The impact of ensemble diversity on learning big data in dynamic environments},
  booktitle = {Proceedings of 4th International Conference on the Internet, Cyber Security and Information Systems 2019},
  editor    = {Kennedy Njenga},
  series    = {Kalpa Publications in Computing},
  volume    = {12},
  pages     = {227--240},
  year      = {2019},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2515-1762},
  url       = {https://easychair.org/publications/paper/HtW2},
  doi       = {10.29007/6xtn}}