
Training ensembles using max-entropy error diversity

Gary F. Holness, Manning College of Information & Computer Sciences
Paul E. Utgoff, Manning College of Information & Computer Sciences

Abstract

Ensembles provide a powerful method for improving the performance of automated classifiers by constructing piecewise models that combine the hypotheses of individual component classifiers. The combined output of the component classifiers is better able to fit the complex decision boundaries that arise in data sets where class boundaries overlap and class exemplars are dispersed in feature space. A key ingredient in ensemble classifier induction is error diversity among component classifiers. Work in the ensemble literature suggests that ensemble construction should favor diversity even at some expense to individual classifier performance. To make such tradeoffs, a component classifier inducer requires knowledge of the choices made by its peers in the ensemble. In this work, we present a method called MaxEnt-DiSCO that trains component classifiers collectively, using entropy as a measure of error diversity. Within the maximum entropy framework, component classifiers share information on instance selection during training, so that they are trained jointly and their errors are maximally diverse. Experiments demonstrate the utility of our approach on data sets where the classes have a moderate degree of overlap. © 2009 American Institute of Physics.
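The abstract's central quantity — entropy as a measure of how an ensemble's errors are distributed across its component classifiers — can be illustrated with a minimal sketch. This is not the paper's MaxEnt-DiSCO training procedure; it is a hypothetical helper (`error_diversity_entropy` is an assumed name) showing only the intuition that entropy is maximal when errors are spread evenly across classifiers and zero when one classifier absorbs them all.

```python
import numpy as np

def error_diversity_entropy(errors):
    """Shannon entropy (bits) of the distribution of errors over classifiers.

    errors: boolean array of shape (n_instances, n_classifiers), where
    errors[i, j] is True when classifier j misclassifies instance i.
    """
    per_clf = errors.sum(axis=0).astype(float)  # error count per classifier
    total = per_clf.sum()
    if total == 0:
        return 0.0  # no errors at all: entropy is defined as zero here
    p = per_clf / total          # fraction of total errors per classifier
    p = p[p > 0]                 # drop zero terms (0 * log 0 -> 0)
    return float(-(p * np.log2(p)).sum())

# Two 3-classifier ensembles making the same total number of errors:
concentrated = np.array([[1, 0, 0]] * 6, dtype=bool)               # one classifier errs
spread = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]] * 2, dtype=bool)  # errors spread evenly

print(error_diversity_entropy(concentrated))  # 0.0
print(error_diversity_entropy(spread))        # log2(3), about 1.585
```

Under this measure, an ensemble inducer that trades a little individual accuracy for a higher-entropy error distribution — as the abstract suggests — is pushing errors toward the evenly spread case rather than letting them concentrate on a few components.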