Survey of Boosting from an Optimization Perspective
by Manfred K. Warmuth, Machine Learning Summer School at Purdue, 2011.

Boosting has become a well-known ensemble method. The algorithm maintains a distribution on the binary-labeled examples, and in each iteration a new base learner is added in a greedy fashion. The goal is to obtain a small linear combination of base learners that clearly separates the examples. We focus on a recent view of Boosting in which the update of the distribution on the examples is characterized by a minimization problem that uses a relative entropy as a regularizer.
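To make this view concrete, the exponential update used by AdaBoost can be derived as a relative entropy projection: the new distribution stays as close as possible (in relative entropy) to the old one, subject to the constraint that the base learner just added has zero edge. This is a standard formulation from the entropy-regularized boosting literature; the notation (d^t for the distribution, h_t for the base learner, y_n for the labels, Δ_N for the probability simplex) is ours, not the abstract's.

```latex
d^{t+1} \;=\; \operatorname*{argmin}_{d \in \Delta_N} \; \Delta(d, d^t)
\quad \text{s.t.} \quad \sum_{n=1}^{N} d_n \, y_n \, h_t(x_n) = 0,
\qquad \text{where} \quad
\Delta(d, \tilde{d}) \;=\; \sum_{n=1}^{N} d_n \ln \frac{d_n}{\tilde{d}_n}.
```

Solving this constrained problem with a Lagrange multiplier α_t yields the familiar multiplicative update d^{t+1}_n ∝ d^t_n e^{-α_t y_n h_t(x_n)}, which also appears in the code sketch after the next paragraph.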
The best-known boosting algorithm is AdaBoost. This algorithm approximately maximizes the hard margin when the data is separable. We focus on recent algorithms that provably maximize the soft margin when the data is noisy. We will teach the new algorithms, give a unified and versatile view of Boosting in terms of relative entropy regularization, and show how to solve large-scale problems based on state-of-the-art optimization techniques. We also discuss lower and upper bounds on the number of iterations required by any greedy boosting method, and propose a way to circumvent these lower bounds.
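As a concrete illustration of the greedy scheme described above, here is a minimal AdaBoost sketch with decision stumps as base learners. This is illustrative code under our own naming (adaboost, n_rounds), not an implementation from the lecture:

```python
import numpy as np

def adaboost(X, y, n_rounds=50):
    """Minimal AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    n, d = X.shape
    dist = np.full(n, 1.0 / n)               # distribution on the examples
    stumps, alphas = [], []
    for _ in range(n_rounds):
        # Greedy step: pick the stump (feature, threshold, sign) with the
        # smallest weighted error under the current distribution.
        best = None
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for s in (+1, -1):
                    pred = s * np.where(X[:, j] <= thr, 1, -1)
                    err = dist @ (pred != y)
                    if best is None or err < best[0]:
                        best = (err, j, thr, s)
        err, j, thr, s = best
        if err >= 0.5:                        # no stump beats random guessing
            break
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
        pred = s * np.where(X[:, j] <= thr, 1, -1)
        # Multiplicative update: the relative entropy projection of dist onto
        # the constraint that the stump just added has zero edge.
        dist = dist * np.exp(-alpha * y * pred)
        dist = dist / dist.sum()
        stumps.append((j, thr, s))
        alphas.append(alpha)

    def predict(Xq):
        score = sum(a * s * np.where(Xq[:, j] <= thr, 1, -1)
                    for a, (j, thr, s) in zip(alphas, stumps))
        return np.sign(score)

    return predict
```

On noisy data the hard margin that this scheme pursues is brittle; the soft-margin algorithms from this line of work (such as SoftBoost and Entropy Regularized LPBoost) additionally cap each weight d_n at 1/ν, which prevents the distribution from concentrating on a few noisy examples.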