Article ID: 569042
Journal: Speech Communication
Published Year: 2006
Pages: 23
File Type: PDF
Abstract

Large margin classifiers, such as SVMs and AdaBoost, have achieved state-of-the-art performance for semantic classification problems that occur in spoken language understanding or textual data mining applications. However, these computationally expensive learning algorithms cannot always handle the very large number of examples, features, and classes that are present in the available training corpora. This paper provides an original and unified presentation of these algorithms within the framework of regularized and large margin linear classifiers, reviews some available optimization techniques, and offers practical solutions to scaling issues. Systematic experiments compare the algorithms according to a number of criteria: performance, robustness, computational and memory requirements, and ease of parallelization. Furthermore, they confirm that the 1-vs-other multiclass scheme is a simple, generic, and easy-to-implement baseline with excellent scaling properties. Finally, the paper identifies the limitations of the classifiers and of the multiclass schemes that were implemented.
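The abstract itself contains no code, but as a rough illustration of the 1-vs-other multiclass baseline it refers to, the sketch below trains one regularized linear SVM per class with scikit-learn. The 20 Newsgroups dataset, the TF-IDF features, and the hyperparameters are illustrative assumptions, not details taken from the paper.

# Illustrative sketch (not from the paper): 1-vs-other multiclass scheme
# built from regularized large margin linear classifiers.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

# One binary linear SVM per class; each binary problem can be trained
# independently, which is what makes the scheme easy to parallelize.
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LinearSVC(C=1.0), n_jobs=-1),
)
model.fit(train.data, train.target)
print("test accuracy:", model.score(test.data, test.target))

The independence of the per-class binary problems is the property behind the scheme's scaling behavior noted in the abstract: training can be distributed across classes with no coordination beyond sharing the feature representation.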

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing
Authors