Article code: 471579
Journal code: 698645
Publication year: 2012
English article: 18 pages, PDF
Full-text version: free download
English title of the ISI article
Boosting-based ensemble learning with penalty profiles for automatic Thai unknown word recognition
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

Boosting-based ensemble learning can improve classification accuracy by using multiple classification models, each constructed to cope with the errors of its preceding steps. This paper proposes a method to improve boosting-based ensemble learning with penalty profiles, applied to automatic unknown word recognition in the Thai language. Because it treats a sequential problem as a non-sequential one, unknown word recognition must include a process that ranks the set of candidates generated for a potential unknown word position. To strengthen the recognition process with ensemble classification, penalty profiles are defined so that each succeeding classification model can be constructed more efficiently and tends to re-rank the ranked candidates into a suitable order. As an evaluation, a number of alternative penalty profiles are introduced and their performance is compared on the task of extracting unknown words from a large Thai medical text. Using Naïve Bayes as the base classifier for ensemble learning, the proposed method with the best setting achieves an accuracy of 90.19%, exceeding conventional Naïve Bayes, the non-ensemble version, and the flat-penalty profile by 12.88, 10.59, and 6.05, respectively.
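
The abstract does not spell out how a penalty profile enters the boosting loop, so the following is only a minimal sketch of one plausible reading: after a base classifier has ranked the candidates for each unknown-word position, positions whose correct candidate was ranked low receive a larger training weight in the next round. The function names, the two example profiles, and the multiplicative `1 + penalty` update are all assumptions made for illustration, not the paper's definitions.

```python
from typing import Callable, List, Sequence, Tuple

# One ranked position: (scores of its candidates, index of the correct candidate).
Position = Tuple[Sequence[float], int]


def flat_profile(rank: int) -> float:
    """Hypothetical flat profile: the same penalty for any misranked position."""
    return 0.0 if rank == 1 else 1.0


def linear_profile(rank: int) -> float:
    """Hypothetical rank-sensitive profile: penalty grows with distance from rank 1."""
    return float(rank - 1)


def rank_of_correct(scores: Sequence[float], correct_index: int) -> int:
    """Return the 1-based rank of the correct candidate under descending score."""
    target = scores[correct_index]
    return 1 + sum(1 for s in scores if s > target)


def update_weights(
    weights: Sequence[float],
    positions: Sequence[Position],
    profile: Callable[[int], float],
) -> List[float]:
    """Scale each position's weight by (1 + penalty), then renormalize to sum to 1.

    The multiplicative update is an assumption chosen for simplicity; it is not
    the exponential update of AdaBoost and not necessarily the paper's rule.
    """
    raised = [
        w * (1.0 + profile(rank_of_correct(scores, correct)))
        for w, (scores, correct) in zip(weights, positions)
    ]
    total = sum(raised)
    return [w / total for w in raised]


# Two positions: the first is ranked correctly, the second has its correct
# candidate in third place.
example_positions = [([0.9, 0.1, 0.0], 0), ([0.5, 0.3, 0.2], 2)]
flat_weights = update_weights([0.5, 0.5], example_positions, flat_profile)
linear_weights = update_weights([0.5, 0.5], example_positions, linear_profile)
```

Under these assumptions, the flat profile treats every misranked position alike, while the rank-sensitive profile pushes more weight toward positions where the correct candidate fell further down the list, which is the kind of contrast the abstract draws between the flat-penalty profile and its alternatives.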

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Mathematics with Applications - Volume 63, Issue 6, March 2012, Pages 1117–1134