Article code: 385098 | Journal code: 660860 | Publication year: 2011 | English article: 13 pages | Full text: PDF
English title of the ISI article
A multiobjective simulated annealing approach for classifier ensemble: Named entity recognition in Indian languages as case studies
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
Abstract

In this paper, we propose a simulated annealing (SA) based multiobjective optimization (MOO) approach for classifier ensemble. We exploit several different versions of the objective functions. We hypothesize that the reliability of each classifier's predictions differs among the various output classes. Thus, in an ensemble system, it is necessary to determine an appropriate vote weight for each output class of each classifier. Diverse classification methods such as Maximum Entropy (ME), Conditional Random Field (CRF) and Support Vector Machine (SVM) are used to build different models depending upon the various representations of the available features. One of the most important characteristics of our system is that the features are selected and developed mostly without using any deep domain knowledge and/or language-dependent resources. The proposed technique is evaluated for Named Entity Recognition (NER) in three resource-poor Indian languages, namely Bengali, Hindi and Telugu. The evaluation yields recall, precision and F-measure values of 93.95%, 95.15% and 94.55%, respectively, for Bengali; 93.35%, 92.25% and 92.80%, respectively, for Hindi; and 84.02%, 96.56% and 89.85%, respectively, for Telugu. Experiments also suggest that the classifier ensemble identified by the proposed MOO based approach, when optimizing the F-measure values of named entity (NE) boundary detection, outperforms all the individual models, two conventional baseline models and three other MOO based ensembles.
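The idea of giving each classifier a separate vote weight per output class can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_vote`, the class labels `PER` and `LOC`, and all weight values are hypothetical and chosen only to show how a reliable classifier can outvote a majority.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Combine the predictions of several classifiers for one token.

    predictions: {classifier_name: predicted_class}
    weights: {(classifier_name, class_label): vote weight};
             pairs that are missing contribute a weight of 0.
    """
    scores = defaultdict(float)
    for clf, label in predictions.items():
        scores[label] += weights.get((clf, label), 0.0)
    # The class with the highest total weight wins;
    # ties are broken alphabetically so the result is deterministic.
    return min(scores, key=lambda label: (-scores[label], label))

# Hypothetical per-class weights (illustrative values, not from the paper).
weights = {
    ("ME", "PER"): 0.4, ("ME", "LOC"): 0.3,
    ("CRF", "PER"): 0.9, ("CRF", "LOC"): 0.5,
    ("SVM", "PER"): 0.6, ("SVM", "LOC"): 0.2,
}

# Two of the three classifiers say LOC, but CRF is weighted as far
# more reliable on PER than the other two are on LOC.
token_predictions = {"ME": "LOC", "CRF": "PER", "SVM": "LOC"}
winner = weighted_vote(token_predictions, weights)
print(winner)
```

With these assumed weights, `LOC` collects 0.3 + 0.2 = 0.5 while `PER` collects 0.9, so the weighted ensemble disagrees with a plain majority vote. Choosing the weights well is the optimization problem the abstract describes.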


► A multiobjective simulated annealing based technique is used to select the best weights for forming a classifier ensemble.
► To the best of our knowledge, the use of a multiobjective simulated annealing approach to select appropriate weights for voting is a novel contribution, especially in the area of NLP.
► We use several different versions of the objective functions.
► The proposed technique is language independent and can easily be replicated for any resource-poor language. We evaluated it for three resource-poor languages, namely Bengali, Hindi and Telugu.
► The proposed framework is applicable to any type of classification problem, such as NER, Part of Speech (PoS) tagging or question answering.
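The highlights above describe searching for vote weights with multiobjective simulated annealing. The sketch below shows one generic way such a search can be organized: a weight vector is perturbed, accepted or rejected with a temperature-dependent probability, and an archive keeps the non-dominated solutions. It is a simplified illustration and not the procedure used in the paper. The names `dominates`, `update_archive` and `anneal`, the cooling schedule, and the two toy objectives (stand-ins for measures such as recall and precision) are all assumptions.

```python
import math
import random

def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def update_archive(archive, candidate):
    """Add (weights, objectives) unless it is dominated; drop entries it dominates."""
    _, cand_obj = candidate
    if any(dominates(obj, cand_obj) or obj == cand_obj for _, obj in archive):
        return archive
    kept = [(w, obj) for w, obj in archive if not dominates(cand_obj, obj)]
    kept.append(candidate)
    return kept

def anneal(evaluate, n_weights, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximize all objectives returned by evaluate(weights); weights stay in [0, 1]."""
    rng = random.Random(seed)
    current = [rng.random() for _ in range(n_weights)]
    current_obj = evaluate(current)
    archive = [(current[:], current_obj)]
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        # Perturb one randomly chosen weight and clamp it to [0, 1].
        neighbour = current[:]
        i = rng.randrange(n_weights)
        neighbour[i] = min(1.0, max(0.0, neighbour[i] + rng.gauss(0.0, 0.1)))
        neighbour_obj = evaluate(neighbour)
        if dominates(current_obj, neighbour_obj):
            # The neighbour is worse: accept it only with a probability
            # that shrinks as the temperature falls.
            loss = sum(c - n for c, n in zip(current_obj, neighbour_obj))
            accept = rng.random() < math.exp(-loss / temp)
        else:
            accept = True
        if accept:
            current, current_obj = neighbour, neighbour_obj
            archive = update_archive(archive, (current[:], current_obj))
    return archive

def toy_objectives(w):
    """Two conflicting toy objectives (hypothetical; not real NER scores)."""
    obj_a = 1.0 - sum((x - 0.2) ** 2 for x in w) / len(w)
    obj_b = 1.0 - sum((x - 0.8) ** 2 for x in w) / len(w)
    return (obj_a, obj_b)

# Six weights, e.g. three classifiers times two classes (an assumed setup).
front = anneal(toy_objectives, n_weights=6)
print(len(front), "non-dominated weight vectors found")
```

In a real setting, `evaluate` would apply the weighted vote to development data and return the chosen objective values. The search returns a set of trade-off solutions rather than a single one, so a final choice (for example, the solution with the best F-measure) still has to be made.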

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 38, Issue 12, November–December 2011, Pages 14760–14772