کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
379294 659286 2007 18 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Combining data-driven systems for improving Named Entity Recognition
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
Combining data-driven systems for improving Named Entity Recognition
چکیده انگلیسی

The increasing flow of digital information requires the extraction, filtering and classification of pertinent information from large volumes of texts. All these tasks greatly benefit from involving a Named Entity Recognizer (NER) in the preprocessing stage. This paper proposes a completely automatic NER system. The NER task involves not only the identification of proper names (Named Entities) in natural language text, but also their classification into a set of predefined categories, such as names of persons, organizations (companies, government organizations, committees, etc.), locations (cities, countries, rivers, etc.) and miscellaneous (movie titles, sport events, etc.). Throughout the paper, we examine the differences between language models learned by different data-driven classifiers confronted with the same NLP task, as well as ways to exploit these differences to yield a higher accuracy than the best individual classifier. Three machine learning classifiers (Hidden Markov Model, Maximum Entropy and Memory Based Learning) are trained on the same corpus in order to resolve the NE task. After comparison, their output is combined using voting strategies. A comprehensive study and experimental work on the evaluation of our system, as well as a comparison with other systems has been carried out within the framework of two specialized scientific competitions for NER, CoNLL-2002 and HAREM-2005. Finally, this paper describes the integration of our NER system in different NLP applications, in concrete Geographic Information Retrieval and Conceptual Modelling.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Data & Knowledge Engineering - Volume 61, Issue 3, June 2007, Pages 449–466
نویسندگان
, , , , , ,