Article code: 515889
Journal code: 867129
Publication year: 2013
English article: 12-page PDF
English title of the ISI article
Named entity recognition with multiple segment representations
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract


• Different segment representations (SRs) cause little difference in performance.
• Different SRs result in quite different outputs.
• Incorporating different SRs is beneficial to the NER task.
• We propose a new feature generation method that uses multiple SRs.
• The proposed method improves the performance and stability of NER.

Named entity recognition (NER) is mostly formalized as a sequence labeling problem in which segments of named entities are represented by label sequences. Although considerable effort has been made to investigate sophisticated features that encode textual characteristics of named entities (e.g. PEOPLE, LOCATION), little attention has been paid to segment representations (SRs) for multi-token named entities (e.g. the IOB2 notation). In this paper, we investigate the effects of different SRs on NER tasks and propose a feature generation method that uses multiple SRs. The proposed method allows a model to exploit not only the highly discriminative features of complex SRs but also the features of simple SRs, which are robust against the data sparseness problem. Since it incorporates different SRs as feature functions of Conditional Random Fields (CRFs), the well-established training procedure can be used. In addition, the tagging speed of a model integrating multiple SRs can be accelerated to match that of a model using only the most complex SR in the integrated model. Experimental results demonstrate that incorporating multiple SRs into a single model improves the performance and stability of NER. We also provide a detailed analysis of the results.
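
To make the notion of a segment representation concrete, the following minimal sketch re-encodes one IOB2 label sequence in a more complex SR (IOBES) and a simpler one (IO), then lists the labels side by side for each token. It only illustrates how the same segments look under different SRs and is not the feature generation method of the paper. The function names, the PER and LOC entity types, and the example sentence are assumptions chosen for illustration.

```python
from typing import List, Tuple


def iob2_to_iobes(tags: List[str]) -> List[str]:
    """Re-encode IOB2 tags as IOBES: S- marks a single-token segment,
    E- marks the last token of a multi-token segment."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, etype = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt == "I-" + etype  # does the segment go on?
        if prefix == "B":
            out.append(("B-" if continues else "S-") + etype)
        else:  # prefix == "I"
            out.append(("I-" if continues else "E-") + etype)
    return out


def iob2_to_io(tags: List[str]) -> List[str]:
    """Re-encode IOB2 tags as IO: only inside/outside is kept,
    so segment boundaries are no longer marked."""
    return [t if t == "O" else "I-" + t.split("-", 1)[1] for t in tags]


def labels_in_all_srs(tags: List[str]) -> List[Tuple[str, str, str]]:
    """Per token, return its (IOBES, IOB2, IO) labels side by side."""
    return list(zip(iob2_to_iobes(tags), tags, iob2_to_io(tags)))


# Hypothetical example: "John Smith visited Paris"
iob2 = ["B-PER", "I-PER", "O", "B-LOC"]
for token, labels in zip(["John", "Smith", "visited", "Paris"],
                         labels_in_all_srs(iob2)):
    print(token, labels)
```

In this sketch the IOBES labels distinguish segment-final and single-token positions, while the IO labels collapse them into fewer, more frequent labels. This is one way to picture the trade-off the abstract describes between discriminative complex SRs and simple SRs that are less affected by sparse data.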

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 49, Issue 4, July 2013, Pages 954–965