Article code: 515845
Journal code: 867108
Publication year: 2014
English article: 13 pages, PDF
Full-text version: free download
Title of the ISI article
Soft-constrained inference for Named Entity Recognition
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
Abstract


• Named Entity Recognition is addressed by constraining inference in Conditional Random Fields (CRFs).
• A two-phase integer linear programming (ILP) approach is proposed.
• Complex relationships among labels are automatically extracted from data.
• Extracted relationships are introduced as soft constraints in the ILP formulation.
• The proposed method significantly outperforms the state-of-the-art approach.

Much of the valuable information that supports decision-making processes originates in text-based documents. Although these documents can be effectively searched and ranked by modern search engines, actionable knowledge needs to be extracted and transformed into a structured form before it can be used in a decision process. In this paper we describe how the discovery of semantic information embedded in natural language documents can be viewed as an optimization problem aimed at assigning a sequence of labels (hidden states) to a set of interdependent variables (textual tokens). Dependencies among variables are efficiently modeled through Conditional Random Fields, an undirected graphical model able to represent the distribution of labels given a set of observations. The Markov property of these models prevents them from taking into account long-range dependencies among variables, which are indeed relevant in Natural Language Processing. To overcome this limitation we propose an inference method based on an Integer Programming formulation of the problem, in which long-distance dependencies are included through non-deterministic soft constraints.
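
The idea of soft-constrained inference can be illustrated with a toy example. The sketch below is not the paper's formulation: the scores are made up, the only constraint ("tokens with the same surface form should share a label") is an assumed example of a long-range dependency, and exhaustive enumeration stands in for an integer programming solver, which is only feasible because the sentence is tiny. It shows how a penalty on violated constraints changes the best label sequence without forbidding any sequence outright.

```python
from itertools import product

LABELS = ("O", "PER", "ORG")
TOKENS = ("Smith", "joined", "Acme", "Smith")

# Made-up local scores, standing in for CRF node potentials.
NODE = (
    {"O": 0.0, "PER": 2.0, "ORG": 0.5},  # Smith
    {"O": 2.0, "PER": 0.0, "ORG": 0.0},  # joined
    {"O": 0.5, "PER": 0.2, "ORG": 2.0},  # Acme
    {"O": 0.3, "PER": 1.0, "ORG": 1.2},  # Smith (locally ambiguous)
)

# Made-up first-order transition scores; unlisted label pairs score 0.
EDGE = {("PER", "PER"): 0.5, ("ORG", "ORG"): 0.5}


def same_form_pairs(tokens):
    """Index pairs (i, j), i < j, whose tokens share a surface form."""
    return [(i, j)
            for i in range(len(tokens))
            for j in range(i + 1, len(tokens))
            if tokens[i] == tokens[j]]


def score(labels, penalty):
    """Node scores + adjacent-label scores - penalty per violated constraint."""
    total = sum(NODE[i][y] for i, y in enumerate(labels))
    total += sum(EDGE.get(pair, 0.0) for pair in zip(labels, labels[1:]))
    violations = sum(labels[i] != labels[j]
                     for i, j in same_form_pairs(TOKENS))
    return total - penalty * violations


def decode(penalty):
    """Return the highest-scoring label sequence by brute-force search."""
    return max(product(LABELS, repeat=len(TOKENS)),
               key=lambda ys: score(ys, penalty))
```

With `penalty=0.0` the decoder relies only on local and adjacent-label scores and labels the second "Smith" as ORG. With `penalty=1.0` the long-range constraint outweighs that local preference and both occurrences are labeled PER. With `penalty=0.5` the constraint is violated because paying the penalty still gives a higher total, which is what makes the constraint soft rather than hard.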

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 50, Issue 5, September 2014, Pages 807–819