Article code | Journal code | Publication year | English article | Full-text version
571789 | 1439293 | 2016 | 13-page PDF | Free download
English title of the ISI article
Knowledge base population using semantic label propagation
Keywords
Relation extraction; knowledge base population; distant supervision; active learning; semi-supervised learning
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
Abstract

Training relation extractors for the purpose of automated knowledge base population requires the availability of sufficient training data. The amount of manual labeling can be significantly reduced by applying distant supervision, which generates training data by aligning large text corpora with existing knowledge bases. This typically results in a highly noisy training set, where many training sentences do not express the intended relation. In this paper, we propose to combine distant supervision with minimal human supervision by annotating features (in particular shortest dependency paths) rather than complete relation instances. Such feature labeling eliminates noise from the initial training set, resulting in a significant increase of precision at the expense of recall. We further improve on this approach by introducing the Semantic Label Propagation (SLP) method, which uses the similarity between low-dimensional representations of candidate training instances to again extend the (filtered) training set in order to increase recall while maintaining high precision. Our strategy is evaluated on an established test collection designed for knowledge base population (KBP) from the TAC KBP English slot filling task. The experimental results show that SLP leads to substantial performance gains when compared to existing approaches while requiring an almost negligible human annotation effort.
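
The two steps described in the abstract, filtering by labeled features and then re-extending the training set by similarity, can be illustrated with a small sketch. This is not the authors' implementation: the dependency-path strings, the two-dimensional vectors, the function names, and the nearest-seed cosine threshold are all assumptions chosen for illustration, and the paper's actual SLP similarity measure and representations may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_by_labeled_paths(candidates, approved_paths):
    """Feature labeling step: keep only distantly supervised candidates whose
    shortest dependency path was marked as expressing the relation."""
    return [c for c in candidates if c["path"] in approved_paths]


def propagate_labels(candidates, seeds, threshold=0.9):
    """Propagation step (simplified assumption): re-admit a filtered-out
    candidate when its vector is close enough to at least one seed."""
    seed_ids = {s["id"] for s in seeds}
    extended = list(seeds)
    for c in candidates:
        if c["id"] in seed_ids:
            continue
        best = max((cosine(c["vec"], s["vec"]) for s in seeds), default=0.0)
        if best >= threshold:
            extended.append(c)
    return extended


# Hypothetical candidates for a "place of birth" style relation.
candidates = [
    {"id": 1, "path": "nsubj<-born->prep_in", "vec": [1.0, 0.0]},
    {"id": 2, "path": "nsubj<-native->prep_of", "vec": [0.95, 0.1]},
    {"id": 3, "path": "nsubj<-visited->dobj", "vec": [0.0, 1.0]},
]
approved_paths = {"nsubj<-born->prep_in"}  # the only path an annotator approved

seeds = filter_by_labeled_paths(candidates, approved_paths)
extended = propagate_labels(candidates, seeds, threshold=0.9)
```

In this toy run, filtering keeps only candidate 1 (high precision, low recall), and propagation re-admits candidate 2 because its vector lies close to the seed, while the unrelated candidate 3 stays out. The threshold trades recall against precision: lowering it admits more candidates at a higher risk of noise.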

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 108, 15 September 2016, Pages 79–91