کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
4946460 1439291 2016 26 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
NewsReader: Using knowledge resources in a cross-lingual reading machine to generate more knowledge from massive streams of news
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
NewsReader: Using knowledge resources in a cross-lingual reading machine to generate more knowledge from massive streams of news
چکیده انگلیسی

In this article, we describe a system that reads news articles in four different languages and detects what happened, who is involved, where and when. This event-centric information is represented as episodic situational knowledge on individuals in an interoperable RDF format that allows for reasoning on the implications of the events. Our system covers the complete path from unstructured text to structured knowledge, for which we defined a formal model that links interpreted textual mentions of things to their representation as instances. The model forms the skeleton for interoperable interpretation across different sources and languages. The real content, however, is defined using multilingual and cross-lingual knowledge resources, both semantic and episodic. We explain how these knowledge resources are used for the processing of text and ultimately define the actual content of the episodic situational knowledge that is reported in the news. The knowledge and model in our system can be seen as an example how the Semantic Web helps NLP. However, our systems also generate massive episodic knowledge of the same type as the Semantic Web is built on. We thus envision a cycle of knowledge acquisition and NLP improvement on a massive scale. This article reports on the details of the system but also on the performance of various high-level components. We demonstrate that our system performs at state-of-the-art level for various subtasks in the four languages of the project, but that we also consider the full integration of these tasks in an overall system with the purpose of reading text. We applied our system to millions of news articles, generating billions of triples expressing formal semantic properties. This shows the capacity of the system to perform at an unprecedented scale.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Knowledge-Based Systems - Volume 110, 15 October 2016, Pages 60-85
نویسندگان
, , , , , , , , , , , ,