Article code: 514961
Journal code: 866921
Publication year: 2015
English article: 18-page PDF
Full-text version: free download
English title of the ISI article
Generic method for detecting focus time of documents
Persian translation of the title
روش عمومی برای تشخیص زمان تمرکز اسناد
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract


• Statistical approach for estimating the focus time of text documents.
• Classification framework for categorizing documents into temporal and atemporal.
• Bi-Temporal Document Representation using document focus time and creation time.

Time is an important aspect of text documents. While some documents are atemporal, many have strong temporal characteristics and contain content related to time. Such documents can be mapped to their corresponding time periods. In this paper, we propose estimating the focus time of a document, defined as the time period to which the document's content refers and considered a dimension complementary to the document's creation time. We propose several estimators of focus time that utilize statistical knowledge from external resources such as news article collections. The advantage of our approach is that document focus time can be estimated even for documents that contain no temporal expressions or only a few of them. We evaluate the effectiveness of our methods on diverse datasets of documents about historical events related to 5 countries. Our approach achieves an average error of less than 21 years on collections of Wikipedia pages, extracts from history-related books, and web pages, over a total time frame of 113 years. We also demonstrate an example classification method to distinguish temporal from atemporal documents.
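
To make the idea of estimating focus time from external statistics more concrete, here is a minimal sketch in Python. It is not one of the paper's estimators: the toy news collection, the co-occurrence weighting, and the function names `build_word_year_weights`, `estimate_focus_year`, and `average_error_years` are all assumptions made for illustration. It shows only the general principle that words associated with particular years in a dated collection can point to a document's focus time even when the document contains no explicit dates.

```python
from collections import Counter, defaultdict

# Hypothetical toy "news collection": (publication year, text) pairs.
# A real system would use a large dated news archive instead.
NEWS = [
    (1914, "archduke assassinated in sarajevo as war begins in europe"),
    (1918, "armistice signed and the war in europe ends"),
    (1969, "apollo astronauts land on the moon"),
    (1969, "moon landing watched by millions"),
    (1989, "berlin wall falls as crowds cross the border"),
]


def build_word_year_weights(news):
    """Estimate a simple P(year | word) from article-level co-occurrence counts."""
    counts = defaultdict(Counter)
    for year, text in news:
        for word in set(text.split()):  # count each word once per article
            counts[word][year] += 1
    weights = {}
    for word, per_year in counts.items():
        total = sum(per_year.values())
        weights[word] = {y: c / total for y, c in per_year.items()}
    return weights


def estimate_focus_year(document, weights):
    """Return the year with the highest summed word-year weight, or None."""
    scores = Counter()
    for word in document.lower().split():
        for year, w in weights.get(word, {}).items():
            scores[year] += w
    if not scores:
        return None  # no known words, so no focus time can be detected
    # Ties are broken in favour of the earliest year (an arbitrary choice).
    return max(scores, key=lambda y: (scores[y], -y))


def average_error_years(predicted, actual):
    """Mean absolute difference, in years, between predicted and true focus years."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


WEIGHTS = build_word_year_weights(NEWS)
```

For example, `estimate_focus_year("The astronauts walked on the moon", WEIGHTS)` selects 1969 although the sentence mentions no date, and a document with no known words yields `None`, which a classifier could treat as a hint that the document is atemporal.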

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 51, Issue 6, November 2015, Pages 851–868