Article code: 4946098
Journal code: 1439268
Publication year: 2017
English article: 17-page PDF
Full-text version: free download
English title of the ISI article
Long range dependence in texts: A method for quantifying coherence of text
Persian translation of the title
وابستگی طولانی در متون: یک روش برای اندازه گیری یکپارچگی متن
Keywords
Global coherence, importance time series, scaling exponent
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
This paper addresses a major issue in computational linguistics: the automatic calculation of text coherence. So far, only a few methods have been proposed to automatically detect the local coherence of texts, and all of them require extensive pre-processing and computational effort. Here we suggest a simple method to evaluate coherence globally. First, we use a word-ranking method to assign an importance value to each word type in a text, and then construct the importance time series associated with the text. Next, Detrended Fluctuation Analysis (DFA), which is used to detect inherent correlations in time series, is applied to the text's importance time series. We found that the importance time series exhibits bi-scale behavior: it is long-range correlated at large distances, while short-range correlations are observed at small distances. We also observed that the scaling exponent decreases for a shuffled text, and that this decrease becomes progressively more significant as we shuffle chapters, paragraphs, sentences, and words, respectively. This leads us to consider the scaling exponent of the text time series (STT for short) as a measure for quantifying global coherence. We support our claim with an experiment on three sample texts and a comparison of our method with some entity-grid-based models.
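The DFA step mentioned in the abstract can be sketched roughly as follows. This is a minimal, generic first-order DFA in Python with NumPy, not the authors' implementation: the function name `dfa_exponent`, the window-size grid, and the detrending order are all assumptions made for illustration, and the word-ranking step that produces the importance time series is not reproduced because the abstract does not specify it. The sketch also fits a single slope over all scales, so it ignores the bi-scale crossover the abstract reports; separating the two regimes would need separate fits over small and large window sizes.

```python
import numpy as np


def dfa_exponent(series, window_sizes=None, order=1):
    """Estimate a DFA scaling exponent for a 1-D numeric series.

    Hypothetical helper for illustration only; `series` stands in for
    the importance time series of a text.
    """
    x = np.asarray(series, dtype=float)
    n = x.size

    # Step 1: integrate the mean-subtracted series to obtain the profile.
    profile = np.cumsum(x - x.mean())

    if window_sizes is None:
        # Assumed grid: log-spaced window sizes between 8 and n // 4.
        window_sizes = np.unique(
            np.logspace(np.log10(8), np.log10(n // 4), 20).astype(int)
        )

    fluctuations = []
    for s in window_sizes:
        n_windows = n // s
        # Step 2: split the profile into non-overlapping windows of length s.
        segments = profile[: n_windows * s].reshape(n_windows, s)
        t = np.arange(s)

        # Step 3: fit a polynomial trend in every window and subtract it.
        coeffs = np.polyfit(t, segments.T, order)   # (order + 1, n_windows)
        trends = np.vander(t, order + 1) @ coeffs   # (s, n_windows)
        residuals = segments.T - trends

        # Step 4: root-mean-square fluctuation at this scale.
        fluctuations.append(np.sqrt(np.mean(residuals ** 2)))

    # Step 5: the exponent is the slope of log F(s) against log s.
    slope, _ = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)
    return slope
```

To mirror the shuffling experiment described in the abstract, one could compute `dfa_exponent` on the original series and on a copy shuffled with `numpy.random.Generator.permutation`, then compare the two values. For reference, uncorrelated noise is expected to give an exponent near 0.5, while larger values indicate long-range correlation.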
Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 133, 1 October 2017, Pages 33-42