Article ID Journal Published Year Pages File Type
975274 Physica A: Statistical Mechanics and its Applications 2014 9 Pages PDF
Abstract

•Time and scale dependences of correlations of word-length time series were studied.•The scaling exponent exhibited variations along the written text.•Scaling exponent shifts coincide with transitions in narration units.

This work considered the quantitative analysis of large written texts. To this end, the text was converted into a time series by taking the sequence of word lengths. The detrended fluctuation analysis (DFA) was used for characterizing long-range serial correlations of the time series. To this end, the DFA was implemented within a rolling window framework for estimating the variations of correlations, quantified in terms of the scaling exponent, strength along the text. Also, a filtering derivative was used to compute the dependence of the scaling exponent relative to the scale. The analysis was applied to three famous English-written literary narrations; namely, Alice in Wonderland (by Lewis Carrol), Dracula (by Bram Stoker) and Sense and Sensibility (by Jane Austen). The results showed that high correlations appear for scales of about 50–200 words, suggesting that at these scales the text contains the stronger coherence. The scaling exponent was not constant along the text, showing important variations with apparent cyclical behavior. An interesting coincidence between the scaling exponent variations and changes in narrative units (e.g., chapters) was found. This suggests that the scaling exponent obtained from the DFA is able to detect changes in narration structure as expressed by the usage of words of different lengths.

Related Topics
Physical Sciences and Engineering Mathematics Mathematical Physics
Authors
, , , ,