Article ID Journal Published Year Pages File Type
1891735 Chaos, Solitons & Fractals 2012 9 Pages PDF
Abstract

Should quality be almost a synonymous of complexity? To measure quality appears to be audacious, even very subjective. It is hereby proposed to use a multifractal approach in order to quantify quality, thus through complexity measures. A one-dimensional system is examined. It is known that (all) written texts can be one-dimensional nonlinear maps. Thus, several written texts by the same author are considered, together with their translation, into an unusual language, Esperanto, and asa baseline their corresponding shuffled versions.Different one-dimensional time series can be used: e.g. (i) one based on word lengths, (ii) the other based on word frequencies; both are used for studying, comparing and discussing the map structure. It is shown that a variety in style can be measured through the D(q) and f(α) curves characterizing multifractal objects. This allows to observe on the one hand whether natural and artificial languages significantly influence the writing and the translation, and whether one author’s texts differ technically from each other. In fact, the f(α) curves of the original texts are similar to each other, but the translated text shows marked differences. However in each case, the f(α) curves are far from being parabolic, – in contrast to the shuffled texts. Moreover, the Esperanto text has more extreme values. Criteria are thereby suggested for estimating a text quality, as if it is a time series only. A model is introduced in order to substantiate the findings: it consists in considering a text as a random Cantor set resulting from a binomial cascade of long and short words with appropriate weights. In an appendix, a connection is given with an analysis of turbulence by statistics based on Tsallis generalized entropy. In a second appendix, another view of text (language) complexity is outlined within the copying mistake map concept.

► Two texts in English and one in Esperanto are transformed into 6 time series. ► D(q) and f(alpha) of such (and shuffled) time series are obtained. ► A model for text construction is presented based on a parametrized Cantor set. ► The model parameters can also be used when examining machine translated texts. ► Suggested extensions to higher dimensions: in 2D image analysis and on hypertexts.

Related Topics
Physical Sciences and Engineering Physics and Astronomy Statistical and Nonlinear Physics
Authors
,