Article ID: 975040
Journal: Physica A: Statistical Mechanics and its Applications
Published Year: 2013
Pages: 6 Pages
File Type: PDF
Abstract

• We showed that the Menzerath–Altmann law describes the distinct word distribution in corpora.
• The comparison of observations with predictions shows excellent accuracy.
• The characteristics of the distinct word distribution are language-independent.
• We showed that the Menzerath–Altmann law is a special case of the generalized gamma distribution.

The empirical law uncovered by Menzerath and formulated by Altmann, known as the Menzerath–Altmann law (henceforth the MA law), describes the statistical distribution behavior of human language at various organizational levels. Building on previous studies of organizational regularities in language, we propose that the distribution of distinct (or different) words in a large text can be described effectively by the MA law. We demonstrate the validity of this proposition by examining two text corpora written in languages from different language families (English and Turkish). The results show not only that the MA law accurately predicts the distinct word distribution, but also that this result appears to be language-independent. This finding is important for quantitative linguistic studies and may also be significant for other naturally occurring organizations that display analogous organizational behavior. We also show explicitly that the MA law is a special case of the probability function of the generalized gamma distribution.
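The last claim of the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes the commonly cited form of the MA law, y = A·x^b·exp(−c·x), and Stacy's parameterization of the generalized gamma density, either of which may differ from the paper's own notation. The function names and parameter values are hypothetical. Under these assumptions, setting the generalized gamma shape parameter p to 1 yields a function of exactly the MA form.

```python
import math


def ma_law(x, A, b, c):
    """Assumed MA form: y = A * x**b * exp(-c * x)."""
    return A * x**b * math.exp(-c * x)


def generalized_gamma_pdf(x, scale, d, p):
    """Generalized gamma density (Stacy's parameterization, assumed) for x > 0."""
    norm = p / (scale**d * math.gamma(d / p))
    return norm * x ** (d - 1) * math.exp(-((x / scale) ** p))


def ma_params_from_gamma(scale, d):
    """Map generalized gamma parameters with p = 1 onto MA parameters (A, b, c)."""
    A = 1.0 / (scale**d * math.gamma(d))
    b = d - 1.0
    c = 1.0 / scale
    return A, b, c


# Arbitrary illustrative values, not fitted to any corpus.
scale, d = 2.0, 3.5
A, b, c = ma_params_from_gamma(scale, d)

# Largest absolute difference between the two curves on a few sample points.
max_gap = max(
    abs(ma_law(x, A, b, c) - generalized_gamma_pdf(x, scale, d, 1.0))
    for x in (0.5, 1.0, 2.0, 5.0, 10.0)
)
```

With p = 1 the two functions agree up to floating-point rounding, so `max_gap` is effectively zero. For other values of p the exponent is no longer linear in x, and the generalized gamma density leaves the MA family.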
