Article ID: 523359
Journal: Journal of Informetrics
Published Year: 2016
Pages: 11 Pages
File Type: PDF
Abstract

• In science, the hooked power law fits citation data from a single subject better than the discretised lognormal distribution.
• Outside science, the discretised lognormal distribution fits citation data from a single subject better than the hooked power law.
• After a transformation, normal distribution parameters are more stable than discrete distribution parameters for citation data.

Identifying the statistical distribution that best fits citation data is important to allow robust and powerful quantitative analyses. Whilst previous studies have suggested that both the hooked power law and discretised lognormal distributions fit better than the power law and negative binomial distributions, no comparisons so far have covered all articles within a discipline, including those that are uncited. Based on an analysis of 26 different Scopus subject areas in seven different years, this article reports comparisons of the discretised lognormal and the hooked power law with citation data, adding 1 to citation counts in order to include zeros. The hooked power law fits better in two thirds of the subject/year combinations tested for journal articles that are at least three years old, including most medical, life and natural sciences, and for virtually all subject areas for younger articles. Conversely, the discretised lognormal tends to fit best for arts, humanities, social science and engineering fields. The difference between the fits of the distributions is mostly small, however, and so either could reasonably be used for modelling citation data. For regression analyses the best option is to use ordinary least squares regression applied to the natural logarithm of citation counts plus one, especially for sets of younger articles, because of the increased precision of the parameters.
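
The regression recommendation above can be illustrated with a minimal sketch: ordinary least squares applied to ln(1 + citations). This is not the article's own code. The function name `fit_log_ols`, the single predictor, and the toy data are all assumptions made for illustration; the citation counts were chosen so that ln(1 + c) is exactly linear in the predictor.

```python
import numpy as np

def fit_log_ols(citations, predictor):
    """Fit ln(1 + citations) = intercept + slope * predictor by least squares.

    Hypothetical helper for illustration only; not taken from the article.
    """
    # log1p(c) = ln(1 + c), so uncited articles (c = 0) map to 0 rather than -inf.
    y = np.log1p(np.asarray(citations, dtype=float))
    x = np.asarray(predictor, dtype=float)
    # Design matrix with a column of ones for the intercept.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # array: [intercept, slope]

# Toy data (assumed): citations = 2**k - 1, so ln(1 + citations) = k * ln(2).
citations = [0, 1, 3, 7, 15]
predictor = [0, 1, 2, 3, 4]
intercept, slope = fit_log_ols(citations, predictor)
```

Because the response is log-transformed, a slope `b` corresponds to multiplying (1 + citations) by `exp(b)` for each unit increase in the predictor, which is how coefficients from such a model would usually be read.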
