On a formula for the h-index

Article ID	Journal	Published Year	Pages	File Type
523080	Journal of Informetrics	2015	15 Pages	PDF

Abstract

• We summarize the most commonly used mathematical models for the h-index.
• A new mathematical model for the h-index is proposed.
• The new formula is based on the assumption of geometrically distributed citations.
• The effectiveness of our approach is illustrated through a case study.

The h-index is a celebrated indicator widely used to assess the quality of researchers and organizations. Empirical studies show that the h-index correlates well with other simple bibliometric indicators, such as the total number of publications N and the total number of citations C. In this paper we introduce a new formula, $\tilde{h}_w = \tilde{h}_w(N, C, c_{\mathrm{MAX}})$, as a representative predictive formula that relates h functionally to three aggregate indicators: N, C, and the highest citation count $c_{\mathrm{MAX}}$. The formula rests on the specific assumption of geometrically distributed citations, but provides a good estimate of the h-index in the general case. To evaluate how well $\tilde{h}_w$ fits, we carried out an empirical study of 131 datasets (13,347 papers; 288,972 citations). The overall fit (defined as the capacity of $\tilde{h}_w$ to reproduce the true value of h for each individual scientist) was remarkably accurate. The predicted value was within one unit of the actual h for more than 60% of the datasets. In approximately three cases out of four the absolute error was less than or equal to 2, and the average absolute error over the whole sample was only 1.9.
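To make the quantities in the abstract concrete, here is a minimal Python sketch of the standard h-index definition (the largest h such that at least h papers have at least h citations each) together with the three aggregate inputs N, C and cMAX. The function names `h_index`, `aggregates` and `absolute_error` and the sample citation list are illustrative assumptions, not taken from the paper, and the proposed formula itself is not reproduced here because the abstract does not state it.

```python
from typing import Sequence, Tuple


def h_index(citations: Sequence[int]) -> int:
    """Return the largest h such that at least h papers have >= h citations."""
    ranked = sorted(citations, reverse=True)  # most cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


def aggregates(citations: Sequence[int]) -> Tuple[int, int, int]:
    """Return (N, C, cMAX): paper count, total citations, highest count."""
    return len(citations), sum(citations), max(citations, default=0)


def absolute_error(predicted: float, actual: int) -> float:
    """Absolute error between a predicted h and the actual h."""
    return abs(predicted - actual)


# Hypothetical citation record, for illustration only.
sample = [10, 8, 5, 4, 3, 0]
print(h_index(sample))     # 4
print(aggregates(sample))  # (6, 30, 10)
```

Any predictive formula of the kind described would take the tuple returned by `aggregates` as input, and its output could be compared with `h_index` through `absolute_error`.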

Keywords

Citation data; Geometric distribution; H-index; Lambert W function