Article code: 523396
Journal code: 868344
Publication year: 2014
English article: 9-page PDF
Full-text version: free download
English title of the ISI article
Regression for citation data: An evaluation of different methods
Persian translation of the title
رگرسیون داده های استنادی: ارزیابی روش های مختلف
Keywords
Informetrics, altmetrics, citation distribution, lognormal, power law, regression
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Applications
English abstract


• Ordinary least squares regression is recommended for citation data after adding 1 and applying a logarithmic transformation.
• The generalised linear model with lognormal residuals is recommended for citation data.
• Inappropriate regression models can substantially inflate the chance of detecting false factors within citation data.
• Regression models are evaluated for citation data and clear recommendations are made for the best ones.

Citations are increasingly used for research evaluations. It is therefore important to identify factors affecting citation scores that are unrelated to scholarly quality or usefulness so that these can be taken into account. Regression is the most powerful statistical technique to identify these factors and hence it is important to identify the best regression strategy for citation data. Citation counts tend to follow a discrete lognormal distribution and, in the absence of alternatives, have been investigated with negative binomial regression. Using simulated discrete lognormal data (continuous lognormal data rounded to the nearest integer) this article shows that a better strategy is to add one to the citations, take their log and then use the general linear (ordinary least squares) model for regression (e.g., multiple linear regression, ANOVA), or to use the generalised linear model without the log. Reasonable results can also be obtained if all the zero citations are discarded, the log is taken of the remaining citation counts and then the general linear model is used, or if the generalised linear model is used with the continuous lognormal distribution. Similar approaches are recommended for altmetric data, if it proves to be lognormally distributed.
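
The recommended strategy in the abstract (add one to the citation counts, take the log, then fit an ordinary least squares model) can be illustrated with a minimal sketch. It is not the authors' code: the binary factor, the sample sizes, the parameter values `mu` and `sigma`, and the helper names `simulate_citations` and `ols_slope_intercept` are all assumptions chosen for illustration. The simulated data follow the abstract's description of discrete lognormal data, that is, continuous lognormal draws rounded to the nearest integer.

```python
import math
import random


def simulate_citations(n, mu, sigma, rng):
    """Discrete lognormal counts: continuous lognormal draws rounded to the nearest integer."""
    return [round(rng.lognormvariate(mu, sigma)) for _ in range(n)]


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(0)  # fixed seed, purely for reproducibility

# Hypothetical binary factor (0 = one group of articles, 1 = another).
# The 0.5 shift in the log-scale mean is an arbitrary illustrative value.
group0 = simulate_citations(5000, mu=1.0, sigma=1.0, rng=rng)
group1 = simulate_citations(5000, mu=1.5, sigma=1.0, rng=rng)

x = [0] * len(group0) + [1] * len(group1)
# Add one to every count before taking the log, so that zero citations are defined.
y = [math.log(c + 1) for c in group0 + group1]

slope, intercept = ols_slope_intercept(x, y)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

With a single binary predictor the fitted slope is simply the difference between the two group means of log(1 + citations). A real analysis with several factors would more likely use a statistics package's multiple regression or ANOVA routine on the same transformed variable.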


Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Informetrics - Volume 8, Issue 4, October 2014, Pages 963–971
Authors
, ,