Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
10358344 | Journal of Informetrics | 2016 | 12 Pages |
Abstract
Publication keywords are widely used to reveal the knowledge structure of research domains. An important but under-addressed problem is deciding which keywords to retain as objects of analysis once a large number have been gathered from a domain's publications. In this paper, we discuss the problems with the traditional term frequency (TF) method and introduce two alternatives: TF-inverse document frequency (TF-IDF) and TF-Keyword Activity Index (TF-KAI). Both methods account for how well a keyword discriminates the domain by considering its frequency both inside and outside the domain. To test their performance, we evaluate the keywords each method selects in China's Digital Library domain both qualitatively and quantitatively. The results show that TF-KAI performs best: the keywords it retains match expert selections much more closely and reveal the domain's research specialization in more detail.
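The abstract names the three weighting schemes but does not give their formulas, so the following is only a minimal sketch under stated assumptions. TF is taken as the number of domain publications carrying a keyword, IDF as the log of the inverse document frequency in a background collection, and KAI is assumed to be the keyword's share of keyword occurrences inside the domain divided by its share in the background collection (an activity-index style ratio). The paper's exact definitions may differ. The function names, the toy data, and the assumption that the background collection includes the domain publications are all hypothetical.

```python
import math
from collections import Counter


def score_keywords(domain_docs, background_docs):
    """Score each domain keyword by TF, TF-IDF and TF-KAI (assumed formulas).

    Both arguments are lists of keyword lists, one list per publication.
    background_docs is assumed to contain the domain publications as well.
    """
    # Count a keyword at most once per publication.
    tf_domain = Counter(k for doc in domain_docs for k in set(doc))
    df_all = Counter(k for doc in background_docs for k in set(doc))
    n_domain = sum(tf_domain.values())   # keyword occurrences in the domain
    n_all = sum(df_all.values())         # keyword occurrences overall
    n_docs = len(background_docs)

    scores = {}
    for kw, tf in tf_domain.items():
        # Guard against a background collection that misses the keyword.
        df = max(df_all[kw], tf)
        idf = math.log(n_docs / df)
        # Assumed KAI: share inside the domain / share in the background.
        kai = (tf / n_domain) / (df / n_all)
        scores[kw] = {"tf": tf, "tf_idf": tf * idf, "tf_kai": tf * kai}
    return scores


def rank(scores, method):
    """Return keywords ordered from highest to lowest score for one method."""
    return sorted(scores, key=lambda kw: scores[kw][method], reverse=True)


# Toy data (invented for illustration): "cloud computing" is frequent in the
# domain but even more frequent outside it.
domain_docs = [
    ["digital library", "cloud computing"],
    ["digital library", "cloud computing"],
    ["digital library", "metadata"],
]
background_docs = domain_docs + [["cloud computing", "virtualization"]] * 5

scores = score_keywords(domain_docs, background_docs)
for method in ("tf", "tf_idf", "tf_kai"):
    print(method, rank(scores, method))
```

On this toy data, plain TF ranks "cloud computing" above "metadata", while the two discrimination-aware scores push it below, because most of its occurrences lie outside the domain. That is the kind of effect the abstract attributes to considering frequency both in and out of the domain.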
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science Applications
Authors
Guo Chen, Lu Xiao