Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
421893 | 684985 | 2009 | 15-page PDF | Free download |

This paper presents algorithms for topic analysis of news articles. Topic analysis entails category classification as well as topic discovery and classification. Processing news has special requirements that standard classification approaches typically cannot handle. The proposed algorithms support online training for both category and topic classification and can discover new topics as they arise. Both algorithms are based on a keyword extraction algorithm that is applicable to any language with basic morphological analysis tools. As a result, the category classification algorithm and the topic discovery and classification algorithm can easily be applied to multiple languages. Experiments on English and Japanese show that the algorithms achieve high precision and recall.
Journal: Electronic Notes in Theoretical Computer Science - Volume 225, 2 January 2009, Pages 51-65