Article ID: 1123307
Journal: Procedia - Social and Behavioral Sciences
Published Year: 2011
Pages: 10 Pages
File Type: PDF
Abstract

We propose a new method of content-based document recommendation using data compression. Previous studies mainly used bag-of-words representations to calculate the similarity between the user profile and target documents, but users in fact focus on units larger than words when searching for information in documents. Our method uses data compression to take such larger units into account. Experiments on Japanese newspaper corpora showed that (a) data compression performed better than the bag-of-words method, especially when the number of topics was large; (b) our new method outperformed the previous data compression method; and (c) combining data compression with bag-of-words can also improve performance. We conclude that our method better captures users’ profiles and thus contributes to a better document recommendation system.
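
The abstract does not say which compression-based measure is used, so the following is only a minimal sketch of one well-known compression-based similarity, the normalized compression distance (NCD), computed here with Python's `zlib`. It is not the authors' method. The names `compressed_size`, `ncd`, and `recommend` are illustrative assumptions, and treating the user profile as a single string is likewise an assumption.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: smaller means more similar.

    If y shares long substrings with x, compressing x + y costs little
    more than compressing x alone, so the distance is small.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def recommend(profile: str, candidates: list[str], top_k: int = 3) -> list[str]:
    """Return the top_k candidates closest to the profile text.

    Assumption: the profile is simply the text the user has already
    read, joined into one string.
    """
    return sorted(candidates, key=lambda doc: ncd(profile, doc))[:top_k]
```

Because the compressor works on byte sequences rather than tokens, a sketch like this needs no word segmentation, and repeated multi-word phrases contribute to the similarity directly. Note that `zlib` only finds matches within a 32 KB window, so very long profiles would need a different compressor or a truncated profile.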
