Article ID: 6902214
Journal: Procedia Computer Science
Published Year: 2017
Pages: 8
File Type: PDF
Abstract
Information retrieval has become more difficult as the web grows, which calls for a simple method of classifying web pages. We propose a URL-based approach, since it avoids downloading page contents. Feature weighting methods play an important role in improving classifier performance. In this paper, we explore different weighting methods and conduct experiments on the WebKB dataset. Results show that the tf.mi feature weighting technique achieves an F1 measure of 79% and outperforms the other weighting methods, an improvement of 19.6% over existing work on URL-based classification.
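The abstract does not spell out how tf.mi is computed, so the following is only a minimal sketch under stated assumptions: "mi" is taken to be the mutual information between a token's presence in a URL and the class label, and a token's weight is its term frequency in the URL multiplied by that score. The tokenizer, the function names, and the toy `example.edu` URLs are all hypothetical and are not taken from the paper or from WebKB.

```python
import math
import re
from collections import Counter


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (assumed tokenizer)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def mutual_information(docs, labels):
    """MI in bits between each token's presence/absence and the class label."""
    n = len(docs)
    class_count = Counter(labels)
    term_count = Counter()   # number of URLs containing the token
    joint = Counter()        # number of URLs of class c containing the token
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_count[term] += 1
            joint[(term, label)] += 1

    mi = {}
    for term, n_t in term_count.items():
        score = 0.0
        for c, n_c in class_count.items():
            n_tc = joint[(term, c)]
            # Two cells per class: token present, token absent.
            for n_joint, n_event in ((n_tc, n_t), (n_c - n_tc, n - n_t)):
                if n_joint > 0:  # empty cells contribute nothing
                    score += (n_joint / n) * math.log2(
                        n_joint * n / (n_event * n_c)
                    )
        mi[term] = score
    return mi


def tf_mi_vector(url, mi):
    """Weight each token of a URL as term frequency times its MI score."""
    tf = Counter(tokenize(url))
    return {term: count * mi.get(term, 0.0) for term, count in tf.items()}


# Hypothetical toy data, for illustration only.
urls = [
    "http://cs.example.edu/courses/cs101/",
    "http://cs.example.edu/courses/cs202/index.html",
    "http://cs.example.edu/faculty/smith/",
    "http://cs.example.edu/faculty/jones/index.html",
]
labels = ["course", "course", "faculty", "faculty"]

mi = mutual_information([tokenize(u) for u in urls], labels)
vec = tf_mi_vector("http://cs.example.edu/courses/courses/", mi)
```

On this toy data, a token such as `courses` that appears only in one class scores 1 bit, while `edu`, which appears in every URL, scores 0, so the weighted vector emphasizes the tokens that separate the classes. The resulting dictionaries could then be fed to any standard classifier; which classifier the paper uses is not stated in the abstract.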