Article ID: 484900
Journal: Procedia Computer Science
Published Year: 2015
Pages: 7
File Type: PDF
Abstract

This research proposes a novel lexical approach to text categorization in the biomedical domain. We propose the LKNN (Lexical KNN) algorithm, in which lexemes (tokens) are used to represent medical documents. These tokens are used to classify abstracts by matching them against the standard list of keywords specified as MeSH (Medical Subject Headings). The approach automatically classifies medical journal articles into specific categories. We use the OHSUMED collection of medical documents as test data for evaluating the proposed approach. The results show that LKNN outperforms the traditional KNN algorithm in terms of the standard F-measure.
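
The abstract does not give the details of LKNN, so the following is only a rough sketch of the general idea it describes: represent each document by the tokens that also appear in a keyword list, then label a new document by a vote among its nearest neighbours. The keyword set, the Jaccard similarity, the function names, and the toy training data are all assumptions made for illustration and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical stand-in for a MeSH keyword list; the real vocabulary is far larger.
KEYWORDS = {"heart", "cardiac", "artery", "tumor", "cancer", "chemotherapy"}


def lexemes(text, keywords=KEYWORDS):
    """Lowercase and tokenize the text, keeping only tokens in the keyword list."""
    return {tok for tok in re.findall(r"[a-z]+", text.lower()) if tok in keywords}


def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(text, training, k=3):
    """Majority vote among the k training documents most similar to the text."""
    query = lexemes(text)
    ranked = sorted(
        training,
        key=lambda item: jaccard(query, lexemes(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy labelled abstracts (invented for this example, not from OHSUMED).
TRAINING = [
    ("Cardiac output after artery bypass", "cardiovascular"),
    ("Heart failure and artery stiffness", "cardiovascular"),
    ("Chemotherapy response in tumor cells", "neoplasms"),
    ("Cancer screening and tumor markers", "neoplasms"),
]

if __name__ == "__main__":
    print(classify("Blocked artery near the heart", TRAINING))
```

Restricting the representation to keyword-list tokens keeps the comparison focused on domain vocabulary; a set-overlap measure is one simple choice of similarity here, and the paper may well use a different one.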
