Article ID | Journal | Published Year | Pages |
---|---|---|---|
484900 | Procedia Computer Science | 2015 | 7 Pages |
Abstract
This research proposes a novel lexical approach to text categorization in the biomedical domain. We propose LKNN (Lexical KNN), an algorithm in which lexemes (tokens) represent medical documents. These tokens are matched against the standard keyword list defined by MeSH (Medical Subject Headings) to classify abstracts, so that medical journal articles are assigned to specific categories automatically. We evaluate the approach on OHSUMED, a collection of medical documents, which serves as the test data. The results show that LKNN outperforms the traditional KNN algorithm in terms of the standard F-measure.
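The abstract gives no implementation details, so the following is only a minimal sketch of how a keyword-filtered nearest-neighbour classifier of this kind might look. The keyword set, the Jaccard similarity, the majority vote, the training examples, and all function names are illustrative assumptions, not the authors' method.

```python
import re
from collections import Counter

# Hypothetical stand-in for a MeSH keyword list; the real vocabulary is far larger.
MESH_KEYWORDS = {"heart", "cardiac", "infarction", "tumor", "carcinoma", "chemotherapy"}


def lexemes(text, vocabulary=MESH_KEYWORDS):
    """Tokenize text and keep only the tokens found in the keyword list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t in vocabulary}


def jaccard(a, b):
    """Set overlap in [0, 1]; an assumed similarity, not taken from the paper."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(abstract, labeled_docs, k=3):
    """Majority vote among the k labeled documents most similar to the abstract."""
    query = lexemes(abstract)
    ranked = sorted(
        labeled_docs,
        key=lambda doc: jaccard(query, lexemes(doc[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up training examples and category labels, for illustration only.
TRAINING = [
    ("Cardiac output after myocardial infarction", "cardiovascular"),
    ("Heart failure and cardiac remodeling", "cardiovascular"),
    ("Chemotherapy response in lung carcinoma", "neoplasms"),
    ("Tumor growth under chemotherapy", "neoplasms"),
]

if __name__ == "__main__":
    print(classify("Outcomes of heart infarction patients", TRAINING))
```

Restricting each document to tokens that appear in the keyword list keeps the comparison focused on domain vocabulary, which is one plausible reading of "lexical" matching here; the paper itself may weight or match tokens differently.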