Article ID | Journal | Published Year | Pages |
---|---|---|---|
493261 | Procedia Technology | 2012 | 8 Pages |
Abstract
Named entities are the most informative elements of a text document, and identifying them is very important for extracting further information from text. We have developed a conditional random field (CRF) based system to identify named entities in text from homeopathic diagnosis discussion forums. We manually annotated a training corpus for the task. Because manually creating a sufficiently large annotated corpus is costly and time-consuming, we use an active-learning-based semi-supervised framework to improve the efficiency of the system with the help of unannotated data. Our system achieves a highest F-value of 84.35.
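The abstract does not say how sentences are chosen from the unannotated data, so the following is only a minimal sketch of one common active learning strategy (least-confidence sampling), not the authors' method. Everything here is an assumption made for illustration: `train` and `confidence` are toy stand-ins for training a conditional random field (CRF) and scoring a sentence with it, and `annotate` is a hypothetical callback representing the human annotator.

```python
from typing import Callable, List, Set, Tuple

Sentence = List[str]
Labeled = Tuple[Sentence, List[str]]  # (tokens, one tag per token)


def train(labeled: List[Labeled]) -> Set[str]:
    # Toy stand-in for CRF training (assumption): the "model" is simply
    # the set of lowercased tokens seen in the annotated data.
    return {tok.lower() for sent, _tags in labeled for tok in sent}


def confidence(model: Set[str], sent: Sentence) -> float:
    # Toy stand-in for the tagger's confidence in its labeling (assumption):
    # the fraction of tokens in the sentence that the model has seen before.
    if not sent:
        return 1.0
    return sum(tok.lower() in model for tok in sent) / len(sent)


def active_learning(
    labeled: List[Labeled],
    pool: List[Sentence],
    annotate: Callable[[Sentence], List[str]],
    rounds: int,
    batch_size: int,
) -> Tuple[List[Labeled], List[Sentence]]:
    """Repeatedly retrain, pick the least-confident pool sentences,
    have them annotated, and add them to the training set."""
    labeled = list(labeled)
    pool = list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = train(labeled)
        # Stable sort: least confident sentences come first.
        pool.sort(key=lambda s: confidence(model, s))
        batch, pool = pool[:batch_size], pool[batch_size:]
        labeled.extend((s, annotate(s)) for s in batch)
    return labeled, pool
```

In a real system the two stand-ins would be replaced by an actual CRF trainer and a per-sentence probability from that model; the loop structure stays the same. The point of the design is that annotation effort goes to the sentences the current model handles worst, rather than to a random sample.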
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science (General)