Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
505466 | Computers in Biology and Medicine | 2011 | 5 Pages |
Abstract
Citations are used ubiquitously in biomedical full-text articles and play an important role in representing both the rhetorical structure and the semantic content of the articles. As a result, text mining systems would benefit significantly from a tool that automatically extracts the content of a citation. In this study, we applied a supervised machine-learning algorithm, Conditional Random Fields (CRFs), to automatically parse a citation into its fields (e.g., Author, Title, Journal, and Year). On a subset of open-access PubMed Central articles in HTML format, we report an overall F1-score of 97.95%. The citation parser can be accessed at: http://www.cs.uwm.edu/~qing/projects/cithit/index.html.
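As a rough illustration of the sequence-labeling setup the abstract describes, the sketch below tokenizes a citation, builds a per-token feature dictionary of the kind a CRF toolkit typically consumes, and computes a token-level F1-score for one field. The feature set, the helper names (`tokenize`, `token_features`, `field_f1`), and the example citation are all assumptions made for illustration; they are not taken from the paper, and no CRF is actually trained here.

```python
import re

# Field labels named in the abstract.
FIELDS = ("Author", "Title", "Journal", "Year")


def tokenize(citation):
    """Split a citation string into word tokens and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", citation)


def token_features(tokens, i):
    """Hypothetical feature dict for token i (not the paper's feature set)."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_four_digit_year": bool(re.fullmatch(r"(19|20)\d{2}", tok)),
        "is_punct": not tok[0].isalnum(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
        "rel_pos": round(i / len(tokens), 1),  # coarse position in the citation
    }


def field_f1(gold, pred, field):
    """Token-level F1 for one field, given aligned gold and predicted labels."""
    tp = sum(g == field and p == field for g, p in zip(gold, pred))
    fp = sum(g != field and p == field for g, p in zip(gold, pred))
    fn = sum(g == field and p != field for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up citation, used only to exercise the helpers.
tokens = tokenize("Doe J. An example title. Example Journal. 2011.")
features = [token_features(tokens, i) for i in range(len(tokens))]
```

A CRF would be trained on such per-token feature sequences paired with gold field labels, and the predicted label sequence would then be scored field by field with a measure like `field_f1`.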
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science Applications
Authors
Qing Zhang, Yong-Gang Cao, Hong Yu