Article ID: 4966774
Journal: Journal of Biomedical Informatics
Published Year: 2017
Pages: 10
File Type: PDF
Abstract

• We propose a system to predict coronary artery disease from clinical narratives.
• We employ an ontology-guided approach to feature extraction.
• The system achieves state-of-the-art performance of 77.4% F1-score.

Coronary Artery Disease (CAD) is not only the most common form of heart disease, but also the leading cause of death in both men and women (Coronary Artery Disease: MedlinePlus, 2015). We present a system that automatically predicts whether patients will develop coronary artery disease based on their narrative medical histories, i.e., clinical free text. Although the free text in medical records has been used in several studies to identify risk factors for coronary artery disease, to the best of our knowledge our work marks the first attempt at automatically predicting the development of CAD. We tackle this task on a small corpus of diabetic patients. Because the corpus is small, it is important to limit the number of features in order to avoid overfitting. We propose an ontology-guided approach to feature extraction and compare it with two classic feature selection techniques. Our system achieves a state-of-the-art F1 score of 77.4%.
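
To make the idea of limiting features on a small corpus concrete, here is a minimal sketch of a classic filter-style feature selection step together with the F1 metric. The abstract does not name the two classic techniques it compares against, so the chi-squared ranking used here is an assumption, as are the function names and the toy data, which are invented purely for illustration and are not taken from the paper.

```python
from collections import Counter

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def chi2_score(feature, labels):
    """Chi-squared statistic of one binary feature against binary labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    score = 0.0
    for f in (0, 1):
        for y in (0, 1):
            row_total = sum(1 for v in feature if v == f)
            col_total = sum(1 for v in labels if v == y)
            expected = row_total * col_total / n
            if expected > 0:
                score += (observed[(f, y)] - expected) ** 2 / expected
    return score

def select_top_k(X, y, k):
    """Return the indices of the k features with the highest chi-squared score."""
    n_features = len(X[0])
    scores = [chi2_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]

# Toy data (hypothetical): 6 patients, 3 binary features, label 1 = develops CAD.
X = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
    [0, 1, 1],
]
y = [1, 1, 1, 0, 0, 0]

# Keep only the single most informative feature.
selected = select_top_k(X, y, k=1)

# Score a hypothetical set of predictions against the true labels.
example_f1 = f1_score(y, [1, 1, 0, 1, 0, 0])
```

In this toy example the first feature perfectly separates the two classes, so it ranks highest. An ontology-guided approach, as proposed in the abstract, would instead restrict the candidate features using domain knowledge before any such statistical ranking.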

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science Applications