Article ID: 518438
Journal: Journal of Biomedical Informatics
Published Year: 2013
Pages: 8
File Type: PDF
Abstract

• This paper describes the creation of a temporally annotated clinical corpus.
• It is the largest publicly available clinical corpus with full temporal information.
• This paper informs MLP researchers about the temporal data preparation process.
• This paper guides the development of future clinical temporal annotation guidelines.
• It cautions against pitfalls and issues that often arise in temporal representation.

Temporal information in clinical narratives plays an important role in patients’ diagnosis, treatment and prognosis. In order to represent narrative information accurately, medical natural language processing (MLP) systems need to correctly identify and interpret temporal information. To promote research in this area, the Informatics for Integrating Biology and the Bedside (i2b2) project developed a temporally annotated corpus of clinical narratives. This corpus contains 310 de-identified discharge summaries, with annotations of clinical events, temporal expressions and temporal relations. This paper describes the process followed for the development of this corpus and discusses annotation guideline development, annotation methodology, and corpus quality.
