Article ID: 517240
Journal: Journal of Biomedical Informatics
Published Year: 2013
Pages: 7 Pages
File Type: PDF
Abstract

Objective: To identify Common Data Elements (CDEs) in eligibility criteria of multiple clinical trials studying the same disease using a human–computer collaborative approach.

Design: A set of free-text eligibility criteria from clinical trials on two representative diseases, breast cancer and cardiovascular diseases, was sampled to identify disease-specific eligibility criteria CDEs. In the proposed approach, a semantic annotator recognizes Unified Medical Language System (UMLS) terms within the eligibility criteria text. The Apriori algorithm is then applied to mine frequent disease-specific UMLS terms, which are filtered by a list of preferred UMLS semantic types, grouped by similarity based on the Dice coefficient, and finally reviewed manually.

Measurements: Standard precision, recall, and F-score of the CDEs recommended by the proposed approach were measured with respect to manually identified CDEs.

Results: Average precision and recall of the recommended CDEs for the two diseases were 0.823 and 0.797, respectively, leading to an average F-score of 0.810. In addition, the machine-powered CDEs covered 80% of the cardiovascular CDEs published by The American Heart Association and assigned by human experts.

Conclusion: It is feasible and effort-saving to use a human–computer collaborative approach to augment domain experts in identifying disease-specific CDEs from free-text clinical trial eligibility criteria.
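As an illustration of two quantities named in the abstract, the Dice coefficient used for grouping similar terms and the precision, recall, and F-score used for evaluation, here is a minimal sketch. It is not the authors' code. The function names are hypothetical, and the choice to compute Dice over lowercase word sets is an assumption, since the abstract does not state the unit of comparison.

```python
# Illustrative sketch only: function names and word-level tokenization are
# assumptions, not taken from the paper.

def dice_coefficient(term_a: str, term_b: str) -> float:
    """Dice coefficient 2*|A & B| / (|A| + |B|) over lowercase word sets."""
    words_a = set(term_a.lower().split())
    words_b = set(term_b.lower().split())
    if not words_a and not words_b:
        return 0.0  # convention chosen here for two empty terms
    return 2 * len(words_a & words_b) / (len(words_a) + len(words_b))


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(recommended: set, reference: set) -> tuple:
    """Score recommended CDEs against a manually identified reference set."""
    true_positives = len(recommended & reference)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall, f_score(precision, recall)


if __name__ == "__main__":
    # Two terms sharing two of their words.
    print(dice_coefficient("breast cancer", "metastatic breast cancer"))
    # Harmonic mean of the average precision and recall reported above.
    print(round(f_score(0.823, 0.797), 3))
```

Applied to the reported averages, the harmonic mean of 0.823 and 0.797 is about 0.8098, which agrees with the reported average F-score of 0.810 to three decimal places. The abstract does not say whether that figure was computed from the averaged precision and recall or by averaging per-disease F-scores.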

Highlights
► A human–computer collaborative method augments domain expertise in CDE identification.
► This method identifies 86% of CDEs published by The American Heart Association.
► UMLS semantic types are useful for selecting relevant CDEs.
