Article ID: 517026
Journal: Journal of Biomedical Informatics
Published Year: 2016
Pages: 8
File Type: PDF
Abstract

• New distance measure between sets where set items are arranged in a concept hierarchy.
• Natural extension of Jaccard distance to include a concept hierarchy.
• Improved clustering results compared with traditional approaches.

Objective: We introduce a new distance measure that is better suited than traditional methods to detecting similarities in patient records, because it refers to a concept hierarchy.

Materials and methods: The new distance measure improves on distance measures for categorical values by taking the path distance between concepts in a hierarchy into account. We evaluate the new measure and compare it with standard measures on a data set of 836 patients.

Results: The new measure shows marked improvements over the standard measures, both qualitatively and quantitatively. Using the new measure for clustering patient data reveals structure that is otherwise not visible. Statistical comparisons of distances within patient groups with similar diagnoses show that the new measure is significantly better at detecting these similarities than the standard measures.

Conclusion: The new distance measure is an improvement over the current standard whenever a hierarchical arrangement of categorical values is available.
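To make the idea concrete, the following is a minimal sketch of how a set distance can take path distance in a concept hierarchy into account. It is not the formula from the article, which the abstract does not state. The toy hierarchy `PARENT`, the function `hierarchical_set_distance`, and the normalisation by `max_path` are all assumptions chosen for illustration; only the plain Jaccard distance is standard.

```python
# Hypothetical toy concept hierarchy: child -> parent (the root maps to None).
PARENT = {
    "disease": None,
    "cardiac": "disease",
    "respiratory": "disease",
    "angina": "cardiac",
    "infarction": "cardiac",
    "asthma": "respiratory",
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def path_distance(a, b):
    """Number of edges on the path between a and b in the tree."""
    up_b = {c: steps for steps, c in enumerate(ancestors(b))}
    for steps_a, c in enumerate(ancestors(a)):
        if c in up_b:  # first common ancestor
            return steps_a + up_b[c]
    raise ValueError("concepts do not share a root")


def jaccard_distance(x, y):
    """Standard Jaccard distance: 1 - |x & y| / |x | y|."""
    if not x and not y:
        return 0.0
    return 1.0 - len(x & y) / len(x | y)


def hierarchical_set_distance(x, y, max_path):
    """Illustrative only (an assumption, not the article's measure):
    average, over all items of both sets, of the normalised path
    distance to the nearest item in the other set."""
    if not x and not y:
        return 0.0
    if not x or not y:
        return 1.0

    def nearest(item, other):
        return min(path_distance(item, o) for o in other) / max_path

    total = sum(nearest(i, y) for i in x) + sum(nearest(j, x) for j in y)
    return total / (len(x) + len(y))


a, b, c = {"angina"}, {"infarction"}, {"asthma"}
MAX_PATH = 4  # longest leaf-to-leaf path in the toy hierarchy
print(jaccard_distance(a, b), jaccard_distance(a, c))
print(hierarchical_set_distance(a, b, MAX_PATH),
      hierarchical_set_distance(a, c, MAX_PATH))
```

In this toy example the Jaccard distance treats both pairs of records as maximally different, because the sets share no item. The hierarchy-aware sketch rates the two cardiac records as closer to each other than to the respiratory one, which is the kind of similarity the abstract says standard categorical measures miss.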


Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science Applications