| Article ID | Journal | Published Year | Pages |
|---|---|---|---|
| 517112 | Journal of Biomedical Informatics | 2014 | 16 Pages |
• Analysis of measures used to estimate the similarity of biomedical concepts.
• Definition of a theoretical framework unifying several semantic measure paradigms.
• Identification of the core elements commonly used for semantic similarity design.
• Benefits for studying, defining and improving semantic measures are highlighted.
• Comparison of hundreds of semantic measures using SNOMED-CT healthcare terminology.
Ontologies are widely adopted in the biomedical domain to characterize various resources (e.g. diseases, drugs, scientific publications) with unambiguous meanings. By exploiting the structured knowledge that ontologies provide, a plethora of ad hoc and domain-specific semantic similarity measures have been defined in recent years. Nevertheless, some critical questions remain: which measure should be defined or chosen for a concrete application? Are some measures that appear different a priori in fact equivalent? To shed light on these questions, we perform an in-depth analysis of existing ontology-based measures to identify the core elements of semantic similarity assessment. As a result, this paper presents a unifying framework that aims to improve the understanding of semantic measures, to highlight their equivalences and to propose bridges between their theoretical bases. By demonstrating that groups of measures are just particular instantiations of parameterized functions, we unify a large number of state-of-the-art semantic similarity measures through common expressions. The application of the proposed framework and its practical usefulness are underlined by an empirical analysis of hundreds of semantic measures in a biomedical context.
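The idea that several measures can be instantiations of one parameterized function can be illustrated with a small sketch. The example below is not taken from the paper: it uses Tversky's ratio model over feature sets, a well-known parameterized similarity function, and shows that the Jaccard index and the Dice coefficient fall out of it for particular parameter values. The concept names, the use of ancestor sets as features, and the choice of set cardinality as the salience function are all assumptions made for illustration.

```python
from typing import Set


def tversky_ratio(a: Set[str], b: Set[str], alpha: float, beta: float) -> float:
    """Tversky's ratio model, using set cardinality as the salience function.

    sim(a, b) = |a & b| / (|a & b| + alpha * |a - b| + beta * |b - a|)
    """
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    denom = common + alpha * only_a + beta * only_b
    # Convention chosen for this sketch: similarity is 0 when the denominator is 0.
    return common / denom if denom else 0.0


def jaccard(a: Set[str], b: Set[str]) -> float:
    # alpha = beta = 1 gives |a & b| / |a | b|, the Jaccard index.
    return tversky_ratio(a, b, alpha=1.0, beta=1.0)


def dice(a: Set[str], b: Set[str]) -> float:
    # alpha = beta = 0.5 gives 2|a & b| / (|a| + |b|), the Dice coefficient.
    return tversky_ratio(a, b, alpha=0.5, beta=0.5)


# Hypothetical feature sets: each concept is described by itself plus its
# ancestors in a toy taxonomy (names are illustrative, not SNOMED-CT codes).
features_u = {"disease", "infection", "viral_infection", "influenza"}
features_v = {"disease", "infection", "bacterial_infection", "tuberculosis"}

jaccard_uv = jaccard(features_u, features_v)
dice_uv = dice(features_u, features_v)
```

Under these assumptions, two measures that look different on paper differ only in the weights given to the non-shared features, which is the kind of equivalence a unifying framework makes explicit.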
