Article ID Journal Published Year Pages File Type
518402 Journal of Biomedical Informatics 2013 8 Pages PDF
Abstract

The rapidly growing availability of electronic biomedical data has increased the need for innovative data mining methods. Clustering in particular has been an active area of research in many application areas, but existing clustering algorithms mostly focus on a single modality or representation of the data. Complementary ensemble clustering (CEC) is a recently introduced framework in which Kmeans is applied to a weighted linear combination of the coassociation matrices obtained from separate ensemble clustering of different data modalities. The strength of CEC is that it draws on information from multiple aspects of the data when forming the final clusters. This study assesses the utility of CEC for biomedical data, which often have multiple modalities (e.g., text and images), by applying CEC to two distinct biomedical datasets (PubMed images and radiology reports), each of which has two modalities. Relative to five different clustering approaches based on the Kmeans algorithm, CEC exhibited equal or better performance on micro-averaged precision and Normalized Mutual Information across both datasets. The reference methods included clustering of single modalities as well as ensemble clustering of separate and merged data modalities. Our experimental results suggest that CEC is equivalent to or more efficient than comparable Kmeans-based clustering methods using either single or merged data modalities.
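The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the equal weighting, the number of ensemble runs, and the choice to run the final Kmeans on the rows of the combined coassociation matrix are all assumptions made for illustration, and the Kmeans routine is a plain Lloyd's iteration written with the standard library only.

```python
import random

def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's K-means on lists of floats; returns one label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    prev = None
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if labels == prev:
            break  # assignments stopped changing
        prev = labels
        # Move each center to the mean of its members; keep it if it has none.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def coassociation(points, k, runs, rng):
    """Fraction of K-means runs in which each pair of objects shares a cluster."""
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        labels = kmeans(points, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]

def complementary_ensemble_clustering(mod_a, mod_b, k, weight=0.5, runs=10, seed=0):
    """Hypothetical CEC sketch: weighted sum of two coassociation matrices,
    then K-means on the rows of the combined matrix (an assumption)."""
    rng = random.Random(seed)
    co_a = coassociation(mod_a, k, runs, rng)
    co_b = coassociation(mod_b, k, runs, rng)
    combined = [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(co_a, co_b)
    ]
    return kmeans(combined, k, rng)

# Toy data (invented): six objects described by two modalities.
modality_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 5.0], [5.1, 5.2]]
modality_b = [[1.0], [1.1], [0.9], [4.0], [4.2], [3.9]]

labels = complementary_ensemble_clustering(modality_a, modality_b, k=2)
print(labels)
```

In this sketch the `weight` parameter controls how much each modality contributes to the combined matrix; how the weight is chosen in the actual study is not stated in the abstract.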

Highlights
► An enhancement to our previously proposed method, “Complementary ensemble clustering”.
► Our method combines multiple data modalities from the medical domain to improve clustering.
► Applied to PubMed images and their captions, our method yielded better clusters.
► Applied to radiology reports and UMLS concepts, our method yielded better clusters.
