Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
383187 | 660807 | 2016 | 12-page PDF | Free download |
• We propose a novel technique for finding biclusters in gene expression data.
• We propose a simple yet effective method for automatically determining discriminating biclusters.
• Our proposed method is robust to noise in the data.
• We evaluate the empirical and biological significance of the extracted biclusters with respect to biological processes.
Biclustering of gene expression data aims at finding localized patterns in a subspace. A bicluster (sometimes called a co-cluster), in the context of gene expression data, is a set of genes that exhibit similar expression intensity under a subset of experimental features (conditions). Most biclustering algorithms proposed in the literature aim at finding sub-matrices that exhibit some sort of coherence by selecting an initial sub-matrix and iteratively adding or removing rows and columns. These algorithms generally depend on the initial, hard selection of the gene and condition clusters. In this work, we adapt a recently proposed approach for clustering textual data to find biclusters in gene expression data. Our proposed technique is based on the concept of co-similarity between genes (and between conditions) that exploits weighted higher-order paths in a bipartite graph representation of the gene expression data. We thus build statistical relations between genes and between conditions by comparing all genes and all conditions before finally extracting biclusters from the data. We show that the proposed technique is able to find meaningful non-overlapping biclusters on both synthetically generated data and real cancer data. Our results indicate that the proposed technique is resistant to noise and can successfully retrieve biclusters even in the presence of a relatively large amount of noise. We also analyze our results with respect to the discovered genes and observe that the extracted biclusters are supported by biological evidence, such as enrichment of gene functions and biological processes.
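The co-similarity idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the cosine-style normalization, the fixed similarity threshold, the connected-component grouping, and the rule that pairs gene groups with condition groups are all assumptions made here for illustration, and the function names (`co_similarity`, `groups`, `mean_block`) are hypothetical. The sketch alternately updates a gene-gene similarity matrix from the condition-condition one and vice versa, so that two genes with no condition in common can still become similar through longer paths in the gene-condition bipartite graph.

```python
from math import sqrt

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]

def normalize(S):
    # Assumption of this sketch: cosine-style scaling so every self-similarity is 1.
    n = len(S)
    d = [sqrt(max(S[i][i], 0.0)) or 1.0 for i in range(n)]
    return [[1.0 if i == j else S[i][j] / (d[i] * d[j]) for j in range(n)]
            for i in range(n)]

def co_similarity(M, n_iter=2):
    """Return (gene-gene, condition-condition) similarity matrices.

    Each iteration recomputes gene similarities from the previous condition
    similarities and vice versa, which extends the paths considered in the
    bipartite graph by one more hop.
    """
    Mt = transpose(M)
    SR, SC = identity(len(M)), identity(len(Mt))
    for _ in range(n_iter):
        # Both updates use the matrices from the previous iteration.
        SR, SC = (normalize(matmul(matmul(M, SC), Mt)),
                  normalize(matmul(matmul(Mt, SR), M)))
    return SR, SC

def groups(S, threshold):
    """Connected components of the graph linking items with similarity > threshold."""
    n, seen, out = len(S), set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            i = stack.pop()
            comp.append(i)
            for j in range(n):
                if j not in seen and S[i][j] > threshold:
                    seen.add(j)
                    stack.append(j)
        out.append(sorted(comp))
    return out

def mean_block(M, rows, cols):
    return sum(M[r][c] for r in rows for c in cols) / (len(rows) * len(cols))

# Toy expression matrix (rows = genes, columns = conditions), made up for this example.
# Genes 0 and 2 share no condition directly; they are linked only through gene 1.
M = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

SR, SC = co_similarity(M, n_iter=2)
gene_groups = groups(SR, threshold=0.3)   # [[0, 1, 2], [3, 4]]
cond_groups = groups(SC, threshold=0.3)   # [[0, 1], [2, 3]]

# Pair each gene group with the condition group where its mean expression is highest.
biclusters = [(g, max(cond_groups, key=lambda c: mean_block(M, g, c)))
              for g in gene_groups]
print(biclusters)  # [([0, 1, 2], [0, 1]), ([3, 4], [2, 3])]
```

After a single iteration the similarity between genes 0 and 2 is zero, because they share no condition. After the second iteration it becomes 0.5, because conditions 0 and 1 have themselves become similar through gene 1. The sketch uses plain Python lists to stay dependency-free; an array library would be the more usual choice for real expression matrices.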
Journal: Expert Systems with Applications - Volume 55, 15 August 2016, Pages 520–531