Article code: 415204
Journal code: 681188
Publication year: 2009
English article: 11-page PDF
Full-text version: free download
English title of the ISI article
A permutation test for determining significance of clusters with applications to spatial and gene expression data
Related topics
Engineering and Basic Sciences; Computer Engineering; Computational Theory and Mathematics
English abstract

Hierarchical clustering is a common procedure for identifying structure in a dataset, and it is frequently used for organizing genomic data. Although more advanced clustering algorithms are available, the simplicity and visual appeal of hierarchical clustering have made it ubiquitous in gene expression data analysis. Hence, even minor improvements in this framework would have significant impact. There is currently no simple and systematic way of assessing and displaying the significance of various clusters in a resulting dendrogram without making certain distributional assumptions or ignoring gene-specific variances. In this work, we introduce a permutation test based on comparing the within-cluster structure of the observed data with that of sample datasets obtained by permuting the cluster membership. We carry out this test at each node of the dendrogram using a statistic derived from the singular value decomposition of variance matrices. The p-values thus obtained provide insight into the significance of each cluster division. Given these values, one can also modify the dendrogram by combining non-significant branches. By adjusting the cut-off level of significance for branches, one can produce dendrograms with a desired level of detail for ease of interpretation. We demonstrate the usefulness of this approach by applying it to illustrative datasets.
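
The permutation idea in the abstract can be illustrated with a minimal sketch for a single dendrogram node that splits its observations into two child clusters. The abstract does not give the paper's exact statistic, so this sketch assumes a simple stand-in: the summed trace of the two within-cluster scatter matrices (for a scatter matrix, the trace equals the sum of its singular values). The function names `within_scatter_trace`, `split_statistic`, and `node_p_value`, and the parameters `n_perm` and `seed`, are hypothetical and chosen only for illustration.

```python
import random


def within_scatter_trace(points):
    """Trace of the scatter matrix: summed squared deviations from the centroid."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[j] for p in points) / n for j in range(dim)]
    return sum((p[j] - centroid[j]) ** 2 for p in points for j in range(dim))


def split_statistic(points, labels):
    """Assumed stand-in statistic: total within-cluster scatter of a two-way split."""
    left = [p for p, g in zip(points, labels) if g == 0]
    right = [p for p, g in zip(points, labels) if g == 1]
    return within_scatter_trace(left) + within_scatter_trace(right)


def node_p_value(points, labels, n_perm=999, seed=0):
    """Permutation p-value for one node: how often does a random reassignment
    of cluster membership give clusters at least as tight as the observed split?"""
    rng = random.Random(seed)
    observed = split_statistic(points, labels)
    shuffled = list(labels)  # shuffling keeps the two cluster sizes fixed
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if split_statistic(points, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Usage: two visibly separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
p_value = node_p_value(points, labels)
```

A small p-value here suggests the split is tighter than most random reassignments of membership. Under the approach described in the abstract, a test of this kind would be repeated at each node, and branches whose p-values exceed a chosen cut-off could be merged.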

Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Statistics & Data Analysis - Volume 53, Issue 12, 1 October 2009, Pages 4290–4300