کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
9471037 1320058 2005 22 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Multi-class clustering and prediction in the analysis of microarray data
موضوعات مرتبط
علوم زیستی و بیوفناوری علوم کشاورزی و بیولوژیک علوم کشاورزی و بیولوژیک (عمومی)
پیش نمایش صفحه اول مقاله
Multi-class clustering and prediction in the analysis of microarray data
چکیده انگلیسی
DNA microarray technology provides tools for studying the expression profiles of a large number of distinct genes simultaneously. This technology has been applied to sample clustering and sample prediction. Because of a large number of genes measured, many of the genes in the original data set are irrelevant to the analysis. Selection of discriminatory genes is critical to the accuracy of clustering and prediction. This paper considers statistical significance testing approach to selecting discriminatory gene sets for multi-class clustering and prediction of experimental samples. A toxicogenomic data set with nine treatments (a control and eight metals, As, Cd, Ni, Cr, Sb, Pb, Cu, and AsV with a total of 55 samples) is used to illustrate a general framework of the approach. Among four selected gene sets, a gene set ΩI formed by the intersection of the F-test and the set of the union of one-versus-all t-tests performs the best in terms of clustering as well as prediction. Hierarchical and two modified partition (k-means) methods all show that the set ΩI is able to group the 55 samples into seven clusters reasonably well, in which the As and AsV samples are considered as one cluster (the same group) as are the Cd and Cu samples. With respect to prediction, the overall accuracy for the gene set ΩI using the nearest neighbors algorithm to predict 55 samples into one of the nine treatments is 85%.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Mathematical Biosciences - Volume 193, Issue 1, January 2005, Pages 79-100
نویسندگان
, , , , , ,