Clustering of RNA-Seq samples: Comparison study on cancer data

Article code	Journal code	Publication year	English article	Full text
8340153	1541196	2018	11-page PDF	Free download

Keywords

RNA-seq; Cluster analysis; Gene expression; Clustering; Cancer

Related topics

Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Biochemistry

Abstract

RNA-Seq is becoming the standard technology for large-scale measurement of gene expression levels, as it offers a number of advantages over microarrays. Standards for RNA-Seq data analysis are, however, still in their infancy compared to those for microarrays. Clustering, which is essential for understanding gene expression data, has been widely investigated with respect to microarrays. For the clustering of RNA-Seq data, however, a number of questions remain open, resulting in a lack of guidelines for practitioners. Here we evaluate computational steps relevant for clustering cancer samples via an empirical analysis of 15 mRNA-seq datasets. Our evaluation considers strategies regarding expression estimates, the number of genes retained after non-specific filtering, and data transformations. We evaluate the performance of four clustering algorithms and twelve distance measures that are commonly used for gene expression analysis. Results support that clustering cancer samples based on a gene-level quantification should be preferred. Non-specific filtering that leads to a small number of features (1,000) generally yields superior results. Data should be log-transformed prior to cluster analysis. Regarding the choice of clustering algorithm, Average-Linkage and k-medoids generally provide superior recoveries. Although specific cases can benefit from a careful selection of a distance measure, Symmetric Rank-Magnitude correlation provides consistent and sound results in different scenarios.
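The recommendations in the abstract (log-transform, keep a small number of genes, cluster samples with Average-Linkage on a correlation-based distance) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `log2(x + 1)` transform, the use of variance as the non-specific filtering criterion, the order of the two steps, and all function names are assumptions, and Pearson correlation is used as a simple stand-in for the Symmetric Rank-Magnitude correlation named in the abstract.

```python
import math

def log_transform(counts):
    """Apply log2(x + 1) to a samples x genes matrix (assumed transform)."""
    return [[math.log2(v + 1.0) for v in row] for row in counts]

def top_variance_genes(data, n_keep):
    """Indices of the n_keep genes with the highest variance across samples.
    Variance is an assumed criterion for non-specific filtering."""
    n_samples, n_genes = len(data), len(data[0])
    variances = []
    for g in range(n_genes):
        col = [data[s][g] for s in range(n_samples)]
        mean = sum(col) / n_samples
        variances.append(sum((v - mean) ** 2 for v in col) / n_samples)
    order = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return sorted(order[:n_keep])

def pearson_distance(x, y):
    """1 - Pearson correlation (stand-in for the measures compared in the study)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # correlation undefined for a constant profile
    return 1.0 - cov / (sx * sy)

def average_linkage(data, k):
    """Agglomerative clustering of samples with average linkage, cut at k clusters."""
    n = len(data)
    dist = [[pearson_distance(data[i], data[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * n
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def cluster_samples(counts, n_genes=1000, k=2):
    """Log-transform, keep the most variable genes, then cluster the samples."""
    logged = log_transform(counts)
    keep = top_variance_genes(logged, min(n_genes, len(logged[0])))
    filtered = [[row[g] for g in keep] for row in logged]
    return average_linkage(filtered, k)

# Toy data (made up for illustration): 4 samples x 5 genes, two expression patterns.
counts = [
    [100, 5, 80, 10, 50],
    [120, 8, 90, 12, 50],
    [6, 110, 9, 95, 50],
    [4, 130, 11, 85, 50],
]
labels = cluster_samples(counts, n_genes=4, k=2)
print(labels)  # samples 0 and 1 share a label, as do samples 2 and 3
```

On real data the same steps are usually delegated to a numerical library; SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"`, for example, performs the hierarchical step far more efficiently than the quadratic loop above.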

Publisher

Database: Elsevier - ScienceDirect
Journal: Methods - Volume 132, 1 January 2018, Pages 42-49

Authors

Pablo Andretta Jaskowiak, Ivan G. Costa, Ricardo J.G.B. Campello
