Article code: 530486
Journal code: 869770
Publication year: 2010
English article: 8 pages, PDF
Full-text version: free download
English title of the ISI article
Clustering of temporal gene expression data by regularized spline regression and an energy based similarity measure
Related topics
Engineering and Basic Sciences / Computer Engineering / Computer Vision and Pattern Recognition
English abstract

Clustering analysis of temporal gene expression data is widely used to study dynamic biological systems, for example to identify sets of genes that are regulated by the same mechanism. However, temporal gene expression data often contain noise, missing data points, and non-uniformly sampled time points, which makes it difficult for traditional clustering methods to extract meaningful information. In this paper, we introduce an improved clustering approach based on regularized spline regression and an energy-based similarity measure. The proposed approach models each gene expression profile as a B-spline expansion, whose coefficients are estimated from the observed data by a regularized least-squares scheme. To compensate for the limited information in noisy, short gene expression profiles, we use each gene's correlated genes as the test set for choosing the optimal number of basis functions and the regularization parameter. We show that this treatment helps to avoid over-fitting. After fitting continuous representations of the gene expression profiles, we cluster them using an energy-based similarity measure. This measure captures temporal information and the relative changes of the time series through their first and second derivatives. We demonstrate that our method is robust to noise and produces meaningful clustering results.
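
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state the exact penalty or the exact form of the energy measure, so the second-difference penalty on the coefficients, the weighted squared-derivative distance, and every name and parameter below (`fit_profile`, `energy_distance`, `n_basis`, `lam`, `weights`) are assumptions chosen for illustration. The correlated-gene selection of `n_basis` and `lam` is not shown.

```python
import numpy as np
from scipy.interpolate import BSpline


def clamped_knots(a, b, n_basis, degree=3):
    """Uniform clamped knot vector supporting n_basis B-splines on [a, b]."""
    n_interior = n_basis - degree - 1
    interior = np.linspace(a, b, n_interior + 2)[1:-1]
    return np.concatenate([[a] * (degree + 1), interior, [b] * (degree + 1)])


def fit_profile(times, values, n_basis=6, lam=1e-2, degree=3):
    """Fit one expression profile as a B-spline by penalized least squares."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    keep = ~np.isnan(values)                 # skip missing observations
    t, y = times[keep], values[keep]
    knots = clamped_knots(times.min(), times.max(), n_basis, degree)
    # Design matrix: column j holds the j-th basis function evaluated at t.
    B = BSpline(knots, np.eye(n_basis), degree)(t)
    # Assumed regularizer: squared second differences of the coefficients.
    D = np.diff(np.eye(n_basis), n=2, axis=0)
    coef = np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)
    return BSpline(knots, coef, degree)


def energy_distance(f, g, a, b, weights=(1.0, 1.0, 1.0), n_grid=201):
    """Assumed energy-type distance: weighted squared differences of the
    curves and of their first and second derivatives, integrated on [a, b]."""
    x = np.linspace(a, b, n_grid)
    total = np.zeros_like(x)
    for order, w in enumerate(weights):
        total += w * (f(x, nu=order) - g(x, nu=order)) ** 2
    dx = x[1] - x[0]
    integral = dx * (total.sum() - 0.5 * (total[0] + total[-1]))  # trapezoid rule
    return float(np.sqrt(integral))


# Synthetic example: non-uniform sampling, noise, and one missing value.
times = np.array([0, 1, 2, 4, 6, 9, 12, 18, 24], dtype=float)
rng = np.random.default_rng(0)
gene_a = np.sin(times / 4) + 0.05 * rng.standard_normal(times.size)
gene_a[3] = np.nan
gene_b = gene_a + 1.0                        # same shape, shifted baseline

f, g = fit_profile(times, gene_a), fit_profile(times, gene_b)
d_full = energy_distance(f, g, 0.0, 24.0)
d_shape = energy_distance(f, g, 0.0, 24.0, weights=(0.0, 1.0, 1.0))
print(d_full, d_shape)
```

In this synthetic case the two profiles differ only by a constant offset, so the derivative-only distance `d_shape` is essentially zero while `d_full` reflects the baseline shift. That is one way a derivative-based measure can emphasize relative changes over absolute expression level. A pairwise matrix of such distances could then be passed to any standard clustering routine.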

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 43, Issue 12, December 2010, Pages 3969–3976