Article code: 4499824
Journal code: 1624001
Publication year: 2016
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
MGP-HMM: Detecting genome-wide CNVs using an HMM for modeling mate pair insertion sizes and read counts
Related topics
Life Sciences and Biotechnology; Agricultural and Biological Sciences; Agricultural and Biological Sciences (General)
English abstract


• MGP-HMM detects CNVs by discriminating tandem and non-tandem duplications.
• Mixture Gaussian densities model variations in mate pair insertion size and direction.
• Position-specific parametric modeling is applied to estimate the length of each CNV.

Motivation: Copy number variation (CNV) has been associated with schizophrenia, autism, developmental disabilities, and fatal diseases such as cancer. Recent developments in next-generation sequencing (NGS) have facilitated CNV studies. However, many current CNV detection tools cannot discriminate tandem duplications from non-tandem duplications.

Results: In this study, we propose MGP-HMM, a tool that, besides detecting genome-wide deletions, discriminates tandem duplications from non-tandem duplications. MGP-HMM takes mate pair abnormalities into account and predicts the digitized number of tandem or non-tandem copies. Abnormalities in mate pair directions and insertion sizes, observed after the reads are mapped to the reference genome, are explained using a Hidden Markov Model (HMM). For this purpose, a mixture Gaussian density with time-dependent parameters is used to emit mate pair insertion sizes from the HMM states. Depending on the observed abnormalities in mate pair insertion size or orientation, each component of the mixture density has different parameters. MGP-HMM also applies a Poisson distribution to model read depth data. This parametric modeling of the mate pair reads makes it possible to estimate the length of CNVs precisely, which is an advantage over methods that rely only on the read depth approach for CNV detection. The hidden state of the proposed HMM is the digitized copy number of a genomic segment, and the states correspond to the multipliers of the mixture Gaussian components. The accuracy of the model is validated on real and simulated next-generation sequencing data and compared with other tools.
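The abstract describes the model only in outline, so the following is a minimal sketch of the general idea rather than the MGP-HMM implementation. It decodes copy-number states with the Viterbi algorithm, using a Poisson emission for read counts and a Gaussian mixture emission for insert sizes. Everything specific is an assumption made for illustration: the three states, the numeric parameters in `DEPTH_PER_COPY`, `STAY` and `MIXTURES`, the use of one representative insert size per window, the independence of the two observations given the state, and the omission of mate pair orientation.

```python
import math

# Hypothetical copy-number states and parameters, chosen only for illustration.
STATES = [0, 1, 2]        # number of copies of a genomic window
DEPTH_PER_COPY = 30.0     # assumed mean read count per copy per window
STAY = 0.9                # assumed probability of keeping the same state

# Assumed insert-size mixtures per state: (weight, mean, standard deviation).
MIXTURES = {
    0: [(0.30, 3000.0, 300.0), (0.70, 6000.0, 300.0)],
    1: [(0.95, 3000.0, 300.0), (0.05, 6000.0, 300.0)],
    2: [(0.50, 3000.0, 300.0), (0.50, 1000.0, 300.0)],
}


def log_poisson(k, lam):
    """Log probability of observing count k under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def log_mixture(x, components):
    """Log density of a Gaussian mixture at x, computed with log-sum-exp."""
    terms = [
        math.log(w) - math.log(s * math.sqrt(2 * math.pi)) - (x - m) ** 2 / (2 * s * s)
        for w, m, s in components
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def log_emission(state, read_count, insert_size):
    """Joint log likelihood, assuming the two observations are independent given the state."""
    lam = max(DEPTH_PER_COPY * state, 0.5)  # floor keeps log(lam) finite for 0 copies
    return log_poisson(read_count, lam) + log_mixture(insert_size, MIXTURES[state])


def log_transition(prev, cur):
    """Log transition probability: STAY on the diagonal, the rest shared equally."""
    return math.log(STAY if prev == cur else (1 - STAY) / (len(STATES) - 1))


def viterbi(observations):
    """Most likely state sequence for a list of (read_count, insert_size) windows."""
    score = {s: -math.log(len(STATES)) + log_emission(s, *observations[0]) for s in STATES}
    back = []
    for obs in observations[1:]:
        pointers, new_score = {}, {}
        for cur in STATES:
            best = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            pointers[cur] = best
            new_score[cur] = score[best] + log_transition(best, cur) + log_emission(cur, *obs)
        back.append(pointers)
        score = new_score
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Made-up windows: (read count, representative insert size in bp).
windows = [(30, 3000.0), (31, 2900.0), (29, 3100.0), (62, 1000.0),
           (58, 3000.0), (61, 1100.0), (30, 3050.0), (28, 2950.0)]
print(viterbi(windows))
```

With these made-up numbers, the decoded path assigns two copies to the fourth through sixth windows and one copy elsewhere. In the model described in the abstract, the mixture parameters additionally vary with position and reflect mate pair orientation, which this sketch does not attempt.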

Publisher
Database: Elsevier - ScienceDirect
Journal: Mathematical Biosciences - Volume 279, September 2016, Pages 53–62