Article code: 4496160
Journal code: 1623860
Publication year: 2014
English article: 8 pages (PDF, free download)
English title of the ISI article
Maximum likelihood model based on minor allele frequencies and weighted Max-SAT formulation for haplotype assembly
Related topics
Life Sciences and Biotechnology; Agricultural and Biological Sciences; Agricultural and Biological Sciences (General)
Abstract


• New probabilistic models for haplotype assembly.
• Based on the maximum likelihood paradigm using minor allele frequencies.
• A theoretical support for the minimum error correction model.
• A weighted Max-SAT formulation for a simplified model.
• Accuracy improvement confirmed by experimental results.

Human haplotypes carry essential information about SNPs, which in turn is valuable for studies that relate diseases to their potential genetic causes, e.g., Genome Wide Association Studies. Because determining haplotypes directly is expensive and high-throughput sequencing has advanced rapidly, interest has grown in haplotype assembly, the problem of reconstructing a pair of haplotypes from a set of aligned fragments. Although the problem has been studied extensively and a number of algorithms have been proposed, more accurate methods remain valuable because of the importance of haplotype information. In this paper, we first develop a probabilistic model that incorporates the Minor Allele Frequency (MAF) of SNP sites, which existing maximum likelihood models omit. We then show that the probabilistic model reduces to the Minimum Error Correction (MEC) model when the MAF information is omitted and some approximations are made. This result provides novel theoretical support for the MEC model, despite criticisms of it in the recent literature. Next, under the same approximations, we simplify the model to an extension of MEC that uses the MAF information. Finally, we extend the haplotype assembly algorithm HapSAT by developing a weighted Max-SAT formulation for the simplified model, which is evaluated empirically with positive results.
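
To make the MEC objective mentioned in the abstract concrete, the following is a minimal Python sketch of the standard MEC score together with a brute-force search over haplotype pairs. It is not the paper's MAF-based model, nor the HapSAT Max-SAT encoding, neither of which the abstract specifies in detail. The function names, the `-` gap symbol, the binary allele coding, and the assumption that every site is heterozygous (so the second haplotype is the complement of the first) are all illustrative choices, and the fragment data is invented for the example.

```python
from itertools import product

GAP = "-"  # assumed symbol for a SNP site not covered by a fragment


def hamming(fragment, haplotype):
    """Count allele mismatches between a fragment and a haplotype, skipping gaps."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def mec_score(fragments, h1, h2):
    """MEC cost: each fragment is charged its distance to the closer haplotype."""
    return sum(min(hamming(f, h1), hamming(f, h2)) for f in fragments)


def complement(haplotype):
    """Flip every binary allele (assumes all sites are heterozygous)."""
    return "".join("1" if a == "0" else "0" for a in haplotype)


def brute_force_assembly(fragments):
    """Try every haplotype h1 with h2 = complement(h1); exponential, toy sizes only."""
    n_sites = len(fragments[0])
    best = None
    for bits in product("01", repeat=n_sites):
        h1 = "".join(bits)
        h2 = complement(h1)
        score = mec_score(fragments, h1, h2)
        if best is None or score < best[0]:
            best = (score, h1, h2)
    return best


# Invented toy data: five aligned fragments over four SNP sites.
fragments = ["010-", "-011", "0100", "1011", "01-1"]
print(brute_force_assembly(fragments))  # prints (1, '0100', '1011')
```

In this toy input the last fragment disagrees with both haplotypes in at least one site, so one correction is unavoidable. A MAF-aware extension would presumably weight such corrections per site rather than counting them equally, but the exact weights are defined in the paper, not here.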

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Theoretical Biology - Volume 350, 7 June 2014, Pages 49–56