Article code: 5906624
Journal code: 1159981
Publication year: 2013
English article: 5 pages, PDF
Full-text version: free download
English title of the ISI article
Hap-seqX: Expedite algorithm for haplotype phasing with imputation using sequence data
Related topics
Life sciences and biotechnology; Biochemistry, genetics and molecular biology; Genetics
English abstract

Haplotype phasing is one of the most important problems in population genetics: haplotypes can be used to estimate the relatedness of individuals and to impute genotype information, an analysis commonly performed when searching for variants involved in disease. The problem of haplotype phasing has been well studied. Methods for haplotype inference from sequencing data either combine a set of reference haplotypes with collected genotypes using a Hidden Markov Model (HMM) or assemble haplotypes from overlapping sequencing reads. A recent algorithm, Hap-seq, uses both sequencing data and reference haplotypes; it is a hybrid of a dynamic programming algorithm and an HMM, and has been shown to be optimal. However, the algorithm requires an extremely large amount of memory, which is impractical for whole-genome datasets. It must save intermediate results to disk and read them back when needed, which significantly limits its practicality. In this work, we propose Hap-seqX, an expedited version of the algorithm that addresses the memory issue by using a posterior probability to select the records that should be kept in memory. We show that Hap-seqX can hold all of its intermediate results in memory and dramatically improves the execution time of the algorithm. With this strategy, Hap-seqX is able to predict haplotypes from whole-genome sequencing data.

► We studied the problem of haplotype phasing.
► Our method uses both sequencing data and reference haplotypes.
► Hap-seqX is a hybrid method of dynamic programming and HMM.
► We used posterior probability to improve the memory efficiency.
► Hap-seqX is much more memory efficient than Hap-seq.
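
The abstract does not spell out how the posterior probability is used to select records, so the following is only a generic sketch of the idea, not the Hap-seqX implementation. It assumes a toy three-state HMM, computes the posterior of each state at each position with the standard forward-backward procedure, and keeps in memory only the states whose posterior reaches a cutoff. The names `state_posteriors` and `prune_by_posterior`, the toy parameters, and the cutoff of 0.2 are all hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def _normalise(values: List[float]) -> List[float]:
    """Scale a list of non-negative weights so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def state_posteriors(init: List[float], trans: Matrix, emit: Matrix,
                     obs: List[int]) -> Matrix:
    """Forward-backward on a small HMM: P(state at t | all observations)."""
    n = len(init)
    # Forward pass, rescaled at every position to avoid underflow.
    fwd = [_normalise([init[s] * emit[s][obs[0]] for s in range(n)])]
    for o in obs[1:]:
        prev = fwd[-1]
        fwd.append(_normalise([
            emit[s][o] * sum(prev[r] * trans[r][s] for r in range(n))
            for s in range(n)
        ]))
    # Backward pass, also rescaled at every position.
    bwd = [[1.0] * n for _ in obs]
    for t in range(len(obs) - 2, -1, -1):
        nxt, o = bwd[t + 1], obs[t + 1]
        bwd[t] = _normalise([
            sum(trans[s][r] * emit[r][o] * nxt[r] for r in range(n))
            for s in range(n)
        ])
    # The posterior is proportional to forward * backward at each position.
    return [_normalise([f * b for f, b in zip(fwd[t], bwd[t])])
            for t in range(len(obs))]


def prune_by_posterior(posteriors: Matrix,
                       cutoff: float) -> List[Dict[int, float]]:
    """Per position, keep only the states whose posterior is >= cutoff."""
    return [{s: p for s, p in enumerate(row) if p >= cutoff}
            for row in posteriors]


# Toy parameters (assumed for illustration only).
init = [1 / 3, 1 / 3, 1 / 3]
trans = [[0.90, 0.05, 0.05],
         [0.05, 0.90, 0.05],
         [0.05, 0.05, 0.90]]
emit = [[0.99, 0.01],   # state 0 almost always emits symbol 0
        [0.50, 0.50],   # state 1 is uninformative
        [0.01, 0.99]]   # state 2 almost always emits symbol 1
obs = [0, 0, 0, 0]

kept = prune_by_posterior(state_posteriors(init, trans, emit, obs), cutoff=0.2)
stored = sum(len(states) for states in kept)
print(f"stored {stored} of {len(obs) * len(init)} state records")
```

The cutoff trades memory for accuracy: a higher value stores fewer records per position but risks discarding a state that the best overall path would have needed.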

Publisher
Database: Elsevier - ScienceDirect
Journal: Gene - Volume 518, Issue 1, 10 April 2013, Pages 2-6