کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
15227 1393 2011 8 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
An integer programming approach to DNA sequence assembly
موضوعات مرتبط
مهندسی و علوم پایه مهندسی شیمی بیو مهندسی (مهندسی زیستی)
پیش نمایش صفحه اول مقاله
An integer programming approach to DNA sequence assembly
چکیده انگلیسی

De novo sequence assembly is a ubiquitous combinatorial problem in all DNA sequencing technologies. In the presence of errors in the experimental data, the assembly problem is computationally challenging, and its solution may not lead to a unique reconstruct. The enumeration of all alternative solutions is important in drawing a reliable conclusion on the target sequence, and is often overlooked in the heuristic approaches that are currently available. In this paper, we develop an integer programming formulation and global optimization solution strategy to solve the sequence assembly problem with errors in the data. We also propose an efficient technique to identify all alternative reconstructs. When applied to examples of sequencing-by-hybridization, our approach dramatically increases the length of DNA sequences that can be handled with global optimality certificate to over 10,000, which is more than 10 times longer than previously reported. For some problem instances, alternative solutions exhibited a wide range of different ability in reproducing the target DNA sequence. Therefore, it is important to utilize the methodology proposed in this paper in order to obtain all alternative solutions to reliably infer the true reconstruct. These alternative solutions can be used to refine the obtained results and guide the design of further experiments to correctly reconstruct the target DNA sequence.

Figure optionsDownload as PowerPoint slideHighlights
► We developed a novel integer programming framework for DNA sequence assembly.
► The formulation effectively identifies biologically meaningful reconstructs.
► The strategy provides all alternative global optima.
► We demonstrate that alternative optima are critical for correct reconstruction.
► The approach reconstructs 10 times longer DNA sequences than previously reported.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computational Biology and Chemistry - Volume 35, Issue 4, 10 August 2011, Pages 251–258
نویسندگان
, ,