Article code: 5183
Journal code: 347
Publication year: 2015
English article: 8-page PDF
Full-text version: free download
English title of the ISI article
Assembly of repetitive regions using next-generation sequencing data
Persian translation of the title
مونتاژ مناطق تکراری با استفاده از داده های توالی نسل بعدی
Related topics
Engineering and Basic Sciences, Chemical Engineering, Bioengineering
English abstract

High read depth can be used to assemble short sequence repeats. Existing genome assemblers fail in repetitive regions that are longer than the average read. I propose a new algorithm for DNA assembly that uses the relative frequency of reads to properly reconstruct repetitive sequences. The mathematical model for error-free input data gives the upper limits on the accuracy of the results as a function of read coverage. For high coverage, the estimation error depends linearly on the repetitive sequence length and is inversely proportional to the sequencing coverage. The model shows that the smaller the de Bruijn graph dimension, the more accurate the assembly of long repetitive regions. The algorithm requires high read depth, which next-generation sequencers provide, and can use existing data. Tests on error-free reads, generated in silico from several model genomes, showed properly reconstructed repetitive sequences where existing assemblers fail. The C++ sources, the Python scripts and the additional data are available at http://dnaasm.sourceforge.net.
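The idea of using relative read frequency can be illustrated with a minimal Python sketch. It is not the author's algorithm or code from the dnaasm sources. It assumes error-free reads, counts k-mers (the vertices of a de Bruijn graph), and estimates how many times each k-mer occurs in the genome by dividing its count by the median count, which is taken as the single-copy coverage. The function names `count_kmers` and `estimate_multiplicity` are hypothetical, and the median baseline is an assumption that holds only when most k-mers are unique.

```python
from collections import Counter
from statistics import median


def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of error-free reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_multiplicity(counts):
    """Estimate the genomic copy number of each k-mer.

    Assumption (not from the paper): the median k-mer count approximates
    the coverage of a single-copy k-mer, so count / median approximates
    the number of copies in the genome.
    """
    baseline = median(counts.values())
    return {kmer: max(1, round(c / baseline)) for kmer, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical counts: "CGT" is seen about three times as often as the rest.
    example = Counter({"AAA": 10, "AAC": 10, "ACG": 10, "CGT": 30, "GTT": 11})
    for kmer, copies in sorted(estimate_multiplicity(example).items()):
        print(kmer, copies)
```

In this sketch a k-mer with roughly three times the baseline count is treated as occurring three times, which is the kind of information an assembler could use to traverse a repeat the right number of times. The estimate is only as reliable as the coverage is high, in line with the abstract's statement that the error falls as sequencing coverage grows.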

Publisher
Database: Elsevier - ScienceDirect
Journal: Biocybernetics and Biomedical Engineering - Volume 35, Issue 4, 2015, Pages 276–283