Article code: 391598
Journal code: 661891
Publication year: 2015
English article: 14 pages, PDF
Full-text version: free download
English title of the ISI article
BulkAligner: A novel sequence alignment algorithm based on graph theory and Trinity
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Sequence alignment is a widely used tool in genomics. With the development of next generation sequencing (NGS) technology, the production of sequence read data has recently increased. A number of read alignment algorithms for handling NGS data have been developed. However, these algorithms suffer from a trade-off between throughput and alignment quality, because processing repeat reads is computationally expensive. In contrast, alignment algorithms on distributed systems such as Hadoop and Trinity can obtain a better throughput than existing algorithms on a single machine without compromising the alignment quality. In this paper, we propose BulkAligner, a novel sequence alignment algorithm on the graph-based in-memory distributed system Trinity. We convert the original reference sequence into graph form and perform sequence alignment by finding the longest paths on the graph. Our experimental results show that BulkAligner has at least 1.8× and up to 57× higher throughput than existing Hadoop-based algorithms, with the same or higher alignment quality. We analyze the scalability and show that we can obtain a better throughput by simply adding machines.
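
The idea of turning a reference into a graph and aligning a read along its longest path can be illustrated with a small single-machine sketch. This is not the BulkAligner algorithm itself, whose details the abstract does not give; it assumes a k-mer graph with exact matches only, and it does not use Trinity or any distributed system. The names `build_kmer_graph` and `align_read` are hypothetical.

```python
from collections import defaultdict


def build_kmer_graph(reference, k):
    """Index every k-mer of the reference by its start positions.

    Each (k-mer, position) pair is a node; the node at position i has an
    implicit edge to the node at position i + 1. This layout is an
    illustrative assumption, not the graph used in the paper.
    """
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def align_read(read, index, k):
    """Find the longest path of consecutive exactly matching k-mer nodes.

    Returns (ref_start, read_start, matched_length) or None if no k-mer
    of the read occurs in the reference.
    """
    best = None          # (ref_start, read_start, matched_length)
    best_nodes = 0       # number of nodes on the best path so far
    prev = {}            # ref position -> path length ending there
    for j in range(len(read) - k + 1):
        cur = {}
        for pos in index.get(read[j:j + k], ()):
            # Extend the path that ended one position earlier, if any.
            nodes = prev.get(pos - 1, 0) + 1
            cur[pos] = nodes
            if nodes > best_nodes:
                best_nodes = nodes
                best = (pos - nodes + 1, j - nodes + 1, nodes + k - 1)
        prev = cur
    return best


reference = "ACGTACGTTAGC"
index = build_kmer_graph(reference, k=3)
hit = align_read("CGTTAG", index, k=3)
print(hit)
```

In this sketch the repeated k-mer `CGT` occurs at two reference positions, and following the longest path picks the occurrence that the rest of the read supports. A practical aligner would also need to tolerate mismatches and gaps, which this example leaves out.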

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 303, 10 May 2015, Pages 120–133