Article ID | Journal | Published Year | Pages |
---|---|---|---|
5590069 | Genomics | 2017 | 24 Pages |
Abstract
The massive volume of data produced by next-generation sequencing (NGS) technology is widely used in biological research and medical diagnosis. A crucial step in NGS analysis is read alignment (mapping), which is computationally intensive and complex. Mapping bias tends to affect downstream analysis, including the detection of polymorphisms. To provide guidelines that help biologists select a suitable aligner, we evaluated and benchmarked five aligners (BWA, Bowtie2, NovoAlign, Smalt and Stampy) and their mapping bias based on the characteristics of five microbial genomes. Two million simulated read pairs of various sizes (36 bp, 50 bp, 72 bp, 100 bp, 125 bp, 150 bp, 200 bp, 250 bp and 300 bp) were aligned. Specific alignment features were evaluated: sensitivity of mapping, percentage of properly paired reads, alignment time, and the effect of tandem repeats on incorrectly mapped reads. BWA was the fastest aligner, followed by Bowtie2 and Smalt; NovoAlign and Stampy were comparatively slower. Most of the aligners showed high sensitivity when mapping long reads (> 100 bp). NovoAlign, on the other hand, showed higher sensitivity for both short reads (36 bp, 50 bp, 72 bp) and long reads (> 100 bp); it also showed higher sensitivity when mapping a complex genome such as Plasmodium falciparum. The percentages of properly paired reads aligned by NovoAlign, BWA and Stampy were markedly higher. No aligner outperformed the others across the benchmark; however, the aligners performed differently depending on genome characteristics. We expect that the results of this study will help end users choose an aligner and thus enhance the accuracy of read mapping.
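As a rough illustration of two of the metrics named in the abstract, the sketch below computes mapping sensitivity and the percentage of properly paired reads for a handful of simulated reads. The abstract does not give the study's exact definitions, so everything here is an assumption: sensitivity is taken as the fraction of simulated reads mapped within a small tolerance of their true origin, and the `ReadResult` record, the `tolerance` parameter and the toy data are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReadResult:
    """Outcome for one simulated read (hypothetical record layout)."""
    true_pos: int               # position the read was simulated from
    mapped_pos: Optional[int]   # position reported by the aligner, None if unmapped
    properly_paired: bool       # whether the aligner flagged the pair as proper


def sensitivity(results: List[ReadResult], tolerance: int = 5) -> float:
    """Fraction of reads mapped within `tolerance` bp of their true origin.

    This definition is an assumption, not necessarily the one used in the study.
    """
    if not results:
        raise ValueError("no reads to evaluate")
    correct = sum(
        1
        for r in results
        if r.mapped_pos is not None and abs(r.mapped_pos - r.true_pos) <= tolerance
    )
    return correct / len(results)


def percent_properly_paired(results: List[ReadResult]) -> float:
    """Percentage of reads whose pair was flagged as properly paired."""
    if not results:
        raise ValueError("no reads to evaluate")
    return 100.0 * sum(r.properly_paired for r in results) / len(results)


# Toy data: two correct mappings, one misplaced read, one unmapped read.
demo = [
    ReadResult(true_pos=100, mapped_pos=100, properly_paired=True),
    ReadResult(true_pos=200, mapped_pos=203, properly_paired=True),
    ReadResult(true_pos=300, mapped_pos=900, properly_paired=False),
    ReadResult(true_pos=400, mapped_pos=None, properly_paired=False),
]

print(sensitivity(demo))
print(percent_properly_paired(demo))
```

In practice the true origin would come from the read simulator's output and the mapped position and pairing flag from the aligner's SAM/BAM records; those parsing steps are left out to keep the sketch self-contained.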
Related Topics
Life Sciences
Biochemistry, Genetics and Molecular Biology
Genetics
Authors
Subazini Thankaswamy-Kosalai, Partho Sen, Intawat Nookaew