Article ID: 2089905
Journal: Journal of Microbiological Methods
Published Year: 2015
Pages: 8 Pages
File Type: PDF
Abstract

• We developed a metagenome assembly method based on probabilistic base choices.
• Our method uses reads that are clustered with the help of reference protein sequences.
• Evaluations showed that our method is superior to traditional approaches.

Metagenomics, the study of environmental microbial communities, has attracted considerable attention owing to recent advances in next-generation sequencing (NGS) technologies. Microbes play a critical role in changing their environments, and the way they exert this effect can be clarified by investigating metagenomes. However, the complexity of metagenomes, such as the mixture of multiple microbes and differences in species abundance, makes metagenome assembly challenging.

In this paper, we developed a new metagenome assembly method that uses protein sequences in addition to NGS read sequences. Our method (i) builds read clusters using mapping information against available protein sequences, and (ii) creates contig sequences by finding consensus sequences through probabilistic choices from the read clusters. Using NGS reads simulated from real microbial genome sequences, we evaluated our method against four existing assembly programs. We found that our method could generate relatively long and accurate metagenome assemblies, indicating that using protein sequences as a guide for assembly is promising.
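The abstract does not spell out how the probabilistic choice in step (ii) is made, so the following is only an illustrative sketch and not the authors' implementation. It assumes that each read in a cluster has already been placed at an offset (for example, derived from its mapping to a reference protein), and that the base at each column is sampled in proportion to how often it is observed there. The names `column_counts` and `probabilistic_consensus` are hypothetical.

```python
import random
from collections import Counter


def column_counts(placed_reads):
    """Count the bases seen in each column.

    placed_reads: iterable of (offset, sequence) pairs. The offsets are an
    assumption of this sketch; the abstract does not say how reads are placed.
    """
    columns = {}
    for offset, seq in placed_reads:
        for i, base in enumerate(seq):
            columns.setdefault(offset + i, Counter())[base] += 1
    return columns


def probabilistic_consensus(placed_reads, rng=None):
    """Build a consensus by sampling one base per column.

    Each base is drawn with probability proportional to its count in that
    column (an assumed scheme). Columns not covered by any read become 'N'.
    """
    rng = rng or random.Random()
    columns = column_counts(placed_reads)
    if not columns:
        return ""
    consensus = []
    for pos in range(min(columns), max(columns) + 1):
        counts = columns.get(pos)
        if counts is None:
            consensus.append("N")  # uncovered column
            continue
        bases = sorted(counts)  # fixed order keeps seeded runs reproducible
        weights = [counts[b] for b in bases]
        consensus.append(rng.choices(bases, weights=weights, k=1)[0])
    return "".join(consensus)


# Three overlapping reads that agree in every column.
cluster = [(0, "ATGGCA"), (2, "GGCATT"), (4, "CATTGA")]
print(probabilistic_consensus(cluster, random.Random(0)))  # ATGGCATTGA
```

When all reads agree, as in the example cluster, every column has a single candidate base and the result is deterministic. Where reads disagree, repeated runs can yield different consensus sequences, which is the behavior a probabilistic choice implies; a real assembler would also have to handle insertions, deletions, and base quality, none of which are modeled here.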

Related Topics
Life Sciences > Biochemistry, Genetics and Molecular Biology > Biotechnology