Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
2054139 | 1075606 | 2010 | 10-page PDF | Free download |
When comparing environmental sequences with fully identified reference sequences, a common practice has been to rely on threshold values for sequence similarity. We develop a modelling approach that uses the self-consistency of the reference database to translate sequence similarity into the probability of correct identification at a given taxonomic level. We model separately the probability that the focal species is in the reference database, and the probability that the best BLAST hit is correct, conditional on the species being in the reference database. We illustrate our approach in the context of 454 sequencing data on dead wood-inhabiting fungi, with a reference database containing 2,262 ITS sequences of 1,145 species. We compare the species communities observed by 454 pyrosequencing, DGGE fingerprinting and fruit-body inventory. High-throughput sequencing calls for automated species identification with adequate assessment of identification error. Our results highlight that this is possible if a high-quality reference database with broad coverage is available.
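The two-part structure described in the abstract can be illustrated with a minimal sketch. At the species level, an identification can only be correct if the species is in the reference database, so the two modelled probabilities multiply. The abstract does not give the functional form or fitted parameters, so everything specific here is an assumption made for illustration: the logistic link, the coefficient values, and the names `logistic` and `p_correct_species` are hypothetical and are not taken from the paper.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def p_correct_species(
    similarity: float,
    db_coef: tuple = (-20.0, 0.22),   # hypothetical intercept and slope
    hit_coef: tuple = (-15.0, 0.18),  # hypothetical intercept and slope
) -> float:
    """Probability that the best BLAST hit names the correct species.

    similarity: percent identity of the best hit (0 to 100).
    Both component models are ASSUMED to be logistic in similarity;
    the coefficients are placeholders, not fitted values.
    """
    a0, a1 = db_coef
    b0, b1 = hit_coef
    # P(focal species is represented in the reference database)
    p_in_db = logistic(a0 + a1 * similarity)
    # P(best hit is correct | species is in the reference database)
    p_hit_given_db = logistic(b0 + b1 * similarity)
    # If the species is absent, a species-level hit cannot be correct,
    # so the total probability reduces to the product of the two terms.
    return p_in_db * p_hit_given_db


if __name__ == "__main__":
    for s in (90.0, 95.0, 97.0, 100.0):
        print(f"similarity {s:5.1f}% -> P(correct) = {p_correct_species(s):.3f}")
```

Unlike a fixed similarity threshold, this kind of output is a probability attached to each hit, so identifications can be accepted, rejected or weighted according to an explicit error tolerance. For higher taxonomic levels the simple product would not apply as written, since a hit can be correct at genus level even when the species itself is missing from the database.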
Journal: Fungal Ecology - Volume 3, Issue 4, November 2010, Pages 274–283