Article ID: 5920189
Journal: Molecular Phylogenetics and Evolution
Published Year: 2013
Pages: 12 Pages
File Type: PDF
Abstract

The number of sequences from both formally described taxa and uncultured environmental DNA deposited in the International Nucleotide Sequence Databases has increased substantially over the last two decades. Although the majority of these sequences represent authentic gene copies, there is evidence of DNA artifacts in these databases as well. These include lab artifacts, such as PCR chimeras, and biological artifacts such as pseudogenes or other paralogous sequences. Sequences that fall in basal positions in phylogenetic trees and appear distant from known sequences are particularly suspect. Phylogenetic analyses suggest that a novel sequence type (NS1) found in two boreal forest soil clone libraries belongs to the fungal kingdom but does not fall unambiguously within any known phylum. We have evaluated this sequence type using an array of secondary-structure analyses. To our knowledge, such analyses have never been used on environmental ribosomal sequences. Ribosomal secondary structure was modeled for four rRNA loci (ITS1, 5.8S, ITS2, 5′ LSU). These models were analyzed for the presence of conserved domains, conserved nucleotide motifs, and compensatory base changes. Minimal free energy (MFE) foldings and GC contents of sequences representing the major fungal clades, as well as NS1, were also compared. NS1 displays secondary rRNA structures consistent with other fungi and many, but not all, conserved nucleotide motifs found across eukaryotes. However, our analyses show that many other authentic sequences from basal fungi lack more of these conserved motifs than does NS1. Together our findings suggest that NS1 represents an authentic gene copy. The methods described here can be used on any rRNA-coding sequence, not just environmental fungal sequences. As new-generation sequencing methods that yield shorter sequences become more widely implemented, methods that evaluate sequence authenticity should also be more widely implemented. 
For fungi, the adjacent 5.8S and ITS2 loci should be prioritized. This region is not only suited to distinguishing between closely related species, but it is also more informative in terms of expected secondary structure.
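
Two of the comparisons named above, GC content and compensatory base changes (CBCs), can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the use of a single shared dot-bracket structure for both aligned sequences are all assumptions made for illustration. A CBC is counted here only when both bases of a pair differ between the two sequences and canonical or G-U wobble pairing is retained in both. MFE folding itself requires dedicated RNA folding software and is not shown.

```python
from typing import List, Tuple

# Watson-Crick pairs plus G-U wobble, treated here as valid rRNA pairings.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T/U)."""
    bases = [b for b in seq.upper().replace("T", "U") if b in "ACGU"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def paired_positions(dot_bracket: str) -> List[Tuple[int, int]]:
    """Return (i, j) index pairs for each base pair in a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return sorted(pairs)


def count_cbcs(seq_a: str, seq_b: str, dot_bracket: str) -> int:
    """Count paired positions where both bases changed but pairing is kept.

    Assumes seq_a and seq_b are aligned without gaps and share one structure.
    """
    if not (len(seq_a) == len(seq_b) == len(dot_bracket)):
        raise ValueError("sequences and structure must have equal length")
    a = seq_a.upper().replace("T", "U")
    b = seq_b.upper().replace("T", "U")
    count = 0
    for i, j in paired_positions(dot_bracket):
        pair_a, pair_b = (a[i], a[j]), (b[i], b[j])
        both_pair = pair_a in CANONICAL and pair_b in CANONICAL
        both_changed = pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]
        if both_pair and both_changed:
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical toy hairpin, not real ITS2 data.
    structure = "(((...)))"
    reference = "GGGAAACCC"
    query = "GAGAAACUC"  # middle pair G-C replaced by A-U
    print(gc_content(reference), count_cbcs(reference, query, structure))
```

A change on only one side of a pair (for example G-C to G-U) is deliberately not counted, since it is usually treated as a hemi-compensatory change rather than a full CBC.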

Highlights

► We found a sequence from a potentially novel fungus in Alaskan boreal forest soils.
► Extreme sequence divergence makes phylogenetic placement and authenticity uncertain.
► Ribosomal RNA secondary structure models are consistent with pan-eukaryotic patterns.
► Statistical analyses suggest that the sequence represents an authentic gene copy.
► If it is an authentic gene copy, it could represent a new subphylum or class of fungi.

Related Topics
Life Sciences > Agricultural and Biological Sciences > Ecology, Evolution, Behavior and Systematics