Article ID: 5919350
Journal: Molecular Phylogenetics and Evolution
Published Year: 2014
Pages: 8
File Type: PDF
Abstract

• Locus quality is as important as quantity in determining species-tree accuracy.
• When locus variability is low, more loci are needed to achieve equivalent accuracy in *BEAST.
• Incorrect but strongly-supported loci can be countered with multiple low-variation loci.
• Methods that take gene trees as input are unable to effectively utilize low-variation signal.
• Locus variability limits accuracy for both traditional and NGS loci at shallow divergences.

Although species-tree methods have been widely adopted for multi-locus data, little consideration has been given to the source and character of the loci used in these approaches. Decisions about which loci to target in empirical studies are typically constrained by availability, technology and funds – characteristics that are rarely considered in simulation studies. As a result, most real-world datasets combine one or two variable loci (such as mtDNA or chloroplast loci) with multiple lower-variation loci to estimate species trees. These locus selections affect both the accuracy and the resolution of a phylogeny. Furthermore, the fact that using a larger sample of loci can result in lower posterior probabilities has been used as an excuse to drop loci from an analysis. Here we address these issues directly through a simulation approach designed to mimic situations arising in empirical datasets by combining loci with differing mutation rates. We show that low-variation loci can be utilized in species-tree analyses that account for gene-tree uncertainty (e.g., a Bayesian framework), whereas maximum likelihood approaches show no improvement in accuracy when low-variation loci are added. We demonstrate that the limited phylogenetic signal associated with low-variation loci constrains gains in species-tree estimation accuracy when adding loci. Lastly, we demonstrate that the inclusion of only a handful of loci with higher mutation rates, and hence greater phylogenetic information content, can make a tremendous difference in the accuracy of species-tree estimates, suggesting that empiricists should consider the quality, and not just the quantity, of loci in multi-locus phylogenetic analyses.
