Article ID: 6368910
Journal: Journal of Theoretical Biology
Published Year: 2016
Pages: 8
File Type: PDF
Abstract
Numerous statistical methods have been developed to estimate evolutionary relationships among a collection of present-day species, typically represented by a phylogenetic tree, using the information contained in the DNA sequences sampled from representatives of each species. In the current era of high-throughput genome sequencing, the models underlying such methods have become increasingly sophisticated, and the resulting computations are often prohibitive. Here we consider the problem of rigorously testing the phylogenetic relationships among collections of four species under the multispecies coalescent model that accommodates both multi-locus datasets and SNP data. Our test employs a new statistic (the summed absolute differences between certain columns in flattened phylogenetic matrices) as well as a previously used statistic that measures the distance of a flattened matrix from the space of rank-10 matrices. We derive distributional results for both statistics and study the performance of the corresponding hypothesis tests using both simulated and empirical data. We discuss how these tests may be used to improve inference of phylogenetic relationships for larger samples of species under the multispecies coalescent model, a problem that has until recently been computationally intractable.
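The abstract does not spell out how either statistic is computed, so the following is only a rough sketch of the second one (distance of a flattened matrix from the rank-10 matrices), not the authors' implementation. It assumes that site-pattern counts for four species are stored in a 4x4x4x4 array indexed by nucleotide, that "flattening" means reshaping this array into a 16x16 matrix according to a split of the four taxa into two pairs, and that distance is measured in the Frobenius norm. Under that last assumption the Eckart-Young theorem gives the distance as the square root of the sum of the squared singular values beyond the 10th. The function names `flatten_quartet` and `distance_from_rank` are hypothetical.

```python
import numpy as np


def flatten_quartet(counts, split=((0, 1), (2, 3))):
    """Flatten a 4x4x4x4 site-pattern array into a 16x16 matrix.

    Rows index the joint nucleotide states of the first pair of taxa in
    `split`, columns index those of the second pair. The layout is an
    assumption made for illustration.
    """
    (a, b), (c, d) = split
    counts = np.asarray(counts, dtype=float)
    # Reorder the axes so the two row taxa come first, then reshape.
    return np.transpose(counts, (a, b, c, d)).reshape(16, 16)


def distance_from_rank(matrix, rank=10):
    """Frobenius distance from `matrix` to the nearest matrix of rank <= `rank`.

    By the Eckart-Young theorem this equals the root of the sum of the
    squared singular values after the first `rank` of them.
    """
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sqrt(np.sum(singular_values[rank:] ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Purely synthetic counts, used only to exercise the functions.
    counts = rng.integers(0, 50, size=(4, 4, 4, 4))
    freqs = counts / counts.sum()
    for split in (((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))):
        score = distance_from_rank(flatten_quartet(freqs, split))
        print(split, score)
```

In this sketch the three possible splits of four taxa are scored separately; a smaller distance means the flattened matrix is closer to rank 10. How such scores feed into a formal hypothesis test depends on the distributional results derived in the paper, which are not reproduced here.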