Article ID: 8646570
Journal: Infection, Genetics and Evolution
Published Year: 2018
Pages: 6
File Type: PDF
Abstract
Public health researchers are often tasked with using a representative sample of nucleotide sequences to identify, accurately and quickly, where and when an epidemic originated. In this paper, we investigate multiple approaches to subsampling the sequence set when employing a Bayesian phylogeographic generalized linear model. Our results indicate that near-categorical posterior MCC estimates on the root can be obtained with replicate runs using 25-50% of the sequence data, and that including 90% of sequences does not necessarily yield more accurate inferences. We present the first analysis of predictor signal suppression and show that the ability to detect the influence of predictor variables is limited when sample-size predictors are included in the models.