Article ID: 1709615
Journal: Applied Mathematics Letters
Published Year: 2010
Pages: 5
File Type: PDF
Abstract

When two initially identical binary sequences undergo independent site mutations at a constant rate, the proportion of site differences is often used to estimate the total time T that separates the two sequences. In this short note we study the posterior distribution of T when the prior distribution on T is exponential. We show that posterior estimates of T (for any data) cannot grow faster than the logarithm of the sequence length, and this rate is achieved for data generated at site saturation (i.e. in the limit as T→∞). The problem is motivated by information-theoretic questions arising in molecular systematic biology, in which one wishes to use DNA sequences to estimate the divergence time between present-day species.
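
The logarithmic growth described above can be illustrated numerically. The following is a minimal sketch, not the paper's own computation: it assumes a symmetric two-state model in which a site differs between the two sequences with probability (1 - exp(-2*mu*T))/2, an exponential prior on T with rate lam, and saturated data modelled as exactly k = n/2 differing sites. The names posterior_mean, mu, lam, t_max and steps are hypothetical and chosen only for this illustration.

```python
import math


def posterior_mean(k, n, lam=1.0, mu=1.0, t_max=60.0, steps=60000):
    """Posterior mean of T given k differing sites out of n.

    Assumptions (not taken from the abstract): symmetric two-state model with
    per-site rate mu, exponential prior with rate lam, and midpoint-rule
    integration over (0, t_max).
    """
    dt = t_max / steps
    ts = []
    log_w = []
    for i in range(steps):
        t = (i + 0.5) * dt
        # P(site differs) = (1 - exp(-2*mu*t)) / 2, computed stably via expm1.
        p = -0.5 * math.expm1(-2.0 * mu * t)
        q = 1.0 - p  # P(site agrees)
        # Unnormalised log posterior: binomial log-likelihood plus log prior.
        lw = k * math.log(p) + (n - k) * math.log(q) - lam * t
        ts.append(t)
        log_w.append(lw)
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_w)
    weights = [math.exp(lw - m) for lw in log_w]
    total = sum(weights)
    return sum(t * w for t, w in zip(ts, weights)) / total


if __name__ == "__main__":
    # Saturated data: half of the sites differ.
    for n in (100, 10_000, 1_000_000):
        print(n, round(posterior_mean(n // 2, n), 3))
```

Under these assumptions, each hundredfold increase in the sequence length n shifts the posterior mean by roughly log(100)/(4*mu), which is the logarithmic growth the abstract refers to.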
