Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
5760184 | Journal of Theoretical Biology | 2017 | 18 Pages |
Abstract
Assuming that mutation and fixation processes are reversible Markov processes, we prove that the equilibrium ensemble of sequences obeys a Boltzmann distribution with exp(4Nem(1â1/(2N))), where m is Malthusian fitness and Ne and N are effective and actual population sizes. On the other hand, the probability distribution of sequences with maximum entropy that satisfies a given amino acid composition at each site and a given pairwise amino acid frequency at each site pair is a Boltzmann distribution with exp(âÏN), where the evolutionary statistical energy ÏN is represented as the sum of one body (h) (compositional) and pairwise (J) (covariational) interactions over all sites and site pairs. A protein folding theory based on the random energy model (REM) indicates that the equilibrium ensemble of natural protein sequences is well represented by a canonical ensemble characterized by exp(âÎGND/kBTs) or by exp(âGN/kBTs) if an amino acid composition is kept constant, where ÎGNDâ¡GNâGD, GN and GD are the native and denatured free energies, and Ts is the effective temperature representing the strength of selection pressure. Thus, 4Nem(1â1/(2N)),âÎÏND(â¡âÏN+ÏD), and âÎGND/kBTs must be equivalent to each other. With h and J estimated by the DCA program, the changes (ÎÏN) of ÏN due to single nucleotide nonsynonymous substitutions are analyzed. The results indicate that the standard deviation of ÎGN(=kBTsÎÏN) is approximately constant irrespective of protein families, and therefore can be used to estimate the relative value of Ts. Glass transition temperature Tg and ÎGND are estimated from estimated Ts and experimental melting temperature (Tm) for 14 protein domains. The estimates of ÎGND agree with their experimental values for 5 proteins, and those of Ts and Tg are all within a reasonable range. In addition, approximating the probability density function (PDF) of ÎÏN by a log-normal distribution, PDFs of ÎÏN and Ka/Ks, which is the ratio of nonsynonymous to synonymous substitution rate per site, in all and in fixed mutants are estimated. The equilibrium values of ÏN, at which the average of ÎÏ in fixed mutants is equal to zero, well match ÏN averaged over homologous sequences, confirming that the present methods for a fixation process of mutations and for the equilibrium ensemble of ÏN give a consistent result with each other. The PDFs of Ka/Ks at equilibrium confirm that Ts negatively correlates with the amino acid substitution rate (the mean of Ka/Ks) of protein. Interestingly, stabilizing mutations are significantly fixed by positive selection, and balance with destabilizing mutations fixed by random drift, although most of them are removed from population. Supporting the nearly neutral theory, neutral selection is not significant even in fixed mutants.
Related Topics
Life Sciences
Agricultural and Biological Sciences
Agricultural and Biological Sciences (General)
Authors
Sanzo Miyazawa,