Article ID | Journal ID | Year published | English article | Full-text version |
---|---|---|---|---|
5918498 | 1570795 | 2016 | 13-page PDF | Free download |
- A new phylogenetic tool (PhyPA) using pairwise sequence alignment for highly diverged sequences.
- PhyPA outperforms maximum likelihood (ML) methods for highly diverged sequences.
- PhyPA is fully implemented in DAMBE with a user-friendly interface.
- PhyPA, ML and MP methods in DAMBE can analyze files containing multiple sets of data.
While pairwise sequence alignment (PSA) by dynamic programming is guaranteed to generate one of the optimal alignments, multiple sequence alignment (MSA) of highly divergent sequences often results in poorly aligned sequences, plaguing all subsequent phylogenetic analysis. One way to avoid this problem is to use only PSA to reconstruct phylogenetic trees, which can only be done with distance-based methods. I compared the accuracy of this new computational approach (named PhyPA for phylogenetics by pairwise alignment) against the maximum likelihood method using MSA (the ML + MSA approach), based on nucleotide, amino acid and codon sequences simulated with different topologies and tree lengths. I present a surprising discovery that the fast PhyPA method consistently outperforms the slow ML + MSA approach for highly diverged sequences, even when all optimization options were turned on for the ML + MSA approach. Only when sequences are not highly diverged (i.e., when a reliable MSA can be obtained) does the ML + MSA approach outperform PhyPA. The true topologies are always recovered by ML with the true alignment from the simulation. However, with MSA derived from alignment programs such as MAFFT or MUSCLE, the recovered topology consistently has a higher likelihood than the true topology. Thus, the failure of the ML + MSA approach to recover the true topology is caused not by an insufficient search of tree space, but by the distortion of phylogenetic signal by MSA methods. In DAMBE, I have implemented PhyPA and two approaches that use multi-gene data sets to derive phylogenetic support for subtrees, equivalent to resampling techniques such as bootstrapping and jackknifing.
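The core idea described above (align every pair of sequences independently, then build a distance matrix for a distance-based tree method) can be illustrated with a minimal Python sketch. It is not the DAMBE implementation: the scoring scheme (match +1, mismatch -1, linear gap -2), the use of a simple p-distance, and the function names `needleman_wunsch`, `p_distance` and `pairwise_distance_matrix` are all assumptions made for illustration, since the abstract does not specify them.

```python
from itertools import combinations


def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment by dynamic programming.

    Scoring values are illustrative assumptions, with a linear gap cost.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def p_distance(aln_a, aln_b):
    """Proportion of differing sites among aligned columns without gaps."""
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped aligned sites to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def pairwise_distance_matrix(seqs):
    """Align each pair separately (no MSA) and return a symmetric matrix."""
    dist = {name: {name: 0.0} for name in seqs}
    for s, t in combinations(seqs, 2):
        aln_s, aln_t = needleman_wunsch(seqs[s], seqs[t])
        dist[s][t] = dist[t][s] = p_distance(aln_s, aln_t)
    return dist


if __name__ == "__main__":
    toy = {"a": "ACGT", "b": "ACGA", "c": "AGT"}  # made-up sequences
    for name, row in pairwise_distance_matrix(toy).items():
        print(name, row)
```

The resulting matrix would then be passed to a distance-based tree-building method such as neighbor joining, which is not shown here. In practice a model-corrected distance would likely replace the raw p-distance for highly diverged sequences.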
Journal: Molecular Phylogenetics and Evolution - Volume 102, September 2016, Pages 331-343