Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
10231885 | Computational Biology and Chemistry | 2014 | 8 Pages |
Abstract
Using an enlarged alphabet of K-tuples is the way to carry out alignment-free comparison of genomes in the composition vector (CV) approach to prokaryotic phylogeny. We summarize the known aspects concerning the choice of K and examine the results of using CVs with subtraction of a statistical background for K = 3-9 and using raw CVs without subtraction for K = 1-12. The criterion for evaluation is direct comparison with taxonomy. For prokaryotes the best performance is obtained for K = 5 and 6 with subtraction and for K = 11, 12 or even more without subtraction. In general, CVs with subtraction are slightly better and less CPU-intensive, but CVs without subtraction may provide complementary information.
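To make the two variants mentioned in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not spell out the formulas, so two things here are assumptions: the statistical background is taken to be a Markov-type prediction of each K-tuple frequency from its (K-1)- and (K-2)-tuple frequencies, and two CVs are compared with a cosine-based dissimilarity. The function names `kmer_freqs`, `raw_cv`, `subtracted_cv` and `cv_distance` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(seq, k):
    """Relative frequencies of the overlapping K-tuples found in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def raw_cv(seq, k):
    """Raw composition vector: K-tuple frequencies without subtraction."""
    return kmer_freqs(seq, k)


def subtracted_cv(seq, k):
    """Composition vector with a statistical background subtracted.

    Assumption (not stated in the abstract): the background f0 of a
    K-tuple a1..aK is predicted as
        f0 = f(a1..aK-1) * f(a2..aK) / f(a2..aK-1)
    and the component is the relative deviation (f - f0) / f0.
    """
    if k < 3:
        raise ValueError("subtraction needs k >= 3")
    f_k = kmer_freqs(seq, k)
    f_k1 = kmer_freqs(seq, k - 1)
    f_k2 = kmer_freqs(seq, k - 2)
    alphabet = sorted(set(seq))
    cv = {}
    # Visit every K-tuple whose predicted frequency is non-zero,
    # including K-tuples that never occur in seq (they get -1).
    for prefix, f_prefix in f_k1.items():
        middle = prefix[1:]
        for letter in alphabet:
            suffix = middle + letter
            if suffix not in f_k1:
                continue
            f0 = f_prefix * f_k1[suffix] / f_k2[middle]
            kmer = prefix + letter
            cv[kmer] = (f_k.get(kmer, 0.0) - f0) / f0
    return cv


def cv_distance(a, b):
    """Cosine-based dissimilarity (1 - C) / 2 between two sparse CVs (assumed metric)."""
    dot = sum(v * b.get(key, 0.0) for key, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare an all-zero composition vector")
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    s1 = "MKVLAAGIVGLKVLAAGMKKLLAAGIV"
    s2 = "MKVLSAGIVALKVLSAGMRKLLSAGIV"
    print(cv_distance(subtracted_cv(s1, 3), subtracted_cv(s2, 3)))
    print(cv_distance(raw_cv(s1, 3), raw_cv(s2, 3)))
```

The number of possible K-tuples grows exponentially with K, which is why the sketch stores each vector as a sparse dictionary instead of a dense array. Real inputs would be whole genomes or proteomes rather than short toy strings.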
Related Topics
Physical Sciences and Engineering
Chemical Engineering
Bioengineering
Authors
Guanghong Zuo, Qiang Li, Bailin Hao