Article ID: 10231885
Journal: Computational Biology and Chemistry
Published Year: 2014
Pages: 8
File Type: PDF
Abstract
In the composition vector (CV) approach to prokaryotic phylogeny, alignment-free comparison of genomes is carried out by using an enlarged alphabet of K-tuples. We summarize what is known about the choice of K and examine the results of using CVs with subtraction of a statistical background for K = 3-9 and raw CVs without subtraction for K = 1-12. The evaluation criterion is direct comparison with taxonomy. For prokaryotes, the best performance is obtained for K = 5 and 6 with subtraction and for K = 11, 12 or even more without subtraction. In general, CVs with subtraction are slightly better and less CPU-intensive, but CVs without subtraction may provide complementary information.
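To make the two kinds of CV concrete, here is a minimal Python sketch. The abstract does not give formulas, so the details are assumptions: the background is taken to be the (K-2)-order Markov prediction commonly described for the CV method, and the dissimilarity is taken to be (1 - cosine)/2. The function names (ktuple_freqs, raw_cv, subtracted_cv, cv_distance) are hypothetical. For brevity, the subtracted CV covers only K-tuples that actually occur in the sequence.

```python
from collections import Counter
from math import sqrt


def ktuple_freqs(seq, k):
    """Relative frequencies of all overlapping k-tuples in seq."""
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {t: c / total for t, c in counts.items()}


def raw_cv(seq, k):
    """Raw composition vector: K-tuple frequencies, no subtraction."""
    return ktuple_freqs(seq, k)


def subtracted_cv(seq, k):
    """CV with background subtraction (assumed Markov-type background).

    Assumption, not taken from the abstract: the expected frequency is
    p0(a1..aK) = p(a1..aK-1) * p(a2..aK) / p(a2..aK-1),
    and each component is (p - p0) / p0. This needs k >= 3.
    Simplification: only K-tuples observed in seq get a component.
    """
    if k < 3:
        raise ValueError("background subtraction needs k >= 3")
    p_k = ktuple_freqs(seq, k)
    p_k1 = ktuple_freqs(seq, k - 1)
    p_k2 = ktuple_freqs(seq, k - 2)
    cv = {}
    for t, p in p_k.items():
        # Prefix, suffix and middle of an observed K-tuple are always observed.
        p0 = p_k1[t[:-1]] * p_k1[t[1:]] / p_k2[t[1:-1]]
        cv[t] = (p - p0) / p0
    return cv


def cv_distance(a, b):
    """Assumed dissimilarity: (1 - cosine similarity) / 2, in [0, 1]."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return (1.0 - dot / (norm_a * norm_b)) / 2.0
```

Under these assumptions, cv_distance(raw_cv(s1, 11), raw_cv(s2, 11)) and cv_distance(subtracted_cv(s1, 5), subtracted_cv(s2, 5)) would correspond to the two settings compared in the abstract. The sparse dictionary representation matters because the number of possible K-tuples grows exponentially with K.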
Related Topics
Physical Sciences and Engineering > Chemical Engineering > Bioengineering