Article ID: 4500769
Journal: Mathematical Biosciences
Published Year: 2008
Pages: 6
File Type: PDF
Abstract

In this article, we present a new distance metric, the Weighted Sequence Entropy (WSE), based on the short-word composition of biological sequences. As a revision of the classical relative entropy (RE), our metric (1) behaves equivalently to RE for small word length k, and (2) avoids the degeneracy that arises when some word types are absent in one sequence but present in the other. Experiments on 25 viruses, including SARS-CoVs, show that our method and RE give exactly the same phylogenetic tree when the word length k ⩽ 3. When k > 3, our method still works and yields a convergent phylogenetic topology, whereas RE gives degenerate results.
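
The degeneracy of RE mentioned above can be illustrated with a minimal sketch. It assumes overlapping k-word counting and the standard Kullback-Leibler form of relative entropy. The function names and the two toy sequences are hypothetical and are not taken from the article. The WSE formula itself is not given in this abstract, so it is not reproduced here.

```python
import math
from collections import Counter


def word_frequencies(seq, k):
    """Relative frequencies of the overlapping length-k words in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def relative_entropy(p, q):
    """Sum of p(w) * log2(p(w) / q(w)) over the words w observed in p.

    Returns infinity when a word present in p is absent from q, which is
    the degenerate case described in the abstract.
    """
    total = 0.0
    for word, pw in p.items():
        qw = q.get(word, 0.0)
        if qw == 0.0:
            return math.inf
        total += pw * math.log2(pw / qw)
    return total


# Toy sequences (illustrative only, not data from the article).
seq_a = "ACGTACGTAC"
seq_b = "ACGGACGTAA"

for k in (1, 3):
    p = word_frequencies(seq_a, k)
    q = word_frequencies(seq_b, k)
    print(k, relative_entropy(p, q))
```

With these toy inputs, every single-letter word occurs in both sequences, so RE is finite for k = 1. For k = 3 the word TAC occurs only in the first sequence, so RE becomes infinite. Longer words make such absences more likely, which is consistent with the reported breakdown of RE when k > 3.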
