Article ID Journal Published Year Pages File Type
4500782 Mathematical Biosciences 2009 8 Pages PDF
Abstract

In this paper, we propose two metrics to compare DNA and protein sequences based on a Poisson model of word occurrences. Instead of comparing the frequencies of all fixed-length words in two sequences, we consider (1) the probability of ‘generating’ one sequence under the Poisson model estimated from the other; (2) their different expression levels of words. Phylogenetic trees of 25 viruses including SARS-CoVs are constructed to illustrate our approach.

Related Topics
Life Sciences Agricultural and Biological Sciences Agricultural and Biological Sciences (General)
Authors
, , ,