Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
10351538 | Computers in Biology and Medicine | 2012 | 7 Pages |
Abstract
Based on the repetition structure patterns of single nucleotides, we propose a novel digital representation method to characterize primary DNA sequences. Using this representation, we define a new RP-SP (repeat and space) vector to compute the distance between different sequences. An examination of similarities and dissimilarities among different sequences illustrates the utility of the proposed RP-SP vector distance. We then use the proposed RP-SP vector method to analyze two groups of genomes: 15 E. coli genomes and 31 mitochondrial genomes. For comparison, we also apply other alignment-free methods to the two groups of genomes. The results show that the proposed method can distinguish the characteristics of different genomes and can be used to reconstruct their phylogenetic tree.
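The abstract does not give the exact definition of the RP-SP vector, so the following is only a rough sketch of the general idea of an alignment-free comparison built from repeat and space statistics. Everything here is an assumption made for illustration: the function names `repeat_space_features` and `feature_distance`, the choice of mean run length ("repeat") and mean gap between runs ("space") per nucleotide, and the use of Euclidean distance. It should not be read as the authors' method.

```python
from math import sqrt

NUCLEOTIDES = "ACGT"


def repeat_space_features(seq):
    """Illustrative feature vector (NOT the paper's RP-SP definition).

    For each nucleotide, returns two numbers:
      - mean length of its consecutive runs ("repeat")
      - mean number of other bases between successive runs ("space")
    """
    seq = seq.upper()
    features = []
    for base in NUCLEOTIDES:
        runs, gaps = [], []
        run = gap = 0
        seen = False  # becomes True once the base has occurred at least once
        for ch in seq:
            if ch == base:
                if run == 0 and seen:
                    gaps.append(gap)  # a new run starts: record the gap before it
                run += 1
                gap = 0
                seen = True
            else:
                if run:
                    runs.append(run)  # the current run just ended
                    run = 0
                if seen:
                    gap += 1
        if run:
            runs.append(run)  # sequence ended inside a run
        mean_run = sum(runs) / len(runs) if runs else 0.0
        mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
        features.extend([mean_run, mean_gap])
    return features


def feature_distance(seq_a, seq_b):
    """Euclidean distance between the illustrative feature vectors."""
    fa = repeat_space_features(seq_a)
    fb = repeat_space_features(seq_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))


if __name__ == "__main__":
    print(repeat_space_features("AACAT"))
    print(feature_distance("AAAA", "ACAC"))
```

A pairwise distance matrix computed this way over a set of genomes could then be passed to a standard tree-building method such as neighbor joining or UPGMA, which is the usual route from alignment-free distances to a phylogenetic tree.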
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science Applications
Authors
Zhao-Hui Qi, Ming-Hui Du, Xiao-Qin Qi, Li-Juan Zheng