Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6370766 | Journal of Theoretical Biology | 2013 | 10 Pages |
Abstract
Because biological sequences differ in length, k-word based methods and graphical representation approaches have each uncovered biological information in their own way. However, the mechanisms of information storage are unlikely to vary with sequence length. A similarity distance suitable for sequences of various lengths should therefore come closer to reflecting those mechanisms. In this paper, new k-word sub-sequences were extracted from biological sequences under a one-to-one mapping and evaluated with a linear regression model. A new distance was then defined on the invariants of that regression model. Compared with other alignment-free distances, the results of four experiments demonstrated that our similarity distance was more efficient.
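For readers unfamiliar with k-word methods, the following is a minimal sketch of a generic k-word frequency comparison. It is not the distance proposed in this paper: the abstract does not specify the one-to-one mapping or the regression invariants, so the sketch uses plain relative frequencies and a Euclidean distance as a stand-in. The function names `kword_frequencies` and `euclidean_distance` and the default DNA alphabet are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kword_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequencies of all overlapping k-words over `alphabet`.

    Illustrative helper (not from the paper). The result has
    len(alphabet) ** k entries, ordered by itertools.product.
    """
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    # Slide a window of width k over the sequence and count each word.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        # Sequence shorter than k, or no word drawn from the alphabet.
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    short = kword_frequencies("ACGTACGT", 2)
    longer = kword_frequencies("ACGTACGTACGTTTGA", 2)
    print(euclidean_distance(short, longer))
```

Normalizing counts to relative frequencies is one simple way to compare sequences of different lengths, which is the concern the abstract raises; the paper's regression-based invariants address the same concern by other means.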
Related Topics
Life Sciences
Agricultural and Biological Sciences
Agricultural and Biological Sciences (General)
Authors
Xiwu Yang, Tianming Wang