Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4497188 | 1318920 | 2011 | 7-page PDF | Free download |
Sequence comparison is one of the major tasks in bioinformatics; it can be used to study structural and functional conservation, as well as evolutionary relationships among sequences. Numerous dissimilarity measures achieve promising results in sequence comparison, but challenges remain. This paper studies the numerical characteristics of word frequencies and proposes a novel dissimilarity measure for sequence comparison. Instead of using the word frequencies directly, the proposed measure considers both the word frequencies and the overlapping structures of words. To verify its effectiveness, we tested it in two experiments and compared it with alignment-based and alignment-free measures. The results demonstrate that the proposed measure, by extracting more information about the overlapping structures of the words, improves the efficiency of sequence comparison.
Journal: Journal of Theoretical Biology - Volume 276, Issue 1, 7 May 2011, Pages 174–180
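For readers unfamiliar with word-frequency comparison, the following is a minimal sketch of the plain baseline that the abstract says the proposed measure goes beyond: it counts overlapping words (k-mers) in each sequence and takes the Euclidean distance between the resulting frequency vectors. It does not implement the paper's measure, whose treatment of overlapping structures is not described in the abstract. The function names, the default word length `k=2`, and the choice of Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def word_frequencies(seq, k):
    """Relative frequencies of the overlapping words of length k in seq.

    Hypothetical helper: words are read with a sliding window of step 1,
    so consecutive words share k - 1 characters.
    """
    total = len(seq) - k + 1
    if k <= 0 or total <= 0:
        raise ValueError("k must be positive and no longer than the sequence")
    counts = Counter(seq[i:i + k] for i in range(total))
    return {word: count / total for word, count in counts.items()}


def euclidean_dissimilarity(seq_a, seq_b, k=2):
    """Euclidean distance between two word-frequency vectors.

    A simple alignment-free baseline (an assumption for illustration),
    not the dissimilarity measure proposed in the paper.
    """
    freq_a = word_frequencies(seq_a, k)
    freq_b = word_frequencies(seq_b, k)
    # Words absent from one sequence contribute a frequency of zero there.
    words = set(freq_a) | set(freq_b)
    return sqrt(sum((freq_a.get(w, 0.0) - freq_b.get(w, 0.0)) ** 2
                    for w in words))
```

Calling `euclidean_dissimilarity("ACGTACGT", "ACGTTGCA")` returns a non-negative number that is zero only when both sequences have identical word-frequency vectors. Because this baseline uses the frequencies directly, it ignores how words overlap one another, which is the additional information the abstract says the proposed measure exploits.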