Article code: 4497188
Journal code: 1318920
Publication year: 2011
English article: 7-page PDF
Full-text version: free download
English title of the ISI article
Numerical characteristics of word frequencies and their application to dissimilarity measure for sequence comparison
Related topics
Life Sciences and Biotechnology; Agricultural and Biological Sciences; Agricultural and Biological Sciences (General)
English abstract

Sequence comparison is one of the major tasks in bioinformatics; it can be used to study structural and functional conservation, as well as evolutionary relations among sequences. Numerous dissimilarity measures achieve promising results in sequence comparison, but challenges remain. This paper studies numerical characteristics of word frequencies and proposes a novel dissimilarity measure for sequence comparison. Instead of using the word frequencies directly, the proposed measure considers both the word frequencies and the overlapping structures of words. To verify the effectiveness of the proposed measure, we tested it in two experiments and compared it with alignment-based and alignment-free measures. The results demonstrate that the proposed measure, by extracting more information on the overlapping structures of the words, improves the efficiency of sequence comparison.
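
For readers unfamiliar with alignment-free comparison, the following is a minimal sketch of the kind of baseline that uses word frequencies directly, which the abstract contrasts with the proposed measure. It is not the measure proposed in the paper: the function names `word_frequencies` and `euclidean_dissimilarity`, the DNA alphabet `ACGT`, the word length `k=2`, and the choice of Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def word_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequencies of all overlapping k-words in seq.

    Illustrative helper (not from the paper). Words containing
    characters outside `alphabet` are counted in the total but
    have no slot in the returned vector.
    """
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than the word length")
    # Slide a window of width k one position at a time.
    counts = Counter(seq[i:i + k] for i in range(total))
    # Fixed lexicographic word order so vectors are comparable.
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    return [counts[w] / total for w in words]


def euclidean_dissimilarity(a, b, k=2):
    """Euclidean distance between the k-word frequency vectors.

    A generic baseline chosen for illustration; it ignores the
    overlapping structure of words that the abstract refers to.
    """
    fa = word_frequencies(a, k)
    fb = word_frequencies(b, k)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

Calling `euclidean_dissimilarity(seq_a, seq_b, k=3)` on two DNA strings gives a non-negative score that is zero when the two frequency vectors coincide; larger `k` gives a longer, sparser vector.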

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Theoretical Biology - Volume 276, Issue 1, 7 May 2011, Pages 174–180