Article code: 2076223
Journal code: 1544992
Year of publication: 2011
English article: 9 pages, PDF
Full-text version: free download
English title of the ISI article
A mathematical consideration of the word-composition vector method in comparison of biological sequences
Related topics
Engineering and Basic Sciences, Mathematics, Modelling and Simulation
English abstract

To measure the similarity or dissimilarity between two given biological sequences, several papers proposed metrics based on the “word-composition vector”. The essence of these metrics is as follows. First, we count the appearance frequencies of all the K-tuple words throughout each of two given sequences. Then, the two given sequences are transformed into their respective word-composition vectors. Next, the distance metrics, for example the angle between the two vectors, are calculated. A significant issue is to determine the optimal word size K. With a mathematical model of mutational events (including substitutions, insertions, deletions and duplications) that occur in sequences, we analyzed how the angle between the composition vectors depends on the mutational events. We also considered the optimal word size (=resolution) from our original approach. Our results were verified by computational experiments using artificially generated sequences, amino acid sequences of hemoglobin and nucleotide sequences of 16S ribosomal RNA.
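
The three steps described in the abstract (count K-tuple words, build the composition vectors, take the angle between them) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `composition_vector` and `angle` are hypothetical, words are counted with overlap, and raw counts are used in place of frequencies, which is assumed to be harmless here because rescaling a vector by a positive constant does not change the angle.

```python
from collections import Counter
from math import acos, sqrt


def composition_vector(seq, k):
    """Count every overlapping K-tuple word in seq (hypothetical helper)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def angle(u, v):
    """Angle in radians between two sparse word-count vectors."""
    # Only words present in u can contribute to the dot product;
    # a Counter returns 0 for words that are missing from v.
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("sequence is shorter than the word size K")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return acos(cosine)


# Illustrative sequences only; they are not data from the paper.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAC"
for k in (1, 2, 3):
    theta = angle(composition_vector(seq_a, k), composition_vector(seq_b, k))
    print(k, theta)
```

Running the loop over several values of K gives a feel for why the choice of word size matters: the same pair of sequences yields a different angle at each K.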

Publisher
Database: Elsevier - ScienceDirect
Journal: Biosystems - Volume 106, Issues 2–3, November–December 2011, Pages 67–75