Article code: 5906161
Journal code: 1159958
Publication year: 2013
English article: 7 pages, PDF
English title of the ISI article
Protein sequence comparison based on K-string dictionary
Related topics
Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Genetics
Abstract


- The K-string dictionary addresses the high-dimensional protein representation problem.
- The cardinality of the K-string dictionary is studied on real and simulated datasets.
- The method substantially reduces the computer memory needed for the calculation.

Current K-string-based protein sequence comparison methods require large amounts of computer memory because the dimension of the protein vector representation grows exponentially with K. In this paper, we propose a novel concept, the “K-string dictionary”, to solve this high-dimensional problem. It allows us to represent a protein with a much lower-dimensional K-string-based frequency or probability vector, and thus significantly reduces the computer memory required for implementation. Furthermore, based on this new concept, we use Singular Value Decomposition to analyze real protein datasets, and the improved protein vector representation allows us to obtain accurate gene trees.
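To make the dimensionality argument concrete: with the 20 standard amino acids there are 20^K possible K-strings, so a full frequency vector has 20^K entries. The sketch below is a minimal illustration, not the authors' implementation. It assumes the dictionary is simply the set of K-strings that actually occur in the dataset, and it uses cosine distance as a stand-in comparison measure, since the abstract does not name one. The function names and the toy sequences are hypothetical, and the Singular Value Decomposition step is left out.

```python
from collections import Counter
from math import sqrt


def k_strings(seq, k):
    """Return all overlapping substrings of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_dictionary(seqs, k):
    """Sorted list of the distinct K-strings observed in any sequence.

    Assumption: the 'K-string dictionary' is the set of observed K-strings.
    """
    return sorted({s for seq in seqs for s in k_strings(seq, k)})


def frequency_vector(seq, k, dictionary):
    """Relative frequency of each dictionary K-string within seq."""
    counts = Counter(k_strings(seq, k))
    total = sum(counts.values())
    return [counts[s] / total if total else 0.0 for s in dictionary]


def cosine_distance(u, v):
    """1 - cosine similarity; an illustrative choice, not the paper's measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Toy sequences invented for illustration only.
seqs = ["MKVLAAGIV", "MKVLSAGIV", "GGSWTPLLA"]
k = 3

dictionary = build_dictionary(seqs, k)
vectors = [frequency_vector(s, k, dictionary) for s in seqs]

# Vector length is the dictionary size rather than 20 ** k.
print(len(dictionary), "observed K-strings versus", 20 ** k, "possible")
print(cosine_distance(vectors[0], vectors[1]))
print(cosine_distance(vectors[0], vectors[2]))
```

Each vector has one entry per observed K-string, so memory scales with the dictionary's cardinality instead of with 20^K. In this toy example the first two sequences share several 3-strings and end up closer to each other than either is to the third.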

Publisher
Database: Elsevier - ScienceDirect
Journal: Gene - Volume 529, Issue 2, 25 October 2013, Pages 250-256