Article code: 5914060
Journal code: 1162719
Publication year: 2014
English article: 6 pages, PDF
English title of the ISI article
TRDistiller: A rapid filter for enrichment of sequence datasets with proteins containing tandem repeats
Related topics
Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Molecular Biology
Abstract
The dramatic growth of sequencing data creates an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so-called tandem repeats, or TRs). These proteins frequently fold into elongated fibrous structures that carry out different fundamental functions. Algorithms specific to the analysis of these regions are urgently required, since the conventional approaches developed for globular domains have had limited success when applied to TR regions. Protein TRs are frequently imperfect, containing a number of mutations, and some of them cannot be easily identified. To detect such “hidden” repeats, several algorithms have been developed. However, the most sensitive among them are time-consuming and therefore inappropriate for large-scale proteome analysis. To speed up TR detection, we developed a rapid filter that is based on the comparison of composition and order of short strings in adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins that are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset, enriching it with TR-containing proteins, which allows faster subsequent TR detection by other methods. The program is available upon request.
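The idea of pre-filtering by comparing short strings in adjacent motifs can be illustrated with a minimal sketch. This is not the TRDistiller algorithm, whose details are not given in the abstract. It assumes a simple scheme in which two neighbouring windows of equal length are compared by the fraction of k-mers they share, and it ignores k-mer order entirely. The function names, window lengths, k-mer size and threshold are all hypothetical choices made for illustration.

```python
from collections import Counter


def kmers(segment, k):
    """Return all overlapping substrings of length k in the segment."""
    return [segment[i:i + k] for i in range(len(segment) - k + 1)]


def window_pair_score(a, b, k):
    """Fraction of k-mers of window a that also occur in window b.

    Uses multiset overlap, so repeated k-mers are counted at most as
    often as they appear in both windows. Order is not considered.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka:
        return 0.0
    shared = sum((Counter(ka) & Counter(kb)).values())
    return shared / len(ka)


def max_adjacent_score(seq, window, k=2):
    """Best k-mer overlap between any two directly adjacent windows."""
    best = 0.0
    for i in range(len(seq) - 2 * window + 1):
        a = seq[i:i + window]
        b = seq[i + window:i + 2 * window]
        best = max(best, window_pair_score(a, b, k))
    return best


def keep_sequence(seq, windows=(5, 10, 15), k=2, threshold=0.6):
    """Keep a sequence if any window length shows high adjacent overlap.

    The window lengths and threshold are illustrative assumptions,
    not values taken from the paper.
    """
    return any(max_adjacent_score(seq, w, k) >= threshold for w in windows)


if __name__ == "__main__":
    repeat_like = "ACDEFGHIKL" * 3          # perfect repeat of period 10
    no_repeat = "ACDEFGHIKLMNPQRSTVWY"      # every residue occurs once
    print(keep_sequence(repeat_like), keep_sequence(no_repeat))
```

In a pipeline of the kind the abstract describes, only the sequences for which such a cheap check returns true would be passed on to a slower, more sensitive TR detector.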
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Structural Biology - Volume 186, Issue 3, June 2014, Pages 386-391