TRDistiller: A rapid filter for enrichment of sequence datasets with proteins containing tandem repeats

Article code	Journal code	Publication year	English article	Full-text version
5914060	1162719	2014	6 pages PDF	Free download

Keywords

Tandem repeats; Sequence analysis; Proteomes; Hidden Markov model (HMM); receiver operating characteristic (ROC); Tetratrico Peptide Repeat (TPR); LRR; true positive; false positive

Related topics

Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Molecular Biology

Abstract

The dramatic growth of sequencing data creates an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so-called tandem repeats, or TRs). These proteins frequently fold into elongated fibrous structures that carry out a variety of fundamental functions. Algorithms specific to the analysis of these regions are urgently required, since the conventional approaches developed for globular domains have had limited success when applied to TR regions. Protein TRs are frequently imperfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats, several algorithms have been developed. However, the most sensitive among them are time-consuming and therefore inappropriate for large-scale proteome analysis. To speed up TR detection, we developed a rapid filter based on comparing the composition and order of short strings in adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset, enriching it with TR-containing proteins, which allows faster subsequent TR detection by other methods. The program is available upon request.
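The abstract describes the filter only in outline, so the following is a minimal sketch of the general idea rather than the TRDistiller algorithm itself. It assumes that "short strings" can be modelled as k-mers, that "adjacent sequence motifs" can be modelled as two neighbouring windows of equal length, and that a sequence is kept when the average of a composition score and an order score reaches a threshold. The function names, the window sizes, `k=2`, and `threshold=0.6` are all hypothetical choices made for illustration.

```python
from collections import Counter


def kmers(segment, k):
    """Return the list of overlapping k-mers in a segment, in order."""
    return [segment[i:i + k] for i in range(len(segment) - k + 1)]


def composition_score(left, right, k):
    """Fraction of k-mers in `left` that also occur in `right` (multiset overlap)."""
    a, b = Counter(kmers(left, k)), Counter(kmers(right, k))
    total = sum(a.values())
    if total == 0:
        return 0.0
    return sum((a & b).values()) / total


def order_score(left, right, k):
    """Fraction of offsets at which both windows carry the same k-mer."""
    pairs = list(zip(kmers(left, k), kmers(right, k)))
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def may_contain_tandem_repeat(seq, window_sizes=range(3, 31), k=2, threshold=0.6):
    """Hypothetical filter: keep a sequence if any pair of adjacent windows
    is similar in both k-mer composition and k-mer order.

    All parameter defaults are illustrative assumptions, not published values.
    """
    for window in window_sizes:
        # Slide a pair of adjacent, equal-length windows along the sequence.
        for start in range(len(seq) - 2 * window + 1):
            left = seq[start:start + window]
            right = seq[start + window:start + 2 * window]
            score = (composition_score(left, right, k)
                     + order_score(left, right, k)) / 2
            if score >= threshold:
                return True
    return False


if __name__ == "__main__":
    print(may_contain_tandem_repeat("GSAGSAGSAGSA"))          # perfect 3-residue repeat
    print(may_contain_tandem_repeat("ACDEFGHIKLMNPQRSTVWY"))  # no residue occurs twice
```

In a pipeline of the kind the abstract describes, sequences for which such a filter returns `False` would be discarded, and only the remainder would be passed to a slower, more sensitive TR detector. The 22.5% and 99.2% figures quoted above refer to the authors' program, not to this sketch.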

Publisher

Database: Elsevier - ScienceDirect
Journal: Journal of Structural Biology - Volume 186, Issue 3, June 2014, Pages 386-391

Authors

François D. Richard, Andrey V. Kajava
