Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
5914060 | 1162719 | 2014 | 6-page PDF | Free download |
English title of the ISI article
TRDistiller: A rapid filter for enrichment of sequence datasets with proteins containing tandem repeats
Keywords
ROC; HMM; LRR; Tetratrico Peptide Repeat; Tandem repeats; Sequence analysis; Tandem repeat; TPR; true positive; false positive; Hidden Markov model; Proteomes; receiver operating characteristic
Related topics
Life Sciences and Biotechnology
Biochemistry, Genetics and Molecular Biology
Molecular Biology
Preview of the article's first page
![First page of the article: TRDistiller: A rapid filter for enrichment of sequence datasets with proteins containing tandem repeats](/preview/png/5914060.png)
English abstract
The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so-called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required, since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such “hidden” repeats, several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large-scale proteome analysis. To speed up TR detection, we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset, enriching it with TR-containing proteins, which allows faster subsequent TR detection by other methods. The program is available upon request.
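The abstract describes the filter only at a high level: it compares the composition and order of short strings in adjacent sequence motifs. The Python sketch below illustrates that general idea and is not the TRDistiller implementation. The function names, the k-mer length, the period range, and both cutoffs are assumptions chosen for illustration. The sketch is also not optimized for speed, which the real filter would need to be.

```python
from collections import Counter


def kmers(segment, k):
    """Return the overlapping k-mers of a segment, in order."""
    return [segment[i:i + k] for i in range(len(segment) - k + 1)]


def composition_overlap(a, b):
    """Fraction of k-mers shared by two k-mer lists, ignoring order."""
    if not a or not b:
        return 0.0
    shared = sum((Counter(a) & Counter(b)).values())
    return shared / min(len(a), len(b))


def order_overlap(a, b):
    """Fraction of k-mers kept in the same relative order (LCS length)."""
    if not a or not b:
        return 0.0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / min(len(a), len(b))


def may_contain_tandem_repeat(seq, k=2, min_period=3, max_period=30,
                              comp_cutoff=0.5, order_cutoff=0.5):
    """Hypothetical pre-filter: True if any two adjacent windows look alike.

    All parameter defaults are illustrative assumptions, not published values.
    """
    for period in range(max(min_period, k), max_period + 1):
        for start in range(len(seq) - 2 * period + 1):
            left = kmers(seq[start:start + period], k)
            right = kmers(seq[start + period:start + 2 * period], k)
            if (composition_overlap(left, right) >= comp_cutoff
                    and order_overlap(left, right) >= order_cutoff):
                return True  # keep the sequence for a slower, sensitive detector
    return False  # discard: no adjacent windows resemble each other


if __name__ == "__main__":
    print(may_contain_tandem_repeat("ACDEFGH" * 3))           # perfect repeat
    print(may_contain_tandem_repeat("ACDEFGHACDEYGH"))         # one substitution
    print(may_contain_tandem_repeat("ACDEFGHIKLMNPQRSTVWY"))   # no repeated 2-mers
```

In a pipeline of this kind, sequences for which the filter returns `False` would be dropped, and the rest passed to a more sensitive detector. The cutoffs set the balance between discarding TR-free proteins and retaining TR-containing ones. Note that low-complexity regions would also pass a filter this simple.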
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Structural Biology - Volume 186, Issue 3, June 2014, Pages 386-391
Authors
François D. Richard, Andrey V. Kajava