Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
10884712 | 1079482 | 2005 | 8-page PDF | Free download |
English title of the ISI article
Automated derivation and refinement of sequence length patterns for protein sequences using evolutionary computation
Related topics
Physical Sciences and Engineering
Mathematics
Modelling and Simulation
English abstract
Several stratagems are used in protein bioinformatics for the classification of proteins based on sequence, structure or function. We explore the concept of a minimal signature embedded in a sequence that defines the likely position of a protein in a classification. Specifically, we address the derivation of sparse profiles for the G-protein coupled receptor (GPCR) clan of integral membrane proteins. We present an evolutionary algorithm (EA) for the derivation of sparse profiles (signatures) without the need to supply a multiple alignment. We also apply an evolution strategy (ES) to the problem of pattern and profile refinement. Patterns were derived for the GPCR 'superfamily' and GPCR families 1-3 individually from starting populations of randomly generated signatures, using a database of integral membrane protein sequences and an objective function using a modified receiver operator characteristic (ROC) statistic. The signature derived for the family 1 GPCR sequences was shown to perform very well in a stringent cross-validation test, detecting 76% of unseen GPCR sequences at 5% error. Application of the ES refinement method to a signature developed by a previously described method [Sadowski, M.I., Parish, J.H., 2003. Automated generation and refinement of protein signatures: case study with G-protein coupled receptors. Bioinformatics 19, 727-734] resulted in a 6% increase of coverage for 5% error as measured in the validation test. We note that there might be a limit to this or any classification of proteins based on patterns or schemata.
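The abstract reports performance as coverage at a fixed error level (for example, 76% of unseen GPCR sequences detected at 5% error). The sketch below illustrates one plausible reading of that measure. It assumes that "error" means the fraction of non-member sequences scoring above the decision threshold, and that each sequence has already been given a numeric score by a signature. The function name `coverage_at_error` and the example scores are hypothetical. This is not the modified ROC statistic used as the paper's objective function, which the abstract does not specify.

```python
import math


def coverage_at_error(pos_scores, neg_scores, max_error=0.05):
    """Fraction of positives scoring strictly above a threshold chosen so
    that at most `max_error` of the negatives score above it.

    Assumption: "error" is read as the false-positive rate on the negatives.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")

    neg_sorted = sorted(neg_scores, reverse=True)
    # Number of negatives permitted above the threshold (small epsilon
    # guards against floating-point products such as 0.9999999).
    allowed = math.floor(max_error * len(neg_sorted) + 1e-9)

    if allowed >= len(neg_sorted):
        threshold = float("-inf")  # every negative may be a false positive
    else:
        # Score of the first negative that must be rejected.
        threshold = neg_sorted[allowed]

    hits = sum(1 for s in pos_scores if s > threshold)
    return hits / len(pos_scores)


# Illustrative scores only (not data from the paper).
family_scores = [0.9, 0.8, 0.7, 0.2]
other_scores = [0.85, 0.75] + [0.1] * 18
coverage = coverage_at_error(family_scores, other_scores)  # 0.5
```

With 20 negatives and a 5% error budget, one negative is allowed above the threshold, so the threshold falls at the second-highest negative score (0.75) and two of the four positives are counted as detected.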
Publisher
Database: Elsevier - ScienceDirect
Journal: Biosystems - Volume 81, Issue 3, September 2005, Pages 247-254
Authors
M.I. Sadowski, J.H. Parish, D.R. Westhead