Article code: 443515 | Journal code: 692730 | Publication year: 2015 | English article: 7 pages, PDF | Full-text version: free download
English title of the ISI article
Using support vector machines to identify protein phosphorylation sites in viruses
Keywords
Phosphorylation site; virus protein; support vector machine; encoding scheme based on attribute grouping; position weight amino acid composition
Related topics
Engineering and Basic Sciences, Chemistry, Theoretical and Practical Chemistry
English abstract


• The proposed method remarkably improves the prediction quality for viral phosphorylation sites.
• Acidic residues contribute to the occurrence of viral phosphorylation sites.
• There are distinct residue-conservative differences among viral phosphorylation sites.

Phosphorylation of viral proteins plays important roles in enhancing replication and inhibiting normal host-cell functions. Given this biological importance, identifying viral protein phosphorylation sites is of considerable interest. However, experimental methods for identifying phosphorylation sites are resource intensive. Hence, there is significant interest in developing computational methods for reliable prediction of viral phosphorylation sites from amino acid sequences. In this study, a new method based on a support vector machine (SVM) is proposed to identify protein phosphorylation sites in viruses. We apply an encoding scheme based on attribute grouping (EBAG) and position weight amino acid composition (PWAA) to extract physicochemical properties and sequence information of viral proteins around phosphorylation sites. In 10-fold cross-validation with a window size of 23, the prediction accuracies for phosphoserine, phosphothreonine and phosphotyrosine are 88.8%, 95.2% and 97.1%, respectively. Furthermore, in comparison with the existing methods Musite and MDD-clustered HMMs, the high sensitivity and accuracy of the proposed method demonstrate its effectiveness in predicting phosphorylation sites in viral proteins.
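The abstract names the two encodings but gives neither the attribute groups nor the PWAA formula. The following is a minimal sketch under stated assumptions: a five-class grouping (hydrophobic, polar, acidic, basic, gap) for EBAG, and a position-weighted sum of the form C_i = 1/(L(L+1)) · Σ_j x_ij (j + |j|/L) for PWAA, where L is the number of flanking residues on each side and x_ij is 1 when residue i occupies position j. Both choices, and all function names, are illustrative and are not confirmed by this page.

```python
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# ASSUMPTION: this grouping is illustrative; the abstract does not list it.
# 'X' stands for padding or an unknown residue.
EBAG_GROUPS: Dict[str, str] = {
    "hydrophobic": "AFGILMPVW",
    "polar": "CNQSTY",
    "acidic": "DE",
    "basic": "HKR",
    "gap": "X",
}
GROUP_NAMES = list(EBAG_GROUPS)
RESIDUE_TO_GROUP = {
    aa: index
    for index, name in enumerate(GROUP_NAMES)
    for aa in EBAG_GROUPS[name]
}


def ebag_encode(peptide: str) -> List[int]:
    """One-hot encode every position by its (assumed) attribute group."""
    features: List[int] = []
    for aa in peptide:
        one_hot = [0] * len(GROUP_NAMES)
        # Residues outside the 20 standard ones fall into the gap group.
        one_hot[RESIDUE_TO_GROUP.get(aa, RESIDUE_TO_GROUP["X"])] = 1
        features.extend(one_hot)
    return features


def pwaa_encode(peptide: str) -> List[float]:
    """Position-weighted composition, one value per standard amino acid.

    ASSUMPTION: uses C_i = 1/(L(L+1)) * sum_j x_ij * (j + |j|/L),
    with j running from -L to +L around the central residue.
    """
    if len(peptide) % 2 == 0 or len(peptide) < 3:
        raise ValueError("peptide length must be odd and at least 3")
    half = len(peptide) // 2  # L: flanking residues on each side
    norm = half * (half + 1)
    features: List[float] = []
    for aa in AMINO_ACIDS:
        total = 0.0
        for index, residue in enumerate(peptide):
            j = index - half  # signed offset from the central site
            if residue == aa:
                total += j + abs(j) / half
        features.append(total / norm)
    return features


def encode(peptide: str) -> List[float]:
    """Concatenate both encodings into one feature vector."""
    return [float(v) for v in ebag_encode(peptide)] + pwaa_encode(peptide)
```

For a 23-residue window this yields 23 × 5 = 115 EBAG values plus 20 PWAA values. The resulting vectors could then be passed to any SVM implementation; the kernel and parameters used in the study are not given here.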

A new method that combines an SVM with EBAG + PWAA features is designed to identify viral phosphorylation sites. All training data were retrieved from the NCBI RefSeq and P3DB databases. We then used a sliding-window strategy to extract positive and negative samples from the protein sequences as training data; each sample is a peptide with a serine, threonine or tyrosine at its center, symmetrically surrounded by flanking residues. To further evaluate the performance of our method and compare it with existing methods, an independent test set was extracted from virPTM. To ensure unbiased and objective results, the ratio of positive to negative samples was set to 1:1. Subsequently, the encoding scheme based on attribute grouping (EBAG) and position weight amino acid composition (PWAA) were used to extract sequence features. Feature analyses revealed that acidic residues contribute to the occurrence of viral phosphorylation sites and that there are distinct kinase-specific and residue-conservative differences among the serine, threonine and tyrosine phosphorylation sites of virus proteins.
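The sliding-window extraction and 1:1 sampling described above can be sketched roughly as follows. This is an illustration, not the authors' pipeline: the function names, the 0-based site indexing, the `X` padding character for windows that run past the sequence ends, and random down-sampling as the way to reach a 1:1 ratio are all assumptions.

```python
import random
from typing import List, Set, Tuple


def extract_windows(
    sequence: str,
    known_sites: Set[int],
    targets: str = "STY",
    window_size: int = 23,
    pad: str = "X",
) -> Tuple[List[str], List[str]]:
    """Cut a window around every target residue.

    known_sites holds 0-based positions of annotated phosphorylation
    sites (hypothetical input format). Windows centered on a known site
    are positives; windows on other S/T/Y residues are negatives.
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd")
    flank = window_size // 2
    # ASSUMPTION: pad both ends so terminal residues still get full windows.
    padded = pad * flank + sequence + pad * flank
    positives: List[str] = []
    negatives: List[str] = []
    for pos, residue in enumerate(sequence):
        if residue not in targets:
            continue
        # In the padded string the residue sits at pos + flank, so this
        # slice places it exactly in the middle of the window.
        peptide = padded[pos : pos + window_size]
        if pos in known_sites:
            positives.append(peptide)
        else:
            negatives.append(peptide)
    return positives, negatives


def balance_one_to_one(
    positives: List[str], negatives: List[str], seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Randomly down-sample the larger class to a 1:1 ratio."""
    rng = random.Random(seed)
    n = min(len(positives), len(negatives))
    return rng.sample(positives, n), rng.sample(negatives, n)
```

Passing `targets="S"`, `"T"` or `"Y"` would build a separate data set per residue type, in line with the per-residue accuracies reported in the abstract.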

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Molecular Graphics and Modelling - Volume 56, March 2015, Pages 84–90