Article ID Journal Published Year Pages File Type
443515 Journal of Molecular Graphics and Modelling 2015 7 Pages PDF
Abstract

• The proposed method markedly improves the prediction quality for viral phosphorylation sites.
• Acidic residues contribute to the occurrence of viral phosphorylation sites.
• Viral phosphorylation sites show distinct differences in residue conservation.

Phosphorylation of viral proteins plays important roles in enhancing replication and inhibiting normal host-cell functions. Given this biological importance, identifying phosphorylation sites in viral proteins is of considerable interest. However, experimental methods for identifying phosphorylation sites are resource intensive. Hence, there is significant interest in developing computational methods that reliably predict viral phosphorylation sites from amino acid sequences. In this study, a new method based on a support vector machine (SVM) is proposed to identify protein phosphorylation sites in viruses. We apply an encoding scheme based on attribute grouping and position weight amino acid composition to extract physicochemical properties and sequence information of viral proteins around phosphorylation sites. In 10-fold cross-validation with a window size of 23, the prediction accuracies for phosphoserine, phosphothreonine and phosphotyrosine are 88.8%, 95.2% and 97.1%, respectively. Furthermore, in comparison with the existing methods Musite and MDD-clustered HMMs, the high sensitivity and accuracy of the presented method demonstrate its effectiveness in predicting phosphorylation sites in viral proteins.
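The evaluation protocol mentioned in the abstract (10-fold cross-validation scored by accuracy and sensitivity) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: `train_and_predict` is a hypothetical callback standing in for the SVM, the stratified fold assignment is an assumption (the abstract does not say how folds were built), and the toy data are invented for the example.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Assign every sample index to one of k folds, class by class,
    so each fold keeps roughly the same positive/negative ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for n, i in enumerate(idx):
            folds[n % k].append(i)
    return folds

def confusion(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn

def cross_validate(features, labels, train_and_predict, k=10, seed=0):
    """Pool held-out predictions over k folds and score them.

    train_and_predict(train_x, train_y, test_x) -> predicted labels
    is a hypothetical stand-in for training and applying a classifier.
    """
    y_true, y_pred = [], []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        preds = train_and_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in held_out],
        )
        y_true += [labels[i] for i in held_out]
        y_pred += list(preds)
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
    }

# Toy data and a fixed threshold rule, for illustration only.
toy_x = [i / 40 for i in range(40)]
toy_y = [1 if x >= 0.5 else 0 for x in toy_x]

def threshold_rule(train_x, train_y, test_x):
    return [1 if x >= 0.5 else 0 for x in test_x]

scores = cross_validate(toy_x, toy_y, threshold_rule)
```

Sensitivity here is TP / (TP + FN) and accuracy is (TP + TN) / (TP + TN + FP + FN). Pooling the held-out predictions before scoring is one common convention; averaging per-fold scores is another.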

Graphical abstract

A new method, which combines an SVM with EBAG + PWAA features, is designed to identify viral phosphorylation sites. All training data were retrieved from the NCBI RefSeq and P3DB databases. A sliding window strategy was then used to extract positive and negative training samples from protein sequences; each sample is a peptide in which a serine, threonine or tyrosine is symmetrically surrounded by flanking residues. To further evaluate the performance of the method and compare it with existing methods, an independent test set was extracted from virPTM. To ensure unbiased and objective results, the ratio of positive to negative samples was 1:1. Subsequently, the encoding scheme based on attribute grouping (EBAG) and position weight amino acid composition (PWAA) were used to extract sequence features. Feature analyses revealed that acidic residues contribute to the occurrence of viral phosphorylation sites, and that there are distinct kinase-specific and residue-conservative differences among serine, threonine and tyrosine phosphorylation sites of viral proteins.
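The sliding window extraction and 1:1 balancing described above can be sketched as follows. This is a hedged illustration only: the window size of 23 comes from the abstract, but the padding symbol `X` for positions beyond a terminus, the random downsampling used to reach a 1:1 ratio, the function names, and the toy sequence with its annotated positions are all assumptions made for the example.

```python
import random

WINDOW = 23           # window size reported in the abstract
FLANK = WINDOW // 2   # 11 flanking residues on each side of the centre
PAD = "X"             # assumed padding symbol past the sequence termini

def extract_windows(sequence, phospho_positions, residue="S"):
    """Collect a window around every occurrence of `residue`.

    phospho_positions holds 0-based indices annotated as phosphorylated;
    windows centred there are positives, all others are negatives.
    """
    padded = PAD * FLANK + sequence + PAD * FLANK
    positives, negatives = [], []
    for i, aa in enumerate(sequence):
        if aa != residue:
            continue
        window = padded[i:i + WINDOW]  # sequence[i] sits at the centre
        if i in phospho_positions:
            positives.append(window)
        else:
            negatives.append(window)
    return positives, negatives

def balance(positives, negatives, seed=0):
    """Randomly downsample the larger class to a 1:1 ratio (assumed strategy)."""
    rng = random.Random(seed)
    n = min(len(positives), len(negatives))
    return rng.sample(positives, n), rng.sample(negatives, n)

# Hypothetical toy sequence and annotation, for illustration only.
seq = "MSDEESLKRTSPEYDDAS"
pos, neg = extract_windows(seq, {1, 10}, residue="S")
pos, neg = balance(pos, neg)
```

Running the same extraction with `residue="T"` or `residue="Y"` would give the threonine and tyrosine sample sets, which the abstract reports separately.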
