Article ID: 1172674
Journal: Analytical Biochemistry
Published Year: 2016
Pages: 6
File Type: PDF
Abstract

As an important post-translational modification of prokaryotic proteins, pupylation plays a key role in regulating various biological processes. Accurate identification of pupylation sites is crucial for understanding the underlying mechanisms of pupylation. Although several computational methods have been developed to identify pupylation sites, their prediction accuracy remains unsatisfactory. Here, a novel bioinformatics tool named IMP–PUP is proposed to improve the prediction of pupylation sites. IMP–PUP is built on the composition of k-spaced amino acid pairs (CKSAAP) and trained with a modified semi-supervised self-training support vector machine (SVM) algorithm. The proposed algorithm iteratively trains a series of SVM classifiers on both annotated and non-annotated pupylated proteins. Computational results show that IMP–PUP achieves areas under the receiver operating characteristic curve (AUC) of 0.91, 0.73, and 0.75 on our training set, Tung's testing set, and our testing set, respectively, which are higher than those of the different-error-costs SVM algorithm and the original self-training SVM algorithm. Independent tests also show that IMP–PUP significantly outperforms three existing pupylation site predictors: GPS–PUP, iPUP, and pbPUP. IMP–PUP can therefore be a useful tool for accurate prediction of pupylation sites. A MATLAB software package for IMP–PUP is available at https://juzhe1120.github.io/.
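
To make the feature encoding named in the abstract concrete, the following is a minimal Python sketch of the composition of k-spaced amino acid pairs, as the encoding is commonly defined: for each gap k, count the residue pairs separated by exactly k positions and divide by the number of such pairs in the fragment. It is not taken from the IMP–PUP MATLAB package. The function name `cksaap`, the 20-letter alphabet, the default `k_max=2`, and the example fragment are all assumptions made for illustration; the abstract does not state the window size or the range of k that IMP–PUP uses.

```python
from itertools import product

# Assumed alphabet: the 20 standard amino acids (no gap/padding symbol).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def cksaap(fragment, k_max=2):
    """Encode a peptide fragment as k-spaced amino acid pair frequencies.

    For each k in 0..k_max, residues at positions i and i + k + 1 form a
    pair. The result concatenates one 400-dimensional frequency vector
    per k, so its length is 400 * (k_max + 1). k_max=2 is an assumed
    default, not a value reported in the abstract.
    """
    features = []
    for k in range(k_max + 1):
        counts = [0] * len(PAIRS)
        n_pairs = max(len(fragment) - k - 1, 0)
        for i in range(n_pairs):
            idx = PAIR_INDEX.get(fragment[i] + fragment[i + k + 1])
            if idx is not None:  # skip pairs with non-standard residues
                counts[idx] += 1
        denom = n_pairs if n_pairs > 0 else 1
        features.extend(c / denom for c in counts)
    return features


# Hypothetical fragment centred on a lysine (K); not from any dataset.
vector = cksaap("AKAKA", k_max=1)
print(len(vector))  # 2 gaps x 400 pairs
```

A vector of this kind, computed for each lysine-centred fragment, is the sort of input an SVM classifier would then be trained on.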

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry