Article ID | Journal | Published Year | Pages |
---|---|---|---|
4496100 | Journal of Theoretical Biology | 2014 | 8 Pages |
• A novel sequence-based method (ATPBR) is proposed for predicting ATP-binding residues.
• ATPBR is good at predicting ATP-binding residues in proteins.
• A novel feature PSSMPP is proposed and has advantages over other features.
• The mRMR-IFS feature selection method is used and improves the prediction performance.
We develop a computational and statistical approach (ATPBR) for predicting ATP-binding residues in proteins from amino acid sequences, using random forests with a novel hybrid feature. The hybrid feature combines a new feature called PSSMPP, the predicted secondary structure, and orthogonal binary vectors. The mRMR-IFS feature selection method is used to construct the best prediction model. ATPBR achieves significantly improved performance over existing methods, with 87.53% accuracy and a Matthews correlation coefficient of 0.554. Further analysis demonstrates that PSSMPP distinguishes more effectively between ATP-binding and non-binding residues. Moreover, the optimal features selected by the mRMR-IFS method improve the prediction performance and may provide useful insights into the mechanisms of ATP-protein interactions.
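Two terms in the abstract can be illustrated with a short sketch: the orthogonal binary vector, which encodes each residue as a 20-dimensional one-hot vector, and the Matthews correlation coefficient, which is computed from the confusion-matrix counts. The function names, the alphabetical residue ordering, and the handling of unknown residues below are assumptions made for illustration; the abstract does not specify how ATPBR implements either.

```python
import math

# Assumed ordering of the 20 standard amino acids (hypothetical choice).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def orthogonal_binary_vector(residue):
    """Encode one residue as a 20-dimensional one-hot vector.

    Unknown residues (e.g. 'X') map to an all-zero vector; this is an
    assumption, not something stated in the abstract.
    """
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of residues classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)
```

Because ATP-binding residues are typically far rarer than non-binding ones, accuracy alone can look high even for a weak predictor, which is one reason the coefficient is reported alongside it.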