Article ID Journal Published Year Pages File Type
6451254 Computational Biology and Chemistry 2016 8 Pages PDF
Abstract

• New protein fingerprints for capturing the topological properties of protein complexes in a linear format.
• An SVM-based predictive model for discriminating diabetes versus non-diabetes complexes with an AUC of 0.78.
• Model tested on an external data set derived from text mining a large number of PubMed abstracts.
• Network modeling to identify new disease targets.

To understand the molecular mechanism underlying any disease, knowledge of the interacting proteins in the disease pathway is essential. The number of known protein-protein interactions (PPIs) is still very limited compared to the available protein sequences of different organisms. Although experiment-based high-throughput technologies provide some data about these interactions, those data are often fairly noisy. Computational techniques for predicting protein-protein interactions are therefore important. A total of 1296 binary fingerprints that encode a combination of structural and geometric properties were developed using the crystallographic data of 15,000 protein complexes in the PDB server. In a case study, these fingerprints were created for proteins implicated in Type 2 diabetes mellitus. The fingerprints were input into an SVM-based model for discriminating disease proteins from non-disease proteins, yielding a classification accuracy of 78.2% (AUC value of 0.78) on an external data set composed of proteins retrieved via text mining of diabetes-related literature. A PPI network was constructed and analysed to explore new disease targets. The integrated approach exemplified here has potential for identifying disease-related proteins, for functional annotation, and for other proteomics studies.
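
The abstract reports both a classification accuracy and an AUC value, which measure different things: accuracy depends on a decision threshold, while AUC depends only on how the classifier ranks positives against negatives. The following is a minimal Python sketch of the two metrics, not the authors' evaluation code. The function names, the threshold of 0.0, and the toy labels and scores are all assumptions made for illustration and are not data from the study.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.0) -> float:
    """Fraction of samples whose thresholded score matches the true label.

    A score above `threshold` is read as class 1 (disease), otherwise
    class 0 (non-disease). The threshold of 0.0 is an assumption that
    mirrors the sign of an SVM decision value.
    """
    predictions = [1 if s > threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive sample gets a
    higher score than a randomly chosen negative one; ties count as 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (NOT from the study): 1 = disease, 0 = non-disease.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [1.2, 0.4, -0.3, 0.1, -0.8, -1.5]

toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
```

On this toy data the two metrics differ because one positive falls below the threshold and one negative falls above it, while most positive-negative pairs are still ranked correctly. The pairwise loop is quadratic in the number of samples, which is fine for a sketch; a rank-based formulation would be the usual choice for large data sets.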


Related Topics
Physical Sciences and Engineering > Chemical Engineering > Bioengineering