Article ID: 6451248
Journal: Computational Biology and Chemistry
Published Year: 2016
Pages: 7 Pages
File Type: PDF
Abstract

• A GO-driven method to predict protein subcellular localization.
• Deriving information content from the lower part of the GO graph.
• Constructing the feature vector by combining information from both the upper and lower parts of the three GO graphs.
• Integrated features generally outperform individual features.
• Integrated features with an SVM classifier yield over 90% prediction accuracy.

Predicting where a protein resides within a cell is important in cell biology, and computational approaches to this problem have attracted increasing attention from the biomedical community. Among the protein features used to predict subcellular localization, features derived from Gene Ontology (GO) have been shown to be superior to others. However, most work in this field focuses on the presence or absence of predefined GO terms. We propose a method that derives information from the intrinsic structure of the GO graph. Each element of the feature vector represents the information content of a GO term annotating the protein under investigation, and a support vector machine (SVM) is used as the classifier to test the extracted features. Evaluation experiments were conducted on three protein datasets. The results show that our method improves eukaryotic and human subcellular location prediction accuracy by up to 1.1% over previous studies that also used GO-based features. In particular, when the cellular component annotation is absent, our method still achieves satisfactory results, with an overall accuracy of more than 87%.
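
As an illustration of the general idea of an information-content feature vector, the following is a minimal sketch and not the authors' implementation. It assumes a descendant-count ("intrinsic") definition of information content, IC(t) = 1 - log(desc(t) + 1) / log(N), where desc(t) is the number of descendants of term t and N is the number of terms in the graph; the abstract does not state which formula the paper uses. The toy graph, the term IDs, and the function names are all hypothetical. Training the SVM on the resulting vectors is omitted.

```python
import math

# Toy DAG: child -> list of parents.
# Hypothetical term IDs for illustration only; these are not real GO terms.
PARENTS = {
    "T:root": [],
    "T:a": ["T:root"],
    "T:b": ["T:root"],
    "T:c": ["T:a"],
    "T:d": ["T:a", "T:b"],
}


def descendants(term, parents):
    """Return the set of all terms reachable downward from `term`."""
    # Invert the child -> parents map into parent -> children.
    children = {}
    for child, ps in parents.items():
        for p in ps:
            children.setdefault(p, set()).add(child)
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for c in children.get(node, ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen


def intrinsic_ic(term, parents):
    """Assumed formula: IC = 1 - log(desc + 1) / log(N).

    The root (most descendants) gets 0; leaf terms get 1.
    """
    n_terms = len(parents)
    n_desc = len(descendants(term, parents))
    return 1.0 - math.log(n_desc + 1) / math.log(n_terms)


def feature_vector(annotations, parents, term_order):
    """One element per term: its IC if the protein is annotated with it, else 0."""
    annotated = set(annotations)
    return [intrinsic_ic(t, parents) if t in annotated else 0.0
            for t in term_order]


if __name__ == "__main__":
    order = sorted(PARENTS)
    # A hypothetical protein annotated with two terms.
    print(order)
    print(feature_vector(["T:b", "T:c"], PARENTS, order))
```

Under this assumed formula, more specific terms lower in the graph receive higher values than general terms near the root, so the vector carries more than a binary presence or absence flag for each term.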


Related Topics
Physical Sciences and Engineering > Chemical Engineering > Bioengineering