Article ID Journal Published Year Pages File Type
15021 Computational Biology and Chemistry 2015 17 Pages PDF
Abstract

• A random forest based approach is proposed to predict ion channel families and their subfamilies using sequence-derived features.
• Amino acid composition, dipeptide composition, correlation features, composition, transition and distribution, and pseudo amino acid composition are used to represent the protein sequence.
• Minimum redundancy and maximum relevance feature selection is used to find the optimal number of features and improve prediction performance.
• High accuracies, MCC and ROC area values were obtained.

Ion channels are integral membrane proteins that control the flow of ions across the cell membrane. Different types of ion channels perform a variety of biological functions. For new drug discovery, it is therefore necessary to develop a computational intelligence based approach for the reliable prediction of ion channel families and their subfamilies. In this paper, a random forest based approach is proposed to predict ion channel families and their subfamilies using sequence-derived features. Seven feature vectors are used to represent each protein sample, including amino acid composition, dipeptide composition, correlation features, composition, transition and distribution, and pseudo amino acid composition. Minimum redundancy and maximum relevance feature selection is used to find the optimal number of features and improve prediction performance. Using 10-fold cross validation, the proposed method achieved overall accuracies of 100%, 98.01%, 91.5%, 93.0%, 92.2%, 78.6%, 95.5% and 84.9%, MCC values of 1.00, 0.92, 0.88, 0.88, 0.90, 0.79, 0.91 and 0.81, and ROC area values of 1.00, 0.99, 0.99, 0.99, 0.99, 0.95, 0.99 and 0.96 in the following tasks, respectively: discriminating ion channels from non-ion channels; discriminating voltage-gated from ligand-gated ion channels; classifying the four subfamilies (calcium, potassium, sodium and chloride) of voltage-gated ion channels; classifying the four subfamilies of ligand-gated ion channels; and predicting the subfamilies of voltage-gated calcium, potassium, sodium and chloride ion channels.


Related Topics
Physical Sciences and Engineering Chemical Engineering Bioengineering