Article ID: 6370256
Journal: Journal of Theoretical Biology
Published Year: 2014
Pages: 7
File Type: PDF
Abstract
Extracellular matrix proteins (ECMs) are widely found in the tissues of multicellular organisms. They consist of various secreted proteins, mainly polysaccharides and glycoproteins. ECMs are involved in the exchange of materials and information between resident cells and the external environment. Accurate identification of ECMs is a significant step in understanding the evolution of cancer, and it promises a wide range of potential applications as therapeutic targets or diagnostic markers. In this paper, an accurate computational method named PECM is proposed for identifying ECMs. We explore various sequence-derived discriminative features, including evolutionary information, predicted secondary structure, and physicochemical properties. Rather than simply combining the features, which may introduce redundant information and unwanted noise, we use the Fisher-Markov selector and an incremental feature selection approach to search for the optimal feature subset. We then train our model with a support vector machine (SVM). PECM achieves good prediction performance, with accuracy (ACC) of about 86% on the testing dataset and about 90% on the independent dataset, which is competitive with state-of-the-art ECM prediction tools. A web server named PECM, which implements the proposed approach, is freely available at http://59.73.198.144:8088/PECM/.
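The feature-selection idea in the abstract (rank the features, then grow the feature set one feature at a time and keep the best-scoring prefix) can be illustrated with a minimal sketch. This is not the PECM implementation. It assumes a plain two-class Fisher score as a stand-in for the Fisher-Markov selector, a nearest-centroid classifier as a stand-in for the SVM, and synthetic data; all function names are hypothetical and use only the Python standard library.

```python
import random
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score: squared gap between class means over summed class variances.

    Assumption: a simple stand-in for the Fisher-Markov selector named in the abstract.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pvariance(pos) + pvariance(neg)
    return (mean(pos) - mean(neg)) ** 2 / (spread + 1e-12)


def rank_features(X, labels):
    """Return feature indices sorted from most to least discriminative."""
    n_features = len(X[0])
    scores = [fisher_score([row[j] for row in X], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def centroid_accuracy(X_train, y_train, X_test, y_test, subset):
    """ACC of a nearest-centroid classifier restricted to the features in `subset`.

    Assumption: the nearest-centroid rule replaces the SVM to keep the sketch dependency-free.
    """
    centroids = {}
    for cls in (0, 1):
        rows = [row for row, y in zip(X_train, y_train) if y == cls]
        centroids[cls] = [mean(row[j] for row in rows) for j in subset]

    def predict(row):
        dist = {
            cls: sum((row[j] - c) ** 2 for j, c in zip(subset, centre))
            for cls, centre in centroids.items()
        }
        return min(dist, key=dist.get)

    correct = sum(predict(row) == y for row, y in zip(X_test, y_test))
    return correct / len(y_test)


def incremental_feature_selection(X_train, y_train, X_val, y_val):
    """Grow the feature set in ranked order; keep the prefix with the best validation ACC."""
    ranking = rank_features(X_train, y_train)
    best_subset, best_acc = [], -1.0
    for k in range(1, len(ranking) + 1):
        subset = ranking[:k]
        acc = centroid_accuracy(X_train, y_train, X_val, y_val, subset)
        if acc > best_acc:
            best_subset, best_acc = subset, acc
    return best_subset, best_acc


def make_toy_data(n, rng):
    """Synthetic data: features 0 and 1 carry class signal, features 2-5 are pure noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_toy_data(200, rng)
X_val, y_val = make_toy_data(100, rng)
subset, acc = incremental_feature_selection(X_train, y_train, X_val, y_val)
print("selected features:", subset, "validation ACC:", round(acc, 3))
```

In a real pipeline the ranking step and the classifier would be swapped for the methods the paper names, and the subset size would typically be chosen by cross-validation rather than a single validation split.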
Related Topics
Life Sciences > Agricultural and Biological Sciences > Agricultural and Biological Sciences (General)