Article code: 4752602
Journal code: 1416277
Publication year: 2017
English article: 9-page PDF
Full-text version: Free download
English title of the ISI article
Enzyme classification using multiclass support vector machine and feature subset selection
Title translated from Persian
Enzyme classification using a multiclass support vector machine and feature subset selection
Related topics
Engineering and Basic Sciences, Chemical Engineering, Bioengineering
English abstract


- Physico-chemical properties important for protein function classification are identified using the Orthogonal Forward Selection (OFS) method.
- A given protein is classified as an enzyme or a non-enzyme using a binary SVM.
- An enzyme is assigned to one of 6 functional classes using a multiclass SVM.
- In comparison, the proposed model (OFS followed by a multiclass SVM) performs better than the other methods.
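
As a rough illustration of the multiclass step in the highlights above, the sketch below builds a six-class classifier from binary linear SVMs in a one-vs-rest arrangement. The abstract does not say which multiclass scheme, kernel, or solver the authors used, so the one-vs-rest scheme, the linear kernel, the batch subgradient training, the hyperparameters, and the toy data are all assumptions made for this example. The function names (`train_binary_svm`, `train_one_vs_rest`, `predict`) are hypothetical.

```python
def train_binary_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear soft-margin SVM by batch subgradient descent.

    X is a list of feature vectors and y holds labels in {-1, +1}.
    The objective is lam/2 * ||w||^2 + mean hinge loss (assumed setup).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decision(model, x):
    """Signed score of x under one binary model (w, b)."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_one_vs_rest(X, labels):
    """Train one binary SVM per class: that class (+1) against all others (-1)."""
    models = {}
    for c in sorted(set(labels)):
        y = [1 if lab == c else -1 for lab in labels]
        models[c] = train_binary_svm(X, y)
    return models


def predict(models, x):
    """Return the class whose binary model gives the highest score."""
    return max(models, key=lambda c: decision(models[c], x))


# Hypothetical toy data: 6 classes, 6 made-up features, 3 samples per class.
# Real inputs would be physico-chemical property vectors, not one-hot patterns.
X_train, y_train = [], []
for c in range(1, 7):
    for scale in (0.8, 1.0, 1.2):
        x = [0.0] * 6
        x[c - 1] = scale
        X_train.append(x)
        y_train.append(c)

models = train_one_vs_rest(X_train, y_train)
query = [0.1, 0.9, 0.0, 0.2, 0.0, 0.1]  # dominated by the second feature
predicted = predict(models, query)
```

The same `train_binary_svm` function could in principle serve the enzyme versus non-enzyme step by labelling enzymes +1 and non-enzymes -1. Chaining the two steps is an assumption here, not something the highlights spell out.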

Proteins are the macromolecules responsible for almost all biological processes in a cell. With large numbers of protein sequences now available from different sequencing projects, the challenge for scientists is to characterize their functions. Because wet-lab methods are time-consuming and expensive, many computational methods have been proposed, such as FASTA, PSI-BLAST, DNA microarray clustering, and nearest-neighbor classification on protein-protein interaction networks. The support vector machine (SVM) is one such method and has been used successfully for several problems, including protein fold recognition and protein structure prediction. In 2003, Cai et al. used SVMs to classify proteins into different functional classes and to predict their function, representing the protein sequences by their physico-chemical properties. In this paper, a model comprising feature subset selection followed by a multiclass SVM is proposed to determine the functional class of a newly generated protein sequence. To train and test the model, 32 physico-chemical properties of enzymes from 6 enzyme classes are considered. To determine which features contribute significantly to functional classification, the Sequential Forward Floating Selection (SFFS), Orthogonal Forward Selection (OFS), and SVM Recursive Feature Elimination (SVM-RFE) algorithms are used. Of the 32 properties considered initially, only 20 features are sufficient to classify the proteins into their functional classes, with an accuracy ranging from 91% to 94%. In comparison, OFS followed by SVM performs better than the other methods. Our model generalizes the existing model to include multiclass classification and to identify the most significant features affecting protein function.
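
To make the feature subset selection step more concrete, here is a minimal sketch of a forward selection that orthogonalizes each candidate feature against the features already chosen (Gram-Schmidt), so that redundant features score near zero. The abstract does not give the paper's OFS criterion. This sketch substitutes the error reduction ratio from orthogonal least squares, computed against numerically coded labels, which is an assumption and is only reasonable for a two-class toy case. The function name `orthogonal_forward_selection`, the tolerance, and the data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(col):
    """Subtract the mean so that constant offsets carry no weight."""
    m = sum(col) / len(col)
    return [c - m for c in col]


def orthogonal_forward_selection(X, y, k, tol=1e-12):
    """Greedily pick up to k feature indices.

    Each candidate column is orthogonalized against the already selected
    (orthogonal) columns; the score is the share of label variance the
    residual explains (error reduction ratio, an assumed criterion).
    """
    n_features = len(X[0])
    cols = [center([row[j] for row in X]) for j in range(n_features)]
    target = center([float(v) for v in y])
    yy = dot(target, target)
    if yy == 0:
        raise ValueError("labels must not all be identical")
    selected, basis = [], []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        best_j, best_score, best_q = None, -1.0, None
        for j in remaining:
            q = cols[j][:]
            for b in basis:  # remove what selected features already explain
                coef = dot(q, b) / dot(b, b)
                q = [qi - coef * bi for qi, bi in zip(q, b)]
            qq = dot(q, q)
            if qq <= tol:  # constant or fully redundant feature
                continue
            score = dot(q, target) ** 2 / (qq * yy)
            if score > best_score:
                best_j, best_score, best_q = j, score, q
        if best_j is None:  # nothing informative is left
            break
        selected.append(best_j)
        basis.append(best_q)
        remaining.remove(best_j)
    return selected


# Hypothetical toy data: column 1 is exactly twice column 0 (redundant),
# column 2 carries some independent signal, column 3 is constant.
X = [
    [1.0, 2.0, 1.0, 7.0],
    [2.0, 4.0, 0.0, 7.0],
    [1.0, 2.0, 1.0, 7.0],
    [5.0, 10.0, 0.0, 7.0],
    [6.0, 12.0, 1.0, 7.0],
    [5.0, 10.0, 0.0, 7.0],
]
y = [0, 0, 0, 1, 1, 1]
selected = orthogonal_forward_selection(X, y, k=4)
```

Even though four features are requested, the redundant and constant columns are skipped, so fewer indices come back. In the setting described in the abstract, `X` would have 32 columns and `k` would be 20.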


Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Biology and Chemistry - Volume 70, October 2017, Pages 211-219
Authors
, , ,