Article code: 6451380
Journal code: 1416281
Publication year: 2017
English article: 8-page PDF
Full-text version: free download
English title of the ISI article
Research Article: Development of a sugar-binding residue prediction system from protein sequences using support vector machine
Keywords
Support vector machine; Sugar-binding proteins; Sugar-binding residue prediction; Carbohydrate; Machine learning
Related topics
Engineering and Basic Sciences, Chemical Engineering, Bioengineering
Abstract


- Sugar-binding proteins (SBP) play essential functions in organisms.
- Experimental identification of SBP functions is both time-consuming and costly, so computational methods may be useful for their prediction.
- We classified SBP into two classes, acidic SBP and nonacidic SBP, and developed a predictor for each class, plus a combination predictor, using the support vector machine (SVM) on amino acid sequences.
- Our method achieved high values of the area under the receiver operating characteristic curve in a five-fold cross-validation test.
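
As a rough illustration of the evaluation named in the last highlight, the sketch below computes the area under the ROC curve from per-residue scores and builds five train/test splits. It is not the authors' code: the function names `roc_auc` and `five_fold_indices` are hypothetical, the AUC uses the standard rank-sum (Mann-Whitney) identity, and the interleaved fold assignment is an assumption, since the abstract does not say how folds were formed.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        # Group tied scores and give every member the average rank.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        rank_sum += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def five_fold_indices(n_items: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits; items are assigned round-robin (assumption)."""
    folds = [list(range(f, n_items, k)) for f in range(k)]
    splits = []
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        splits.append((train, test))
    return splits
```

In a residue-level setting it is usually safer to split by protein rather than by residue, so that neighbouring residues of one chain do not appear in both training and test folds; the indices here could just as well index proteins.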

Several methods have been proposed for protein-sugar binding site prediction using machine learning algorithms. However, they are not effective at learning the varied properties of binding site residues that arise from the different interactions between proteins and sugars. In this study, we classified sugars into acidic and nonacidic sugars and showed that their binding sites have different amino acid occurrence frequencies. Using this result, we developed sugar-binding residue predictors dedicated to the two classes of sugars: an acidic sugar binding predictor and a nonacidic sugar binding predictor. We also developed a combination predictor, which combines the results of the two predictors. We showed that when a sugar is known to be acidic, the acidic sugar binding predictor achieves the best performance, and that when a sugar is known to be nonacidic or its class is unknown, the combination predictor achieves the best performance. Our method uses only amino acid sequences for prediction. A support vector machine was used as the machine learning algorithm, and the position-specific scoring matrix (PSSM) created by the position-specific iterative basic local alignment search tool (PSI-BLAST) was used as the feature vector. We evaluated the performance of the predictors using five-fold cross-validation. We have released our system as an open-source freeware tool on GitHub (https://doi.org/10.5281/zenodo.61513).
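
To make the feature construction more concrete, here is a minimal sketch of one common way to turn a PSSM into per-residue feature vectors for an SVM: concatenating the PSSM rows in a window centred on each residue. The abstract does not state the window size, the padding scheme, or whether a window was used at all, so `window_features`, `sequence_features`, `half_width`, and the zero padding at the sequence ends are all assumptions made for illustration.

```python
from typing import List

Pssm = List[List[float]]  # one row of substitution scores per residue


def window_features(pssm: Pssm, center: int, half_width: int) -> List[float]:
    """Concatenate PSSM rows for positions center-half_width .. center+half_width."""
    n_cols = len(pssm[0])
    features: List[float] = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(pssm):
            features.extend(pssm[pos])
        else:
            # Assumed convention: zero-pad positions beyond the sequence ends.
            features.extend([0.0] * n_cols)
    return features


def sequence_features(pssm: Pssm, half_width: int = 2) -> List[List[float]]:
    """Build one fixed-length feature vector per residue of the sequence."""
    return [window_features(pssm, i, half_width) for i in range(len(pssm))]
```

Each resulting vector has `(2 * half_width + 1) * n_cols` entries, so every residue yields an input of the same length regardless of its position in the chain, which is what a kernel classifier such as an SVM requires.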

Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Biology and Chemistry - Volume 66, February 2017, Pages 36-43