Article code | Journal code | Year of publication | English article | Full-text version
4497376 | 1318931 | 2010 | 7 pages, PDF | Free download
English title of the ISI article
Predicting the state of cysteines based on sequence information
Related topics
Life Sciences and Biotechnology, Agricultural and Biological Sciences, Agricultural and Biological Sciences (General)
English abstract
A three-stage support vector machine (SVM) was constructed to predict the state of cysteines by fusing the sequence information, evolutionary information and annotation information of protein sequences. The first stage predicts whether a protein sequence contains disulfide bonds, and the second predicts whether all of its cysteines are involved in disulfide bonds. In the last stage, one SVM was constructed to predict which cysteines, among all the cysteines in a protein, are involved in disulfide bonds. The three SVMs perform well, with overall prediction accuracies of 90.05%, 96.36% and 80.00%, respectively, which indicates that the features selected in this work are effective for predicting the state of cysteines. In addition, existing methods have focused mainly on prediction performance and have not shown how important a role each feature plays in the prediction. A feature importance measure, the F-score function, was therefore used to evaluate these features. The result shows that, among these protein descriptors, evolutionary information is the most important feature for representing disulfide-containing proteins. The prediction software and data sets used in this article are freely available at http://cic.scu.edu.cn/bioinformatics/Predict_Cys.zip.
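The three-stage decision flow described in the abstract can be sketched as follows. This is only an illustration of the control flow, not the authors' software: the function names `cysteine_positions` and `predict_cysteine_states` are hypothetical, the three stage classifiers are passed in as plain callables standing in for trained SVMs, and the assumption that an early stage short-circuits the later ones is an interpretation of the abstract rather than something it states.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the three trained SVMs (assumption: each
# stage can be treated as a yes/no predictor on a protein or a residue).
ProteinClassifier = Callable[[str], bool]
ResidueClassifier = Callable[[str, int], bool]


def cysteine_positions(sequence: str) -> List[int]:
    """Return the 0-based indices of every cysteine (C) in the sequence."""
    return [i for i, residue in enumerate(sequence) if residue == "C"]


def predict_cysteine_states(
    sequence: str,
    has_disulfide: ProteinClassifier,  # stage 1: does the protein contain disulfide bonds?
    all_bonded: ProteinClassifier,     # stage 2: are all cysteines bonded?
    is_bonded: ResidueClassifier,      # stage 3: is this particular cysteine bonded?
) -> Dict[int, bool]:
    """Map each cysteine position to True (bonded) or False (free)."""
    positions = cysteine_positions(sequence)

    # Stage 1: no disulfide bonds predicted, so every cysteine is free.
    if not positions or not has_disulfide(sequence):
        return {p: False for p in positions}

    # Stage 2: all cysteines predicted to take part in disulfide bonds.
    if all_bonded(sequence):
        return {p: True for p in positions}

    # Stage 3: mixed case, so each cysteine is classified individually.
    return {p: is_bonded(sequence, p) for p in positions}
```

In practice each callable would wrap a trained model together with the feature extraction (sequence, evolutionary and annotation descriptors) that the abstract mentions; those parts are not specified here and are left out of the sketch.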
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Theoretical Biology - Volume 267, Issue 3, 7 December 2010, Pages 312-318