Article code: 409841
Journal code: 679099
Publication year: 2012
English article: 15 pages, PDF
English title of the ISI article
Protein secondary structure prediction using DWKF based on SVR-NSGAII
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract

Predicting a protein's secondary structure is an important step towards elucidating its three-dimensional structure and function, and it remains a challenging problem in bioinformatics. The introduction of machine learning to protein structure prediction has addressed this challenge to some extent. In the machine learning and data mining literature, regression and classification are typically viewed as two distinct problems, differentiated by whether the dependent variable is continuous or categorical. There have been efforts to use regression methods to solve classification problems and vice versa. To treat a classification problem as a regression one, we propose a classification model based on Support Vector Regression (SVR), one of the powerful methods in machine intelligence. We apply the Non-dominated Sorting Genetic Algorithm II (NSGAII) to find mapping points (MPs) for rounding a real value to an integer one. NSGAII is also used to tune the SVR kernel parameters in order to enhance the performance of our model. On the other hand, choosing a suitable SVR kernel function for a particular problem can improve the prediction results remarkably, but no single kernel predicts all protein secondary structure classes with acceptable accuracy. We therefore use a Dynamic Weighted Kernel Fusion (DWKF) method to fuse three SVR kernels and achieve superior performance. To further improve our method, Position-Specific Scoring Matrix (PSSM) profiles are used as its input. The goals of this research are to tune the SVR parameters and to fuse the outputs of different SVR kernels in order to determine protein secondary structure classes accurately. Our method obtains classification accuracies of 85.79% and 84.94% on the RS126 and CB513 datasets, respectively, which are promising compared with other classification methods in the literature.
Moreover, to gauge how our method behaves in comparison with other state-of-the-art methods, we use an independent dataset, on which it achieves 81.4% accuracy. Our method does not achieve the best value for any of the considered performance metrics on the independent dataset, but its values for all metrics are quite acceptable.
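
The two ideas the abstract combines, fusing the real-valued outputs of several regressors with weights and then turning the fused value into a class with mapping points, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the class order in `CLASSES`, the helper names `fuse` and `to_class`, the fixed weights (the paper's DWKF weights are dynamic), and the mapping-point values (which the paper finds with NSGAII).

```python
from bisect import bisect_right

# Hypothetical ordering of the three secondary-structure classes
# (helix, strand, coil); the paper's actual encoding is not given here.
CLASSES = ["H", "E", "C"]


def fuse(outputs, weights):
    """Weighted average of the real-valued outputs of several regressors."""
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, outputs)) / total


def to_class(value, mapping_points):
    """Map a real value to a class label using sorted cut points.

    Two mapping points split the real line into three intervals,
    one per class.
    """
    return CLASSES[bisect_right(mapping_points, value)]


# Illustrative numbers only: one output per kernel for a single residue.
kernel_outputs = [0.8, 1.4, 1.1]
weights = [0.5, 0.3, 0.2]      # assumed fixed weights
mapping_points = [0.5, 1.5]    # assumed cut points

fused = fuse(kernel_outputs, weights)
label = to_class(fused, mapping_points)
```

In this sketch the mapping points play the role of learned rounding thresholds: moving them changes which real-valued outputs fall into each class without retraining the regressors.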

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 94, 1 October 2012, Pages 87–101