Article code | Journal code | Publication year | English article | Full text
8876622 | 1623758 | 2018 | 13 pages, PDF | Free download
English title of the ISI article
DPP-PseAAC: A DNA-binding protein prediction model using Chou's general PseAAC
Related topics
Life Sciences and Biotechnology; Agricultural and Biological Sciences; Agricultural and Biological Sciences (General)
Abstract
A DNA-binding protein (DNA-BP) is a protein that can bind to and interact with DNA. Identifying DNA-BPs experimentally is expensive and time-consuming, so fast and accurate computational methods are sought for predicting whether a protein can bind to DNA. In this paper, we focus on building a new computational model that identifies DNA-BPs efficiently and accurately. Our model extracts meaningful information directly from protein sequences, without any dependence on functional domain or structural information. After feature extraction, we employ a Random Forest (RF) model to rank the features. We then use the Recursive Feature Elimination (RFE) method to extract an optimal set of features and train a prediction model using a Support Vector Machine (SVM) with a linear kernel. Our proposed method, named DNA-binding Protein Prediction model using Chou's general PseAAC (DPP-PseAAC), demonstrates superior performance compared to state-of-the-art predictors on a standard benchmark dataset. DPP-PseAAC achieves accuracy values of 93.21%, 95.91% and 77.42% for the 10-fold cross-validation test, the jackknife test and the independent test, respectively. The source code of DPP-PseAAC, along with the relevant dataset and detailed experimental results, can be found at https://github.com/srautonu/DNABinding. A publicly accessible web interface has also been established at http://77.68.43.135:8080/DPP-PseAAC/.
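To make the sequence-only idea and the two cross-validation protocols more concrete, here is a minimal sketch in plain Python. It is not the authors' code. The abstract does not list the actual features, so plain amino-acid composition is used here as a generic stand-in, and the helper names `amino_acid_composition` and `k_fold_indices` as well as the toy sequence are hypothetical. The splitter also shows how the jackknife test relates to k-fold cross-validation: the jackknife is the special case in which every fold holds exactly one sample.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard residue.

    Assumption: a stand-in feature, not the feature set used by DPP-PseAAC.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k-fold cross-validation.

    With k == n_samples each test fold has one sample, which is the
    jackknife (leave-one-out) test.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    for fold in range(k):
        test = indices[fold::k]  # every k-th sample, starting at `fold`
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


# Hypothetical toy sequence, used only to exercise the function.
features = amino_acid_composition("MKVLAAGIK")
ten_fold = list(k_fold_indices(20, 10))   # 10-fold cross-validation
jackknife = list(k_fold_indices(20, 20))  # leave-one-out
```

In a full pipeline, feature vectors of this kind would be ranked, pruned and passed to a classifier, and each (train, test) pair would drive one round of training and evaluation.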
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Theoretical Biology - Volume 452, 7 September 2018, Pages 22-34