Article code | Journal code | Publication year | English article | Full-text version
1180728 | 1491540 | 2014 | 7 pages, PDF | Free download
English title of the ISI article
Prediction of protein–protein binding affinity using diverse protein–protein interface features
Title as rendered in the Persian translation
Prediction of protein-to-protein binding affinity using protein–protein interface features
Keywords
Protein–protein interaction, binding affinity prediction, random forest, feature importance evaluation
Related topics
Engineering and Basic Sciences > Chemistry > Analytical Chemistry
English abstract


• An effective method was proposed using diverse interface features.
• The method outperforms other existing methods.
• The number and diversity of protein complexes were highlighted.
• Flexible and rigid complexes were discussed separately.

Protein–protein interactions play fundamental roles in almost all biological processes. Determining protein–protein binding affinity is both an important step and a challenging task in understanding molecular mechanisms and modeling biological systems. Unlike traditional methods such as empirical scoring algorithms and molecular dynamics, which are time consuming, we developed a fast and reliable machine learning method for predicting protein–protein binding affinity. Using commonly available tools, we calculated 432 diverse protein–protein interface features that represent hydrogen bonding, van der Waals forces, hydrophobic interactions, electrostatic forces, interface shape and configuration, and allosteric effects. Because few protein complexes have both a known structure and a known affinity, we applied feature importance evaluation to avoid overfitting and remove noise from the feature set, selected 154 optimal features, and then constructed a prediction model based on random forest (RF). We demonstrate that the RF model yields promising results and that the predictive power of our method exceeds that of other existing methods. Using leave-one-out cross-validation, our model gives a correlation coefficient (r) of 0.708 on the whole benchmark dataset of 133 complexes and a high r of 0.806 on the validated set of 53 samples. When evaluated on the same two independent datasets, our method outperforms two other methods, achieving r values of 0.793 and 0.907, respectively. All results indicate that our method can be a useful tool for determining protein–protein binding affinity.
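
The evaluation protocol named in the abstract (leave-one-out cross-validation scored by a correlation coefficient r) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: r is assumed here to be the Pearson coefficient, the data are synthetic placeholders rather than interface features and measured affinities, and a simple k-nearest-neighbour regressor stands in for the random forest so that the example needs nothing beyond the Python standard library. All function names are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def knn_predict(train_X, train_y, query, k=3):
    """Stand-in regressor (NOT a random forest): mean target of the k nearest rows."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def leave_one_out_predictions(X, y, predict=knn_predict):
    """Predict every sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(X)):
        # Hold out sample i; train on the remaining n - 1 samples.
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(predict(train_X, train_y, X[i]))
    return preds


def make_toy_data(n_samples=40, n_features=5, seed=0):
    """Synthetic placeholder data (assumption): target depends on two features plus noise."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [2.0 * row[0] - row[1] + rng.gauss(0, 0.1) for row in X]
    return X, y


X, y = make_toy_data()
loo_preds = leave_one_out_predictions(X, y)
r = pearson_r(y, loo_preds)
print(f"LOOCV Pearson r on toy data: {r:.3f}")
```

Any regressor with the same `predict(train_X, train_y, query)` shape can be passed to `leave_one_out_predictions`. If feature selection is part of the model, repeating it inside each fold keeps the held-out sample from influencing which features are kept; whether the paper did so is not stated in the abstract.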

Publisher
Database: Elsevier - ScienceDirect
Journal: Chemometrics and Intelligent Laboratory Systems - Volume 138, 15 November 2014, Pages 7–13