Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
1220917 | 1494608 | 2016 | 7-page PDF | Free download |
• A QSRR model for predicting peptide retention times is proposed.
• 95 peptides from human HeLa cell proteomes were used for external validation.
• A GA coupled with non-linear regression methods was used for variable selection.
• The GA-SVR model was shown to be superior in predictive ability and interpretability.
Peptide retention time prediction is gaining popularity in liquid chromatography–tandem mass spectrometry (LC–MS/MS)-based proteomics. It is a promising approach for improving proteome mapping, useful in both identification and quantification workflows. In this work, a quantitative structure-retention relationship (QSRR) model was developed to predict retention time directly from molecular structure, using 185 peptides originating from 8 well-characterized proteins and two Bacillus subtilis proteomes. A Genetic Algorithm (GA) was used to select a subset of molecular descriptors, coupled with three machine learning methods for regression: Support Vector Regression (SVR), Artificial Neural Networks (ANN), and kernel Partial Least Squares (kPLS). The final GA-SVR, GA-ANN, and GA-kPLS models were validated on an external validation set of 95 peptides originating from human epithelial HeLa cell proteomes. Robustness and stability were ensured by defining their applicability domain. The descriptors of the developed models were interpreted, confirming a causal relationship between molecular structure parameters and retention time. The GA-SVR model was shown to be superior to the others in terms of both predictive ability and interpretation of the selected descriptors.
Journal: Journal of Pharmaceutical and Biomedical Analysis - Volume 127, 5 August 2016, Pages 94–100
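The abstract describes GA-driven selection of a descriptor subset that is scored by a regression model. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes synthetic toy data in place of real peptide descriptors, and it swaps in a simple k-nearest-neighbour regressor for SVR so that the example runs with the Python standard library only. All function names and parameter values (`make_toy_data`, `loo_rmse`, `ga_select`, population size, mutation rate) are hypothetical.

```python
import random
from math import sqrt


def make_toy_data(n=40, n_desc=6, seed=1):
    """Synthetic stand-in for a descriptor matrix: only columns 0 and 2 drive y."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(n_desc)] for _ in range(n)]
    y = [3.0 * row[0] - 2.0 * row[2] + rng.gauss(0, 0.05) for row in X]
    return X, y


def knn_predict(train_X, train_y, x, cols, k=3):
    """Mean response of the k nearest training rows, using only columns in cols."""
    dists = sorted(
        (sum((row[c] - x[c]) ** 2 for c in cols), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(target for _, target in nearest) / len(nearest)


def loo_rmse(X, y, mask, k=3):
    """Leave-one-out RMSE for the descriptor subset encoded by a 0/1 mask."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty subset cannot predict anything
    sq_err = 0.0
    for i in range(len(X)):
        pred = knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], cols, k)
        sq_err += (pred - y[i]) ** 2
    return sqrt(sq_err / len(X))


def ga_select(X, y, pop_size=20, generations=15, mutation_rate=0.1, seed=0):
    """Evolve 0/1 descriptor masks; lower leave-one-out RMSE means fitter."""
    rng = random.Random(seed)
    n_desc = len(X[0])

    def fitness(mask):
        return loo_rmse(X, y, mask)

    # Start from the full descriptor set plus random subsets.
    population = [[1] * n_desc] + [
        [rng.randint(0, 1) for _ in range(n_desc)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        parents = ranked[: pop_size // 2]
        children = [ranked[0][:]]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_desc)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with a small per-bit probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    best = min(population, key=fitness)
    return best, fitness(best)


X, y = make_toy_data()
best_mask, best_rmse = ga_select(X, y)
print("selected descriptors:", [j for j, bit in enumerate(best_mask) if bit])
print("leave-one-out RMSE:", round(best_rmse, 3))
```

Because the best mask is carried over unchanged each generation and the full descriptor set is in the starting population, the selected subset never scores worse than using all descriptors under this fitness function. In a workflow like the one in the abstract, the fitness would instead come from the chosen regression method (SVR, ANN, or kPLS), and the final model would be checked on an external validation set.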