Article ID Journal Published Year Pages File Type
1242693 Talanta 2010 8 Pages PDF
Abstract

In quantitative structure-retention relationships (QSRR), retention is modeled as a function of structural or molecular descriptors. Because descriptor data sets can be very large, a selection of informative variables is often required. Beyond the question of which subset of variables (descriptors) produces optimum predictions, a further question must be answered: can a good prediction be used in the QSRR community even if the physical meaning of the applied descriptors is hard to interpret?

This paper focuses on the different modeling methodologies and molecular descriptors used in QSRR approaches. Besides the widely used multiple linear regression (MLR), these methodologies include partial least squares (PLS), uninformative variable elimination partial least squares (UVE-PLS), and genetic algorithms (GA) applied prior to MLR or PLS. The comparison addresses predictive performance as well as the descriptors found to be most important for predicting the chromatographic retention of peptides. The results show that stepwise MLR and UVE-PLS produce better predictions than the other methodologies studied. The variables selected by the various methodologies indicate that the important information about the RPLC retention mechanism is carried by 2D and 3D descriptors and by descriptors from the empirical QSRR equations, which convey information about hydrogen-bonding properties, molecular size, and complexity. Overall, for the data set considered, the empirical QSRR models predicted peptide retention best.
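To make the idea of descriptor selection before MLR concrete, the following is a minimal sketch of a forward-only selection loop around ordinary least squares. It is not the procedure used in the paper: the abstract does not give the entry or removal criteria of its stepwise MLR, so the relative RSS threshold `min_gain`, the cap `max_vars`, the helper names, and the synthetic "descriptor" data are all assumptions chosen for illustration.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_mlr(X, y, cols):
    """Least-squares fit on the chosen descriptor columns plus an intercept.

    Returns (coefficients, residual sum of squares)."""
    rows = [[1.0] + [x[c] for c in cols] for x in X]
    p = len(cols) + 1
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss

def forward_stepwise(X, y, max_vars=3, min_gain=0.25):
    """Greedy forward selection (hypothetical criterion, not the paper's).

    At each step, add the descriptor that lowers the RSS most; stop when the
    relative RSS reduction falls below min_gain or max_vars is reached."""
    selected = []
    _, best_rss = fit_mlr(X, y, selected)
    while len(selected) < max_vars:
        candidates = [c for c in range(len(X[0])) if c not in selected]
        if not candidates:
            break
        rss_by_col = {c: fit_mlr(X, y, selected + [c])[1] for c in candidates}
        c_best = min(rss_by_col, key=rss_by_col.get)
        if best_rss - rss_by_col[c_best] < min_gain * best_rss:
            break
        selected.append(c_best)
        best_rss = rss_by_col[c_best]
    return selected

# Synthetic illustration only: 40 "compounds", 6 descriptors, with the
# "retention" depending on descriptors 1 and 4 plus a little noise.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(6)] for _ in range(40)]
y = [2.0 * x[1] - 1.5 * x[4] + random.gauss(0, 0.1) for x in X]
selected = forward_stepwise(X, y)
print("selected descriptor indices:", selected)
```

A full stepwise procedure would normally also test whether already selected descriptors can be removed again, and would judge the final model on an external test set rather than on the fitting RSS.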
