Evaluation of a neural networks QSAR method based on ligand representation using substituent descriptors: Application to HIV-1 protease inhibitors

Article ID	Journal	Published Year	Pages	File Type
443875	Journal of Molecular Graphics and Modelling	2006	9 Pages	PDF

Abstract

We present a neural network method designed to predict biological activity from a local representation of the ligand. Each compound in the series is represented by a vector that records four properties for each substituent: volume, log P, dipole moment and a simple ‘steric’ parameter relating to its shape. This ligand representation was tested using neural networks on a set of 42 cyclic-urea derivatives that inhibit HIV-1 protease. Leave-one-out cross-validation using all descriptors in the input gave a correlation coefficient between prediction and experiment of 0.76 for the overall set and 0.88 when three outliers were left out. To rank the significance of the four descriptors, we further tested all combinations of two and three parameters for each substituent, using two disjoint testing sets of five inhibitors. Vectors with extreme descriptor values were placed either in the training set (set A) or in the testing set (set B). The method is a very good interpolator (set A, 95 ± 2% accuracy) but a less effective extrapolator (set B, 85 ± 2% accuracy). In general, combinations that include the ‘steric’ parameter predict better than average, while those containing the volume are less effective. The best prediction, 98.8 ± 1.2%, was obtained when log P, the dipole moment and the steric parameter were used on set A. At the other extreme, the lowest-ranked descriptor set was obtained by replacing log P with the volume, which gave 92.3 ± 6.7% accuracy on set A.
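The evaluation protocol described in the abstract (leave-one-out cross-validation, then a scan over all two- and three-descriptor combinations) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the 1-nearest-neighbour model is a stand-in for their neural network, and the function names, the descriptor labels and the toy numbers are all hypothetical, chosen only to make the example runnable.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

# Descriptor labels are illustrative names for the four properties in the abstract.
DESCRIPTORS = ("volume", "logP", "dipole", "steric")


def descriptor_subsets(names=DESCRIPTORS, sizes=(2, 3)):
    """Return every combination of the given sizes (6 pairs + 4 triples for 4 names)."""
    return [combo for k in sizes for combo in combinations(names, k)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_1nn(X_train, y_train):
    """Stand-in 'training': a 1-nearest-neighbour model just stores the data."""
    return list(zip(X_train, y_train))


def predict_1nn(model, x):
    """Predict the activity of the closest stored vector (squared Euclidean distance)."""
    _, nearest_y = min(
        model, key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x))
    )
    return nearest_y


def leave_one_out(X, y, fit, predict):
    """Hold out each compound once, train on the rest, and predict the held-out one."""
    predictions = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        predictions.append(predict(model, X[i]))
    return predictions


if __name__ == "__main__":
    # Made-up toy data (one descriptor per compound), NOT taken from the paper.
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [0.0, 1.0, 2.0, 3.0]
    preds = leave_one_out(X, y, fit_1nn, predict_1nn)
    print("LOO correlation:", pearson_r(preds, y))
    print("descriptor combinations tested:", len(descriptor_subsets()))
```

Passing `fit` and `predict` as arguments keeps the cross-validation loop independent of the model, so a neural network trainer could be substituted for the 1-nearest-neighbour stand-in without changing the protocol code.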

Keywords

QSAR; Molecular descriptor; Neural networks; HIV-1 protease inhibitors; Compound library