Article ID: 443875
Journal: Journal of Molecular Graphics and Modelling
Published Year: 2006
Pages: 9
File Type: PDF
Abstract

We present a neural network method designed to predict biological activity from a local representation of the ligand. Each compound in the series is represented by a vector that maps four properties of each substituent: volume, log P, dipole moment and a simple ‘steric’ parameter related to its shape. This ligand representation was tested with neural networks on a set of 42 cyclic-urea derivatives that inhibit HIV-1 protease. Leave-one-out cross-validation with all descriptors as input gave a correlation factor between prediction and experiment of 0.76 for the full set and 0.88 when three outliers were left out. To rank the significance of the four descriptors, we further tested all combinations of two and three parameters for each substituent, using two disjoint test sets of five inhibitors. Vectors with extreme descriptor values were placed either in the training set (set A) or in the test set (set B). The method is a very good interpolator (set A, 95 ± 2% accuracy) but a less effective extrapolator (set B, 85 ± 2% accuracy). In general, combinations that include the ‘steric’ parameter predict better than average, while those that include the volume are less effective. The best prediction, 98.8 ± 1.2%, was obtained on set A with log P, the dipole moment and the steric parameter. At the opposite end, the lowest-ranked descriptor combination was obtained by replacing log P with the volume, which gave 92.3 ± 6.7% accuracy on set A.
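The two evaluation steps described in the abstract, leave-one-out cross-validation and the enumeration of all two- and three-descriptor combinations, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the nearest-neighbour predictor is a simple stand-in for the neural network (whose architecture the abstract does not give), and the numbers in the demo are made up rather than taken from the paper.

```python
from itertools import combinations
from statistics import mean
import math

# The four substituent descriptors named in the abstract.
DESCRIPTORS = ("volume", "logP", "dipole", "steric")


def descriptor_subsets(names=DESCRIPTORS, sizes=(2, 3)):
    """Return every combination of the given sizes, in a stable order."""
    return [combo for k in sizes for combo in combinations(names, k)]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def nearest_neighbour_predict(train_X, train_y, x):
    """Stand-in model (NOT a neural network): copy the activity of the
    training vector closest to x in Euclidean distance."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[best]


def leave_one_out(X, y, predict=nearest_neighbour_predict):
    """Predict each sample from a model that sees all the other samples."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(predict(train_X, train_y, X[i]))
    return preds


if __name__ == "__main__":
    # Made-up toy data: five one-dimensional descriptor vectors.
    X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
    y = [0.0, 1.0, 2.0, 3.0, 4.0]
    preds = leave_one_out(X, y)
    print("LOO correlation:", pearson_r(preds, y))
    for combo in descriptor_subsets():
        print(combo)
```

With four descriptors there are six pairs and four triples, so ten combinations are evaluated per test set. Any model with the same `predict(train_X, train_y, x)` signature could be passed to `leave_one_out` in place of the stand-in.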
