Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
1394798 | European Journal of Medicinal Chemistry | 2010 | 8 Pages |
Two machine learning techniques, artificial neural networks (ANN) and support vector machines (SVM), were used to model the intrinsic solubility of 74 generic drugs. The resulting models were compared with those obtained by multiple linear regression (MLR) analysis. Cluster analysis was used to split the data into a training set and a test set. The descriptors were selected using a wrapper approach with multiple linear regression as the target learning algorithm. Descriptor selection and model building were performed with 10-fold cross-validation on the training set. The linear model fits the training set (n = 60) with R² = 0.814, while the ANN and SVM models give higher values of R² = 0.823 and 0.835, respectively. Although the SVM model fits the training set best, the ANN model was slightly superior to SVM and MLR in predicting the test set. The quantitative structure–property relationship study suggests that the theoretically calculated descriptors log P, first-order valence connectivity index (¹χᵛ), delta chi (Δ²χ) and information content (²IC) have relevant relationships with the intrinsic solubility of the generic drugs studied.
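The abstract does not say how the wrapper search was carried out, so the following is only a minimal sketch of the general idea: descriptor subsets are scored by the 10-fold cross-validated R² of an MLR model, and descriptors are added greedily while that score improves. The greedy forward search, the function names (`fit_mlr`, `cv_r2`, `forward_select`) and the synthetic data are all assumptions made for illustration; none of them come from the paper.

```python
import random


def fit_mlr(X, y):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        if abs(M[c][c]) < 1e-12:
            raise ValueError("descriptors are collinear")
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def cv_r2(X, y, cols, k=10, seed=0):
    """k-fold cross-validated R^2 of an MLR model restricted to columns `cols`."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    press = 0.0  # summed squared error on held-out samples
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        coef = fit_mlr([[X[i][c] for c in cols] for i in train],
                       [y[i] for i in train])
        preds = predict(coef, [[X[i][c] for c in cols] for i in fold])
        press += sum((y[i] - p) ** 2 for i, p in zip(fold, preds))
    mean_y = sum(y) / len(y)
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - press / ss_tot


def forward_select(X, y, max_desc=4, k=10):
    """Greedy wrapper: add the descriptor that most improves the CV score."""
    selected, best = [], float("-inf")
    while len(selected) < max_desc:
        candidates = [c for c in range(len(X[0])) if c not in selected]
        if not candidates:
            break
        score, col = max((cv_r2(X, y, selected + [c], k), c) for c in candidates)
        if score <= best:
            break  # no candidate improves the cross-validated fit
        selected.append(col)
        best = score
    return selected, best


# Synthetic stand-in for a training set: 60 samples, 6 candidate descriptors,
# of which only columns 0 and 2 actually drive the response (hypothetical data).
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
y = [-1.0 * r[0] + 0.5 * r[2] + rng.gauss(0, 0.1) for r in X]

selected, q2 = forward_select(X, y)
print("selected descriptor columns:", selected)
print("cross-validated R^2:", round(q2, 3))
```

Scoring each subset on held-out folds rather than on the fitted data is what makes this a wrapper around the learning algorithm; a final check on a separate test set, as described in the abstract, would still be needed to judge predictive ability.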