Article ID: 1394798
Journal: European Journal of Medicinal Chemistry
Published Year: 2010
Pages: 8
File Type: PDF
Abstract

Two machine learning methods, artificial neural networks (ANN) and support vector machines (SVM), were used to model the intrinsic solubility of 74 generic drugs. The resulting models were compared with those obtained by multiple linear regression (MLR) analysis. Cluster analysis was used to split the data into a training set and a test set. The appropriate descriptors were selected using a wrapper approach with multiple linear regression as the target learning algorithm. Descriptor selection and model building were performed with 10-fold cross-validation on the training set. The linear model fits the training set (n = 60) with R² = 0.814, while the ANN and SVM models give higher values of R² = 0.823 and 0.835, respectively. Although the SVM model fits the training set best, the ANN model was slightly superior to SVM and MLR in predicting the test set. The quantitative structure–property relationship study suggests that the theoretically calculated descriptors log P, first-order valence connectivity index (¹χᵛ), delta chi (Δ²χ) and information content (²IC) have relevant relationships with the intrinsic solubility of the generic drugs studied.
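The wrapper approach mentioned in the abstract can be illustrated with a minimal sketch: candidate descriptor subsets are scored by the cross-validated R² of an MLR model, and the subset is grown greedily. Everything specific here is an assumption made for illustration, not the authors' procedure: the greedy forward search, the `min_gain` stopping threshold, the index-modulo fold assignment, and the synthetic descriptor matrix (60 samples, 6 columns, of which only columns 0 and 2 carry signal). The function names are hypothetical.

```python
import random

def fit_mlr(X, y):
    """Ordinary least squares with an intercept, solved from the normal
    equations by Gauss-Jordan elimination with partial pivoting."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(M[k][c]))
        M[c], M[piv] = M[piv], M[c]
        for k in range(p):
            if k != c:
                f = M[k][c] / M[c][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]

def r_squared(y, y_hat):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def cv_r2(X, y, cols, k=10):
    """Cross-validated R2 of an MLR model restricted to the columns `cols`.
    Folds are assigned by sample index modulo k (an assumption)."""
    n = len(y)
    preds = [0.0] * n
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        coef = fit_mlr([[X[i][c] for c in cols] for i in train],
                       [y[i] for i in train])
        held_out = predict(coef, [[X[i][c] for c in cols] for i in test])
        for i, value in zip(test, held_out):
            preds[i] = value
    return r_squared(y, preds)

def forward_select(X, y, k=10, min_gain=0.01):
    """Greedy wrapper: repeatedly add the descriptor that most improves
    the cross-validated R2; stop when the gain falls below `min_gain`."""
    selected, best = [], float("-inf")
    remaining = list(range(len(X[0])))
    while remaining:
        score, col = max((cv_r2(X, y, selected + [c], k), c) for c in remaining)
        if score - best < min_gain:
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best

# Synthetic stand-in data (NOT the paper's data set)
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
y = [-1.0 * r[0] + 0.5 * r[2] + rng.gauss(0, 0.3) for r in X]

selected, q2 = forward_select(X, y)
print("selected descriptor columns:", selected)
print("cross-validated R2:", round(q2, 3))
```

Scoring subsets by cross-validated rather than fitted R² is what makes this a wrapper in the usual sense: a descriptor is kept only if it improves prediction on held-out folds, which guards against adding columns that merely fit noise in the training set.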

Related Topics
Physical Sciences and Engineering > Chemistry > Organic Chemistry