Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
6469119 | 1423738 | 2017 | 9-page PDF | Free download |
- Development of QSPR model to capture influence of reactant structure and solvent effect on reaction rate.
- Connectivity indices coupled with multivariate statistical methods used in model development.
- Hybrid Genetic Algorithm (GA) and Decision Tree (DT) method developed.
In recent years, Computer-Aided Molecular Design (CAMD) has been used extensively to define and design reactions that perform at their maximal potential. In all of these contributions, however, either the structures of the reactants and products or the structure of the solvent were held fixed. A Quantitative Structure-Property Relationship (QSPR) model that captures the influence of both the reactant structures and the solvent on the reaction rate is therefore essential. Since the structures of reactants and products are related, such QSPR models will serve as a prerequisite for the simultaneous CAMD of reactants, products and solvents. They will also provide a useful tool for predicting the rate constant without relying on experiments. To develop such a QSPR model, we investigated the Diels-Alder reaction with different sets of reactants and solvents. Connectivity indices were used to represent the structures of the members of each set. Principal Component Analysis (PCA) was applied to identify the principal components (PCs) corresponding to the structures of the reactants and the solvent of each set. Linear models expressed in terms of the PCs were then generated using a Decision Tree (DT) algorithm such that the R² value was maximized. These models formed the initial population on which a Genetic Algorithm (GA) performed operations such as crossover and mutation to obtain the model(s) with the best rate constant prediction. The novelty of our approach is thus that, after feature extraction using PCA, a DT algorithm generates an ensemble of linear models, which the GA then transforms into a model with the best fit. Our approach required far fewer generations to reach a model with the highest R²ext value than the case in which the DT did not initialize the population of models.
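The abstract does not say which connectivity indices were used, so the following is only a minimal sketch of the idea, assuming the simple zero-order and first-order (Randić-type) indices computed on a hydrogen-suppressed molecular graph. The function names `degrees`, `chi0` and `chi1`, and the carbon-skeleton inputs for a diene and a dienophile, are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from math import sqrt


def degrees(n_atoms, bonds):
    """Count the heavy-atom neighbours of each atom in a hydrogen-suppressed graph."""
    deg = [0] * n_atoms
    for i, j in bonds:
        deg[i] += 1
        deg[j] += 1
    return deg


def chi0(n_atoms, bonds):
    """Zero-order connectivity index: sum over atoms of 1 / sqrt(degree)."""
    return sum(1.0 / sqrt(d) for d in degrees(n_atoms, bonds) if d > 0)


def chi1(n_atoms, bonds):
    """First-order connectivity index: sum over bonds of 1 / sqrt(d_i * d_j)."""
    deg = degrees(n_atoms, bonds)
    return sum(1.0 / sqrt(deg[i] * deg[j]) for i, j in bonds)


# Hypothetical inputs: carbon skeletons given as (atom count, list of bonds).
# Bond order is ignored by these simple indices (an assumption of this sketch).
butadiene = (4, [(0, 1), (1, 2), (2, 3)])  # example diene
ethylene = (2, [(0, 1)])                   # example dienophile

# One descriptor vector per molecule; vectors like these would be the
# inputs to a later feature-extraction step such as PCA.
descriptors = {
    name: (chi0(*graph), chi1(*graph))
    for name, graph in {"butadiene": butadiene, "ethylene": ethylene}.items()
}

for name, (c0, c1) in descriptors.items():
    print(f"{name}: chi0={c0:.4f}, chi1={c1:.4f}")
```

Because these simple indices depend only on the heavy-atom skeleton, they cannot distinguish molecules that differ only in bond order or heteroatoms; valence-based variants address this and are not shown here.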
Journal: Computers & Chemical Engineering - Volume 106, 2 November 2017, Pages 690-698