A performance comparison of modern statistical techniques for molecular descriptor selection and retention prediction in chromatographic QSRR studies

Article code	Journal code	Publication year	English article	Full-text version
10537998	962891	2005	12-page PDF	Free download


Keywords

CART; QSRR; Genetic algorithms; Bagging; Gradient boosting; Random forests; Retention prediction

Related topics

Engineering and Basic Sciences, Chemistry, Analytical Chemistry


Abstract

As datasets become larger, the problem of selecting variables for prediction becomes harder. The problem is to define which subset of variables produces the optimum predictions. The example studied aims to predict the chromatographic retention of 83 basic drugs on a Unisphere PBD column at pH 11.7 using 1272 molecular descriptors. The goal of this paper is to compare the relative performance of recently developed data mining methods, specifically classification and regression trees (CART), stochastic gradient boosting for tree-based models (Treeboost), and random forests (RF), with common statistical techniques in chemometrics: genetic algorithms on multiple linear regression (GA-MLR), uninformative variable elimination partial least squares (UVE-PLS), and SIMPLS. The comparison is based primarily on predictive performance, but also on the variables found to be most important for the predictions. The results indicate that, individually, GA-MLR (R² = 0.93) outperformed all other models. Further analysis found that a combined approach of GA-MLR and Treeboost (R² = 0.98) improved on these results.
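The abstract ranks the models by R². As a minimal sketch, assuming R² here means the usual coefficient of determination (the abstract does not say how it was computed or validated), the comparison step could look like the following. The function name `r_squared`, the model names, and all numbers are hypothetical and are not data from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical retention values and predictions from two made-up models.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predictions = {
    "model_a": [1.1, 1.9, 3.2, 3.8, 5.0],
    "model_b": [1.5, 2.5, 2.5, 4.5, 4.0],
}

# Score every model on the same observations and pick the highest R^2.
scores = {name: r_squared(observed, pred) for name, pred in predictions.items()}
best = max(scores, key=scores.get)
```

Scoring every model against the same observed values keeps the R² figures directly comparable, which is what a ranking such as the one in the abstract relies on.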

Publisher

Database: Elsevier - ScienceDirect
Journal: Chemometrics and Intelligent Laboratory Systems - Volume 76, Issue 2, 28 April 2005, Pages 185-196

Authors

Tim Hancock, Raf Put, Danny Coomans, Yvan Vander Heyden, Yvette Everingham
