Article code: 1993382
Journal code: 1064658
Publication year: 2013
English article: 13 pages, PDF
Full-text version: free download
English title of the ISI article
A structured approach to predictive modeling of a two-class problem using multidimensional data sets
Related topics
Life Sciences and Biotechnology > Biochemistry, Genetics and Molecular Biology > Biochemistry
English abstract

Biological experiments in the post-genome era can generate a staggering amount of complex data that challenges experimentalists to extract meaningful information. Increasingly, the success of an appropriately controlled experiment relies on a robust data analysis pipeline. In this paper, we present a structured approach to the analysis of multidimensional data that relies on a close, two-way communication between the bioinformatician and experimentalist. A sequential approach employing data exploration (visualization, graphical and analytical study), pre-processing, feature reduction and supervised classification using machine learning is presented. This standardized approach is illustrated by an example from a proteomic data analysis that has been used to predict the risk of infectious disease outcome. Strategies for model selection and post hoc model diagnostics are presented and applied to the case illustration. We discuss some of the practical lessons we have learned applying supervised classification to multidimensional data sets, one of which is the importance of feature reduction in achieving optimal modeling performance.
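
The sequence named in the abstract (pre-processing, feature reduction, supervised classification) can be sketched on synthetic two-class data. This is a minimal illustration only. The abstract does not say which methods the authors used, so everything here is an assumption: the standardization step, the class-mean-difference feature score, the nearest-centroid classifier, the synthetic data generator, and all function names.

```python
import random
import statistics


def fit_scaler(rows):
    """Learn per-feature mean and standard deviation from the training rows."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant features
    return means, sds


def scale(rows, means, sds):
    """Apply a previously fitted scaler to any set of rows."""
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def rank_features(rows, labels):
    """Rank features by class-mean difference relative to within-class spread."""
    scores = []
    for j, col in enumerate(zip(*rows)):
        a = [v for v, y in zip(col, labels) if y == 0]
        b = [v for v, y in zip(col, labels) if y == 1]
        spread = (statistics.pvariance(a) + statistics.pvariance(b)) ** 0.5 or 1.0
        scores.append((abs(statistics.fmean(a) - statistics.fmean(b)) / spread, j))
    return [j for _, j in sorted(scores, reverse=True)]


def fit_centroids(rows, labels, keep):
    """Nearest-centroid model restricted to the kept feature indices."""
    cents = {}
    for cls in (0, 1):
        members = [[r[j] for j in keep] for r, y in zip(rows, labels) if y == cls]
        cents[cls] = [statistics.fmean(c) for c in zip(*members)]
    return cents


def predict(row, cents, keep):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    x = [row[j] for j in keep]
    dist = {c: sum((a - b) ** 2 for a, b in zip(x, mu)) for c, mu in cents.items()}
    return min(dist, key=dist.get)


def make_data(n, n_features, informative, rng):
    """Hypothetical data: only the first `informative` features differ by class."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        rows.append([rng.gauss(2.0 * y if j < informative else 0.0, 1.0)
                     for j in range(n_features)])
        labels.append(y)
    return rows, labels


rng = random.Random(0)
train_x, train_y = make_data(80, 50, 3, rng)
test_x, test_y = make_data(40, 50, 3, rng)

means, sds = fit_scaler(train_x)                # pre-processing, fitted on training data only
train_s, test_s = scale(train_x, means, sds), scale(test_x, means, sds)
keep = rank_features(train_s, train_y)[:3]      # feature reduction: keep the 3 top-ranked features
cents = fit_centroids(train_s, train_y, keep)   # supervised classification
accuracy = sum(predict(r, cents, keep) == y for r, y in zip(test_s, test_y)) / len(test_y)
print(f"kept features: {sorted(keep)}, held-out accuracy: {accuracy:.2f}")
```

The scaler and the feature ranking are fitted on the training rows only and then applied to the held-out rows, so the reported accuracy is not inflated by information from the test set.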

Publisher
Database: Elsevier - ScienceDirect
Journal: Methods - Volume 61, Issue 1, 15 May 2013, Pages 73–85