Article ID: 1166496
Journal: Analytica Chimica Acta
Published Year: 2011
Pages: 8
File Type: PDF
Abstract

High-throughput metabolomics experiments produce large and increasingly complex datasets, which pose considerable challenges for existing statistical modeling approaches. There is therefore a need for statistically efficient methods for mining the metabolite information contained in such data. In this work, we developed a novel kernel Fisher discriminant analysis (KFDA) algorithm by constructing an informative kernel based on a decision tree ensemble. The constructed kernel effectively encodes the similarities between metabolomics samples with respect to informative metabolites/biomarkers in specific parts of the measurement space. At the same time, informative metabolites or potential biomarkers can be discovered through variable importance ranking during kernel construction. Moreover, with such a kernel, KFDA can to some extent handle nonlinear relationships in the metabolomics data. Finally, two real metabolomics datasets and one simulated dataset were used to demonstrate the performance of the proposed approach in comparison with other approaches.
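The idea of combining a tree-ensemble kernel with KFDA can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the kernel is the common "proximity" similarity (the fraction of trees in which two samples fall into the same leaf), that the problem has two classes, and that the leaf indices have already been obtained from some fitted tree ensemble (for example via scikit-learn's `RandomForestClassifier.apply`). The function names, the ridge parameter `reg`, and the toy leaf assignments are all hypothetical, and the variable importance ranking step is not covered.

```python
import numpy as np


def proximity_kernel(leaves_a, leaves_b):
    """K[i, j] = fraction of trees in which sample i of A and sample j of B share a leaf.

    Each input has shape (n_samples, n_trees) and holds leaf indices.
    """
    leaves_a = np.asarray(leaves_a)
    leaves_b = np.asarray(leaves_b)
    # Compare leaf ids tree by tree, then average over the tree axis.
    return (leaves_a[:, None, :] == leaves_b[None, :, :]).mean(axis=2)


def kfda_fit(K, y, reg=1e-3):
    """Two-class kernel Fisher discriminant on a precomputed kernel matrix K."""
    y = np.asarray(y)
    n = K.shape[0]
    means = []
    N = np.zeros((n, n))
    for c in (0, 1):
        Kc = K[:, y == c]                      # kernel columns of class c
        n_c = Kc.shape[1]
        means.append(Kc.mean(axis=1))          # class mean in kernel space
        centering = np.eye(n_c) - np.full((n_c, n_c), 1.0 / n_c)
        N += Kc @ centering @ Kc.T             # within-class scatter
    # The ridge term (an assumption of this sketch) keeps the system solvable.
    alpha = np.linalg.solve(N + reg * np.eye(n), means[1] - means[0])
    # Decision threshold halfway between the projected class means.
    threshold = 0.5 * (alpha @ means[0] + alpha @ means[1])
    return alpha, threshold


def kfda_predict(K_new, alpha, threshold):
    """K_new has shape (n_new, n_train); returns 0/1 class labels."""
    return (K_new @ alpha > threshold).astype(int)


# Toy leaf assignments (hypothetical): 6 training samples, 3 trees.
leaves_train = [[0, 0, 1], [0, 1, 1], [0, 0, 0],
                [2, 3, 2], [2, 3, 3], [3, 3, 2]]
y_train = [0, 0, 0, 1, 1, 1]
leaves_new = [[0, 0, 1], [2, 3, 3]]

K_train = proximity_kernel(leaves_train, leaves_train)
alpha, threshold = kfda_fit(K_train, y_train)
K_new = proximity_kernel(leaves_new, leaves_train)
pred_new = kfda_predict(K_new, alpha, threshold)
print(pred_new)
```

Because the kernel is built from a supervised ensemble, two samples are similar only when the trees route them through the same splits, which is one way to read the claim that the kernel encodes similarity in specific parts of the measurement space.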

Highlights
► We developed a novel KFDA algorithm by constructing an informative kernel based on a decision tree ensemble.
► The kernel can encode the similarities between metabolomics samples in the biomarker space.
► With such a kernel, KFDA can also deal with nonlinear relationships in the metabolomics data.
► Two real datasets and one simulated dataset demonstrated the performance of KFDA.
