Free article download: A comparative investigation of modern feature selection and classification approaches for the analysis of mass spectrometry data

Article code	Journal code	Publication year	English article	Full-text version
1164636	1491003	2014	8-page PDF	Free download

English title of the ISI article

A comparative investigation of modern feature selection and classification approaches for the analysis of mass spectrometry data

Keywords

Variable selection; Bacillus; Double cross-validation; Bootstrapping; Pyrolysis mass spectrometry; Supervised learning

Related subjects

Engineering and Basic Sciences; Chemistry; Analytical Chemistry

English abstract

• LDA, PLS-DA, SVM and RF analyses were applied to MS data.
• Double cross-validation using bootstrapping was employed to assess models.
• For all classifications, all bacteria were assessed with ∼95% accuracy.
• Parsimonious modelling was used on a reduced set of mass ions and was more robust.
• The approaches developed are equally applicable to any multivariate data.

Many analytical approaches, such as mass spectrometry, generate large amounts of data (input variables) per sample analysed, and not all of these variables are important or related to the target output of interest. Selecting a smaller number of variables prior to sample classification is a widespread task in many research studies, where the aim is to find the smallest possible set of variables that can still achieve a high level of prediction accuracy; in other words, the most parsimonious solution is needed when the number of input variables is huge but the number of samples/objects is small. Here, we compare several variable selection approaches in order to ascertain which of these are best suited to this goal. All variable selection approaches were applied to a common set of metabolomics data generated by Curie-point pyrolysis mass spectrometry (Py-MS), where the goal of the study was to classify the Gram-positive bacteria Bacillus. The approaches include stepwise forward variable selection, used for linear discriminant analysis (LDA); the variable importance for projection (VIP) coefficient, employed in partial least squares-discriminant analysis (PLS-DA); support vector machines-recursive feature elimination (SVM-RFE); and the mean decrease in accuracy and mean decrease in Gini, provided by random forests (RF). Finally, a double cross-validation procedure was applied to minimize the consequence of overfitting. The results revealed that RF with its variable selection techniques, and SVM combined with SVM-RFE as a variable selection method, displayed the best results in comparison to the other approaches.
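The double cross-validation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple Fisher-style variance ratio for ranking variables and a nearest-centroid classifier in place of the LDA, PLS-DA, SVM and RF methods named above, and it runs on synthetic data rather than Py-MS spectra. All function names (`rank_variables`, `bootstrap_split`, `double_cv`, `make_data`) are hypothetical. The point it shows is structural: the number of variables is tuned in an inner bootstrap loop that only sees training samples, and accuracy is reported on outer out-of-bag samples that played no part in selection.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def rank_variables(X, y, rows):
    """Order variable indices by a between/within-class variance ratio.

    Assumption: this Fisher-style score stands in for VIP, SVM-RFE or RF importance.
    """
    classes = sorted({y[i] for i in rows})
    scores = []
    for j in range(len(X[0])):
        groups = [[X[i][j] for i in rows if y[i] == c] for c in classes]
        grand = _mean([v for g in groups for v in g])
        between = within = 0.0
        for g in groups:
            m = _mean(g)
            between += len(g) * (m - grand) ** 2
            within += sum((v - m) ** 2 for v in g)
        scores.append(between / (within + 1e-12))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def accuracy(X, y, train, test, k):
    """Select the top-k variables on the training rows only, then score the test rows."""
    cols = rank_variables(X, y, train)[:k]
    centroids = {
        c: [_mean([X[i][j] for i in train if y[i] == c]) for j in cols]
        for c in {y[i] for i in train}
    }

    def predict(i):
        # Nearest centroid in the reduced variable space (stand-in classifier).
        return min(centroids, key=lambda c: sum(
            (X[i][j] - m) ** 2 for j, m in zip(cols, centroids[c])))

    return sum(predict(i) == y[i] for i in test) / len(test)


def bootstrap_split(indices, rng):
    """Sample with replacement for training; items never drawn form the test set."""
    train = [rng.choice(indices) for _ in indices]
    drawn = set(train)
    return train, [i for i in indices if i not in drawn]


def double_cv(X, y, candidate_k, n_outer=20, n_inner=10, seed=0):
    rng = random.Random(seed)
    accs, chosen = [], []
    for _ in range(n_outer):
        train, test = bootstrap_split(list(range(len(X))), rng)
        # Inner loop: tune k using training samples only.
        inner = [bootstrap_split(train, rng) for _ in range(n_inner)]
        inner = [(tr, val) for tr, val in inner if val]

        def inner_score(k):
            return sum(accuracy(X, y, tr, val, k) for tr, val in inner)

        best_k = max(candidate_k, key=inner_score)  # ties go to the earlier candidate
        if test:
            accs.append(accuracy(X, y, train, test, best_k))
            chosen.append(best_k)
    return _mean(accs), chosen


def make_data(n=40, p=30, informative=3, seed=1):
    """Synthetic two-class data: only the first `informative` variables carry signal."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(2.0 * label if j < informative else 0.0, 1.0) for j in range(p)]
         for label in y]
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    mean_acc, ks = double_cv(X, y, candidate_k=[1, 3, 10, 30])
    print("mean outer accuracy:", round(mean_acc, 3))
    print("k chosen per outer resample:", ks)
```

Listing the candidate values of k in ascending order means that ties in the inner loop resolve to the smaller variable set, which matches the preference for parsimonious models stated in the abstract. Ranking the variables before the split, on all samples, would leak test information into the selection step and inflate the reported accuracy.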

Publisher

Database: Elsevier - ScienceDirect
Journal: Analytica Chimica Acta - Volume 829, 4 June 2014, Pages 1–8

Authors

Piotr S. Gromski, Yun Xu, Elon Correa, David I. Ellis, Michael L. Turner, Royston Goodacre
