Feature selection and machine learning with mass spectrometry data for distinguishing cancer and non-cancer samples

Article code	Journal code	Publication year	English article
1151213	958201	2006	14 pages (PDF)

Keywords

Clustering; Microarray; Classification; Mass spectrometry; Machine learning

Related topics

Engineering and Basic Sciences; Mathematics; Statistics and Probability

Abstract

This is a comparative study of various clustering and classification algorithms as applied to differentiate cancer and non-cancer protein samples using mass spectrometry data. Our study demonstrates the usefulness of a feature selection step prior to applying a machine learning tool. A natural and common choice of a feature selection tool is the collection of marginal p-values obtained from t-tests for testing the intensity differences at each m/z ratio in the cancer versus non-cancer samples. We study the effect of selecting a cutoff in terms of the overall Type 1 error rate control on the performance of the clustering and classification algorithms using the significant features. For the classification problem, we also considered m/z selection using the importance measures computed by the Random Forest algorithm of Breiman. Using a data set of proteomic analysis of serum from ovarian cancer patients and serum from cancer-free individuals in the Food and Drug Administration and National Cancer Institute Clinical Proteomics Database, we undertake a comparative study of the net effect of the machine learning algorithm–feature selection tool–cutoff criteria combination on the performance as measured by an appropriate error rate measure.
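
The feature selection step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the names (`welch_t`, `marginal_p_values`, `select_features`, `informative`) are hypothetical, a Bonferroni cutoff stands in for "overall Type 1 error rate control" (the abstract does not say which procedure was used), and the p-values use a normal approximation to the t distribution so that the example needs only the Python standard library.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic for two lists of intensities."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def marginal_p_values(cancer, control):
    """Two-sided p-value for each m/z column.

    Assumption: the tail probability is taken from the standard normal
    distribution, a large-sample stand-in for the exact t distribution.
    """
    n_features = len(cancer[0])
    p_values = []
    for j in range(n_features):
        t = welch_t([row[j] for row in cancer], [row[j] for row in control])
        p_values.append(2.0 * (1.0 - NormalDist().cdf(abs(t))))
    return p_values


def select_features(p_values, alpha=0.05):
    """Bonferroni cutoff: keep feature j if p_j <= alpha / (number of tests)."""
    cutoff = alpha / len(p_values)
    return [j for j, p in enumerate(p_values) if p <= cutoff]


# Synthetic spectra (hypothetical): 40 samples per group, 50 m/z positions,
# three of which are shifted upward in the "cancer" group.
rng = random.Random(0)
n_per_group, n_features = 40, 50
informative = [3, 17, 42]


def simulate(shift):
    """Draw one group of samples; only informative columns get the shift."""
    return [
        [rng.gauss(shift if j in informative else 0.0, 1.0)
         for j in range(n_features)]
        for _ in range(n_per_group)
    ]


cancer = simulate(3.0)
control = simulate(0.0)
p_values = marginal_p_values(cancer, control)
selected = select_features(p_values)
print("selected m/z indices:", selected)
```

The columns in `selected` would then be the only inputs passed to a clustering or classification algorithm. Tightening or loosening `alpha` changes how many m/z positions survive, which is the cutoff effect the study compares.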

Publisher

Database: Elsevier - ScienceDirect
Journal: Statistical Methodology - Volume 3, Issue 1, January 2006, Pages 79–92

Authors

Susmita Datta, Lara M. DePadilla
