Statistical data processing in clinical proteomics

Article code	Journal code	Publication year	English article	Full-text version
1215135	1494172	2008	12-page PDF	Free download

Keywords

Permutation test; Statistical validation; Feature selection; Multivariate data analysis; Double cross-validation; Classification; Curse of dimensionality; Proteomics; Biomarker discovery

Related topics

Engineering and Basic Sciences, Chemistry, Analytical Chemistry

Abstract

This review discusses data analysis strategies for the discovery of biomarkers in clinical proteomics. Proteomics studies produce large amounts of data, characterized by few samples on which many variables are measured. A wealth of classification methods exists for extracting information from the data. Feature selection plays an important role in reducing the dimensionality of the data prior to classification and in discovering biomarker leads. The question of which classification strategy works best is as yet unanswered. Validation is a crucial step for biomarker leads towards clinical use. Here we only discuss statistical validation, recognizing that biological and clinical validation is of utmost importance. First, there is the need for validated model selection to develop a generalized classifier that predicts new samples correctly. A cross-validation loop that is wrapped around the model development procedure assesses the performance using unseen data. The significance of the model should be tested; we use permutations of the data for comparison with uninformative data. This procedure also tests the correctness of the performance validation. Preferably, a new set of samples is measured to test the classifier and rule out results specific to a machine, analyst, laboratory or the first set of samples. This is not yet standard practice. We present a modular framework that combines feature selection, classification, biomarker discovery and statistical validation; these data analysis aspects are all discussed in this review. The feature selection, classification and biomarker discovery modules can be incorporated or omitted to the preference of the researcher. The validation modules, however, should not be optional. In each module, the researcher can select from a wide range of methods, since there is not one unique way that leads to the correct model and proper validation. We discuss many possibilities for feature selection, classification and biomarker discovery. For validation we advise a combination of cross-validation and permutation testing, a validation strategy supported in the literature.
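
The validation strategy named in the abstract (a cross-validation loop wrapped around model development, followed by a permutation test) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data (40 samples, 30 variables, 3 of them informative), the mean-difference feature ranking, the nearest-centroid classifier, the candidate feature counts, the fold counts and the 50 permutations. The helper names (`rank_features`, `fit_predict`, `double_cv_accuracy`, and so on) are hypothetical.

```python
import random

def rank_features(X, y):
    """Rank features by absolute difference between class means (largest first)."""
    def class_mean(j, c):
        vals = [row[j] for row, label in zip(X, y) if label == c]
        return sum(vals) / len(vals)
    scores = [abs(class_mean(j, 1) - class_mean(j, 0)) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def fit_predict(X_tr, y_tr, X_te, feats):
    """Nearest-centroid classifier restricted to the selected features."""
    cents = {}
    for c in (0, 1):
        rows = [r for r, label in zip(X_tr, y_tr) if label == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    def nearest(row):
        return min((0, 1), key=lambda c: sum((row[j] - m) ** 2
                                             for j, m in zip(feats, cents[c])))
    return [nearest(r) for r in X_te]

def folds(n, k):
    """Yield k interleaved (train, test) index splits of range(n)."""
    for f in range(k):
        test = list(range(f, n, k))
        held = set(test)
        yield [i for i in range(n) if i not in held], test

def take(seq, idx):
    return [seq[i] for i in idx]

def cv_errors(X, y, n_feat, k):
    """Misclassifications in k-fold CV; feature ranking is redone in every fold."""
    wrong = 0
    for tr, te in folds(len(y), k):
        X_tr, y_tr = take(X, tr), take(y, tr)
        feats = rank_features(X_tr, y_tr)[:n_feat]
        pred = fit_predict(X_tr, y_tr, take(X, te), feats)
        wrong += sum(p != t for p, t in zip(pred, take(y, te)))
    return wrong

def double_cv_accuracy(X, y, candidates=(2, 5, 10), outer=5, inner=4):
    """Outer loop estimates performance; inner loop selects the model."""
    correct = 0
    for tr, te in folds(len(y), outer):
        X_tr, y_tr = take(X, tr), take(y, tr)
        # Inner loop: choose the number of features using training samples only.
        best = min(candidates, key=lambda n: cv_errors(X_tr, y_tr, n, inner))
        feats = rank_features(X_tr, y_tr)[:best]
        pred = fit_predict(X_tr, y_tr, take(X, te), feats)
        correct += sum(p == t for p, t in zip(pred, take(y, te)))
    return correct / len(y)

# Synthetic data (assumption): only the first 3 of 30 variables differ by class.
rng = random.Random(0)
n_samples, n_features = 40, 30
y = [i % 2 for i in range(n_samples)]
X = [[rng.gauss(2.0 * label if j < 3 else 0.0, 1.0) for j in range(n_features)]
     for label in y]

observed = double_cv_accuracy(X, y)

# Permutation test: repeat the whole procedure on shuffled class labels.
n_perm = 50
perm_scores = []
for _ in range(n_perm):
    y_perm = y[:]
    rng.shuffle(y_perm)
    perm_scores.append(double_cv_accuracy(X, y_perm))

p_value = (1 + sum(s >= observed for s in perm_scores)) / (n_perm + 1)
print(f"double-CV accuracy: {observed:.2f}, permutation p-value: {p_value:.3f}")
```

In this sketch the feature ranking and the choice of feature count are repeated inside every training fold, so the held-out samples never influence model development. Rerunning the complete procedure on permuted labels gives a reference distribution of accuracies for uninformative data; if those accuracies were well above chance, that would point to a leak in the validation itself.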

Publisher

Database: Elsevier - ScienceDirect
Journal: Journal of Chromatography B - Volume 866, Issues 1–2, 15 April 2008, Pages 77–88

Authors

Suzanne Smit, Huub C.J. Hoefsloot, Age K. Smilde
