Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
8376814 | 1543161 | 2018 | 7-page PDF | Free download |
English title of the ISI article
Human blood gene signature as a marker for smoking exposure: Computational approaches of the top ranked teams in the sbv IMPROVER Systems Toxicology challenge
Keywords
SbV; Gene signature; Linear discriminant analysis (LDA); Least absolute shrinkage and selection operator (LASSO); Kyoto Encyclopedia of Genes and Genomes (KEGG); Systems toxicology; Predictive modeling
Related topics
Engineering and basic sciences
Mathematics
Computational mathematics
Abstract
Crowdsourcing has emerged as a framework to address methodological challenges in omics data analysis and assess the extent to which omics data are predictive of phenotypes of interest. The sbv IMPROVER Systems Toxicology challenge was designed to leverage crowdsourcing to determine whether human blood gene expression levels are informative of current and past smoking. Participating teams were invited to use a training gene expression dataset to derive parsimonious models (up to 40 genes) that can accurately classify subjects into exposure groups: smokers, former smokers who had quit for at least one year, and never-smokers. Teams were ranked based on two classification performance metrics evaluated on a blinded test dataset. The analytical approaches of the first- and third-ranked teams, which are presented in detail in this article, involved feature selection by moderated t-test or LASSO regression and linear discriminant analysis (LDA) and logistic regression classifiers, respectively. While the 12-gene signature of the top team allowed the classification of current smokers with 100% sensitivity at 93% specificity, discriminating former smokers from never-smokers was much more challenging (65% sensitivity at 57% specificity). Gene ontology molecular functions and KEGG pathways associated with current smoking included G protein-coupled receptor activity, signaling receptor activity, calcium ion binding, and the Neuroactive ligand-receptor interaction pathway. Selecting marker genes by either moderated t-test or multivariate LASSO regression, followed by LDA or logistic regression, is a robust approach to classification with omics data, confirming in part the findings of previous sbv IMPROVER challenges. While current smoking is accurately identified based on blood mRNA levels, smoking cessation for more than one year is accompanied by a “normalization” of the expression of certain mRNAs, making it difficult to distinguish former smokers from never-smokers.
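The general workflow the abstract describes (rank genes, keep a small signature, fit a linear classifier, report sensitivity and specificity on held-out samples) can be illustrated with a minimal sketch. This is not the teams' code. Everything here is an assumption made for illustration: the data are synthetic, the problem is reduced to two classes, a plain Welch t-statistic stands in for the moderated t-test, and a diagonal LDA (per-gene pooled variance, equal priors) stands in for full LDA. All function names are hypothetical.

```python
import random
import statistics


def make_data(n_per_class, n_genes, n_informative, shift, rng):
    """Synthetic expression matrix: the first n_informative genes are shifted in class 1."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for j in range(n_informative):
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


def welch_t(a, b):
    """Welch t-statistic (assumed stand-in for the moderated t-test)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)


def select_genes(X, y, k):
    """Indices of the k genes with the largest |t| between class 1 and class 0."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((abs(welch_t(pos, neg)), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def fit_diagonal_lda(X, y, genes):
    """Diagonal LDA: class centroids plus a pooled within-class variance per gene."""
    model = {"genes": genes, "mean": {0: [], 1: []}, "var": []}
    for j in genes:
        pooled_ss, dof = 0.0, 0
        for c in (0, 1):
            vals = [row[j] for row, label in zip(X, y) if label == c]
            m = statistics.mean(vals)
            model["mean"][c].append(m)
            pooled_ss += sum((v - m) ** 2 for v in vals)
            dof += len(vals) - 1
        model["var"].append(pooled_ss / dof)
    return model


def predict(model, row):
    """Assign the class whose centroid is nearest in variance-scaled distance."""
    def dist(c):
        return sum((row[j] - m) ** 2 / v
                   for j, m, v in zip(model["genes"], model["mean"][c], model["var"]))
    return 1 if dist(1) < dist(0) else 0


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


# Synthetic example: 50 genes, of which the first 3 carry signal (illustrative values only).
rng = random.Random(0)
X_train, y_train = make_data(20, 50, 3, 4.0, rng)
X_test, y_test = make_data(10, 50, 3, 4.0, rng)

genes = select_genes(X_train, y_train, k=3)        # feature selection on training data only
model = fit_diagonal_lda(X_train, y_train, genes)  # classifier fit on the selected genes
y_pred = [predict(model, row) for row in X_test]   # predictions on held-out samples
sens, spec = sensitivity_specificity(y_test, y_pred)
print("selected genes:", genes)
print("sensitivity:", sens, "specificity:", spec)
```

Gene selection is done on the training samples only, so the held-out metrics are not inflated by information leaking from the test set. The challenge itself involved three exposure groups and a cap of 40 genes, and the third-ranked team's LASSO plus logistic regression route is not shown here.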
Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Toxicology - Volume 5, February 2018, Pages 31-37
Authors
Adi L. Tarca, Xiaofeng Gong, Roberto Romero, Wenxin Yang, Zhongqu Duan, Hao Yang, Chengfang Zhang, Peixuan Wang,