Fused variable screening for massive imbalanced data

Article code	Journal code	Publication year	English article	Full text
13430337	1842417	2020	15-page PDF	Free download

Keywords

High dimension; Imbalanced data; Rank correlation

Related subjects

Engineering and Basic Sciences; Computer Engineering; Computational Theory and Mathematics

Abstract

Imbalanced data, in which the distribution across classes or categories is unequal or highly skewed, are pervasive in many scientific fields, with applications ranging from bioinformatics and text classification to face recognition and fraud detection. In modern science such data are often massive and high-dimensional; gene expression data for diagnosing rare diseases are one example. To address this issue, a fused screening procedure is proposed for dimension reduction with large-scale, high-dimensional imbalanced data under repeated case-control sampling. The proposed method has several advantages: it is model-free, requiring no model specification for the underlying distribution; it is computationally inexpensive because it relies on subsampling; and it is robust to outliers in the predictors. Theoretical properties are established under regularity conditions. Numerical studies, including extensive simulations and a real data example, confirm that the proposed method performs well in practical settings.
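
The general idea described in the abstract (draw repeated case-control subsamples, score each predictor with a rank-based statistic, and fuse the scores across subsamples) can be illustrated with a minimal Python sketch. This is not the authors' exact procedure: the abstract does not state the statistic, so the sketch assumes a Mann-Whitney-type rank utility, |AUC - 0.5|, and fuses by simple averaging. The function names `rank_utility` and `fused_screen`, the parameters `n_keep` and `n_rep`, and the convention that `y == 1` marks the minority class are all hypothetical choices made for illustration.

```python
import numpy as np

def rank_utility(X, y):
    """Per-column |AUC - 0.5| from column-wise ranks (assumed statistic; ties not handled)."""
    n = X.shape[0]
    ranks = X.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n within each column
    n1 = int(y.sum())                              # number of cases
    n0 = n - n1                                    # number of controls
    # Mann-Whitney U statistic of the cases, rescaled to [0, 1]
    u = ranks[y == 1].sum(axis=0) - n1 * (n1 + 1) / 2
    return np.abs(u / (n1 * n0) - 0.5)

def fused_screen(X, y, n_keep, n_rep=20, seed=0):
    """Average the rank utility over balanced case-control subsamples, then keep the top columns.

    Assumes y is a 0/1 array with 1 as the minority class and at least
    as many controls as cases.
    """
    rng = np.random.default_rng(seed)
    cases = np.flatnonzero(y == 1)
    controls = np.flatnonzero(y == 0)
    scores = np.zeros(X.shape[1])
    for _ in range(n_rep):
        # Keep every case and draw an equally sized random set of controls
        sub = rng.choice(controls, size=cases.size, replace=False)
        idx = np.concatenate([cases, sub])
        scores += rank_utility(X[idx], y[idx])
    scores /= n_rep                                # fuse by averaging over subsamples
    top = np.argsort(scores)[::-1][:n_keep]        # indices of the highest-scoring predictors
    return top, scores
```

Because the score depends on the predictors only through their ranks, a few extreme values in a column change it very little, which is one way to see why rank-based screening is robust to outliers. Each subsample has only twice as many rows as there are cases, so the cost per repetition stays small even when the full data set is very large.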

Publisher

Database: Elsevier - ScienceDirect
Journal: Computational Statistics & Data Analysis - Volume 141, January 2020, Pages 94-108

Authors

Jinhan Xie, Meiling Hao, Wenxin Liu, Yuanyuan Lin
