Article code | Journal code | Publication year | English article | Full-text version
7546598 | 1489633 | 2018 | 22-page PDF | Free download
English title of the ISI article
Asymptotic performance of PCA for high-dimensional heteroscedastic data
Related topics
Engineering and Basic Sciences, Mathematics, Numerical Analysis
English abstract
Principal Component Analysis (PCA) is a classical method for reducing the dimensionality of data by projecting them onto a subspace that captures most of their variation. Effective use of PCA in modern applications requires understanding its performance for data that are both high-dimensional and heteroscedastic. This paper analyzes the statistical performance of PCA in this setting, i.e., for high-dimensional data drawn from a low-dimensional subspace and degraded by heteroscedastic noise. We provide simplified expressions for the asymptotic PCA recovery of the underlying subspace, subspace amplitudes and subspace coefficients; the expressions enable both easy and efficient calculation and reasoning about the performance of PCA. We exploit the structure of these expressions to show that, for a fixed average noise variance, the asymptotic recovery of PCA for heteroscedastic data is always worse than that for homoscedastic data (i.e., for noise variances that are equal across samples). Hence, while average noise variance is often a practically convenient measure for the overall quality of data, it gives an overly optimistic estimate of the performance of PCA for heteroscedastic data.
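The comparison described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the paper's analysis: it assumes a rank-one subspace, Gaussian coefficients and noise, and noise variances that differ across samples; the function name `pca_recovery` and all parameter values (`d`, `n`, `theta`, the variances 0.1 and 1.9) are hypothetical choices made for illustration. Recovery is measured as the squared inner product between the true and estimated subspace directions.

```python
import numpy as np

def pca_recovery(noise_vars, d=200, theta=1.0, seed=0):
    """Squared inner product between the true direction and the first
    principal component, for one random draw (illustrative model only)."""
    rng = np.random.default_rng(seed)
    noise_vars = np.asarray(noise_vars, dtype=float)
    n = noise_vars.size
    # True one-dimensional subspace: a random unit vector in R^d.
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    # Subspace coefficients, one per sample.
    z = rng.standard_normal(n)
    # Noise whose variance is set per sample (per column).
    noise = rng.standard_normal((d, n)) * np.sqrt(noise_vars)
    Y = theta * np.outer(u, z) + noise
    # First principal component = leading left singular vector of Y.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return float(np.dot(U[:, 0], u) ** 2)

n = 1000
# Both settings have average noise variance 1.0.
homo = np.full(n, 1.0)
hetero = np.concatenate([np.full(n // 2, 0.1), np.full(n // 2, 1.9)])

r_homo = pca_recovery(homo)
r_hetero = pca_recovery(hetero)
print(f"homoscedastic recovery:   {r_homo:.3f}")
print(f"heteroscedastic recovery: {r_hetero:.3f}")
```

Under these assumptions the heteroscedastic setting should show lower recovery than the homoscedastic one, even though the average noise variance is the same in both, which is the qualitative behavior the abstract describes. A single finite-sample draw only approximates the asymptotic regime the paper studies.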
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Multivariate Analysis - Volume 167, September 2018, Pages 435-452