Article code | Journal code | Publication year | English article | Full-text version
416391 | 681358 | 2016 | 14-page PDF | Free download
English title of the ISI article
Graph-theoretic multisample tests of equality in distribution for high dimensional data
Keywords
Multisample problem; perfect matching; minimum spanning tree; nearest neighbor; energy; orthogonal graphs
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract


• We test whether two or more samples have arisen from the same multivariate density.
• Test uses new graphs as well as the minimum spanning tree or nearest neighbors.
• Test performs well and is easy to carry out for large d, small n datasets.
• Power of the new tests is competitive with or exceeds that of other general multivariate tests.
• Mean and variance of the asymptotically normal null distribution are easy to compute.

Testing whether two or more independent samples arise from a common distribution is a classic problem in statistics. Several multivariate two-sample tests of equality are based on graphs such as the minimum spanning tree, nearest neighbor, and optimal nonbipartite perfect matching. Here, the samples are pooled and the test statistic is the number of edges in the graph that connect points with different sample identities. These tests are typically unbiased and perform well when estimates of underlying probability densities are poor. However, these tests have not been thoroughly studied when data is very high dimensional or in the multisample case. We introduce the use of orthogonal perfect matchings for testing equality in distribution. A suite of Monte Carlo simulations on artificial and real data shows that orthogonal perfect matchings and spanning trees typically have higher power than other graphs and are also more effective at discerning when samples have differences in their covariance structure compared to other nonparametric tests such as the energy and triangle tests.
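
The edge-count statistic described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses the minimum spanning tree (one of the graphs the abstract names) rather than orthogonal perfect matchings, and it replaces the asymptotically normal null distribution with a simple permutation null. The function names (`mst_edges`, `cross_edge_count`, `null_mean`, `permutation_p_value`) are hypothetical, and the null mean comes from a standard label-permutation argument rather than from the paper.

```python
import math
import random


def mst_edges(points):
    """Edges (i, j) of a Euclidean minimum spanning tree, via Prim's algorithm."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known link from each point to the tree
    best_from = [-1] * n        # tree vertex that realises that link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Add the point that is currently closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def cross_edge_count(edges, labels):
    """Number of edges whose endpoints carry different sample labels."""
    return sum(labels[i] != labels[j] for i, j in edges)


def null_mean(num_edges, labels):
    """Expected cross-edge count when labels are permuted over a fixed graph.

    Assumption: any edge joins two same-label points with probability
    sum_k n_k (n_k - 1) / (N (N - 1)), where n_k are the sample sizes.
    """
    n = len(labels)
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    same = sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    return num_edges * (1.0 - same)


def permutation_p_value(points, labels, n_perm=999, seed=0):
    """Lower-tail permutation p-value; few cross edges suggest unequal distributions."""
    rng = random.Random(seed)
    edges = mst_edges(points)  # the graph is built once on the pooled sample
    observed = cross_edge_count(edges, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cross_edge_count(edges, shuffled) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because the labels are plain values compared for inequality, the same functions work for two samples or for several; each point can be a tuple of any dimension. A small cross-edge count relative to `null_mean` is evidence against equality in distribution.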

Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Statistics & Data Analysis - Volume 96, April 2016, Pages 145–158