The matching pursuit algorithm revisited: A variant for big data and new stopping rules

Article ID	Journal	Published Year	Pages	File Type
11001562	Signal Processing	2019	12 Pages	PDF

Abstract

The matching pursuit algorithm (MPA) is used in many applications for selecting the best predictors for a vector of measurements of size n from a dictionary that contains p_n atoms, where usually n ≤ p_n. A major unsolved problem is to determine the optimal stopping rule. In this work, we investigate various stopping rules which are modifications of the information theoretic (IT) criteria derived for Gaussian linear regression. Because all of them involve the degrees of freedom (df) given by the trace of the hat matrix, we provide some theoretical results concerning this matrix. We also propose novel stopping rules. An important contribution of this paper is a method for computing the df efficiently when big data (n ≫ p_n) are processed. The significance of the auxiliary variables appearing in MPA for big data is clarified via a theoretical analysis. The superiority of the new stopping rules in comparison with the traditional approaches is demonstrated in simulations involving big data (n ≫ p_n) or overcomplete dictionaries (n < p_n) and in experiments with air pollution data.
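
As a rough illustration of the quantities named in the abstract, the sketch below runs plain matching pursuit with unit-norm atoms and tracks the df as the trace of the hat matrix, which here is the identity minus the running product of the factors (I - d d^T) for the selected atoms d. It is not the paper's method: the function name `matching_pursuit_df` is hypothetical, the AIC-style score n log(RSS/n) + 2 df is an assumed stand-in for the stopping rules studied in the paper, and the n x n matrix update is the naive computation rather than the efficient big-data variant.

```python
import numpy as np

def matching_pursuit_df(X, y, n_iter):
    """Plain matching pursuit; returns (atom index, df, RSS, AIC-style score) per step.

    Hypothetical helper for illustration only. Assumes X has no all-zero column.
    """
    n, p = X.shape
    # Normalize atoms to unit Euclidean norm.
    D = X / np.linalg.norm(X, axis=0)
    residual = np.asarray(y, dtype=float).copy()
    # M is the running product of (I - d d^T); the hat matrix is I - M.
    M = np.eye(n)
    history = []
    for _ in range(n_iter):
        corr = D.T @ residual              # inner products with every atom
        j = int(np.argmax(np.abs(corr)))   # atom most correlated with the residual
        d = D[:, j]
        residual = residual - corr[j] * d  # remove the component along d
        # Left-multiply M by (I - d d^T) via a rank-one update.
        M = M - np.outer(d, d @ M)
        df = n - np.trace(M)               # trace of the hat matrix I - M
        rss = float(residual @ residual)
        # Assumed Gaussian AIC form, not one of the paper's criteria.
        score = n * np.log(rss / n) + 2.0 * df
        history.append((j, df, rss, score))
    return history
```

One would typically run this for a fixed number of iterations and stop at the step with the smallest score. Each step costs O(n^2) memory and time for the matrix update, so this form is only practical for small n.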

Keywords

Hat matrix; Information theoretic criteria; Big data