Context tree selection: A unifying view

Article ID	Journal	Published Year	Pages	File Type
1155816	Stochastic Processes and their Applications	2011	19 Pages	PDF

Abstract

Context tree models have been introduced by Rissanen in [25] as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics. The present paper investigates non-asymptotic properties of two popular procedures of context tree estimation: Rissanen’s algorithm Context and penalized maximum likelihood. First showing how they are related, we prove finite horizon bounds for the probability of over- and under-estimation. Concerning over-estimation, no boundedness or loss-of-memory conditions are required: the proof relies on new deviation inequalities for empirical probabilities of independent interest. The under-estimation properties rely on classical hypotheses for processes of infinite memory. These results improve on and generalize the bounds obtained in Duarte et al. (2006) [12], Galves et al. (2008) [18], Galves and Leonardi (2008) [17], Leonardi (2010) [22], refining asymptotic results of Bühlmann and Wyner (1999) [4] and Csiszár and Talata (2006) [9].

► We give a non-asymptotic analysis of two context tree estimators. ► We show how penalized maximum likelihood and algorithm Context are related. ► We prove improved bounds for the probability of over- and under-estimation. ► For over-estimation, no boundedness or loss-of-memory conditions are required. ► We show new martingale bounds for self-normalized deviations of empirical probabilities.

Keywords

Deviation inequalities Model selection Penalized maximum likelihood Bayesian information criterion