Article ID: 434750
Journal: Theoretical Computer Science
Published Year: 2013
Pages: 17
File Type: PDF
Abstract

In response to a 1997 problem of M. Vidyasagar, we state a criterion for PAC learnability of a concept class C under the family of all non-atomic (diffuse) measures on the domain Ω. The uniform Glivenko–Cantelli property with respect to non-atomic measures is no longer a necessary condition, and consistent learnability cannot in general be expected. Our criterion is stated in terms of a combinatorial parameter which we call the VC dimension of C modulo countable sets. The new parameter is obtained by “thickening up” single points in the definition of VC dimension to uncountable “clusters”. Equivalently, the new parameter is at most d if and only if every countable subclass of C has VC dimension ≤ d outside a countable subset of Ω. The new parameter can also be expressed as the classical VC dimension of C calculated on a suitable subset of a compactification of Ω. We do not make any measurability assumptions on C, assuming instead the validity of Martin’s Axiom (MA). Similar results are obtained for function learning in terms of the fat-shattering dimension modulo countable sets but, just as in the classical distribution-free case, the finiteness of this parameter is sufficient but not necessary for PAC learnability under non-atomic measures.
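
The countable-subclass characterization in the abstract can be written out symbolically. The following is only a sketch: the notation VC(C mod ω₁) for the VC dimension modulo countable sets, and the restriction symbol for a class cut down to a subset of the domain, are assumptions made here for illustration and are not fixed by the abstract itself.

```latex
% Classical shattering and VC dimension (standard definitions).
% A finite set \{x_1,\dots,x_n\} \subseteq \Omega is shattered by \mathscr{C} if
\[
  \forall I \subseteq \{1,\dots,n\}\ \ \exists C \in \mathscr{C}\ \ 
  \forall i \le n:\quad x_i \in C \iff i \in I ,
\]
% and
\[
  \mathrm{VC}(\mathscr{C})
  = \sup\{\, n : \text{some $n$-element subset of } \Omega
      \text{ is shattered by } \mathscr{C} \,\}.
\]

% Restriction of a class to a subset B \subseteq \Omega (notation assumed here):
\[
  \mathscr{C}\!\upharpoonright\! B = \{\, C \cap B : C \in \mathscr{C} \,\}.
\]

% Characterization stated in the abstract, with the assumed notation
% \mathrm{VC}(\mathscr{C} \bmod \omega_1) for the new parameter:
\[
  \mathrm{VC}(\mathscr{C} \bmod \omega_1) \le d
  \iff
  \forall\, \mathscr{C}' \subseteq \mathscr{C} \text{ countable}\ \ 
  \exists\, A \subseteq \Omega \text{ countable}:\quad
  \mathrm{VC}\bigl(\mathscr{C}' \!\upharpoonright\! (\Omega \setminus A)\bigr) \le d .
\]
```

Read this way, the exceptional set A may depend on the countable subclass chosen; the formula makes no claim that a single countable set works for all of C at once.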

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics