Article ID: 409151
Journal: Neurocomputing
Published Year: 2008
Pages: 23
File Type: PDF
Abstract

Generalization, in its most basic form, is an artificial neural network's (ANN's) ability to automatically classify data that were not seen during training. This paper presents a framework in which generalization in ANNs is quantified and different types of generalization are viewed as orders. The ordering of generalization is a means of categorizing different behaviours. These orders enable generalization to be evaluated in a detailed and systematic way. The approach is based on existing definitions, which are augmented in this paper. The generalization framework is a hierarchy of categories that aligns directly with an ANN's ability to perform table look-up, interpolation, extrapolation, and hyper-extrapolation tasks.

The framework is empirically validated using three types of regression task: (1) a one-to-one (o–o) task, f(x): x_i → y_j; (2) a many-to-one (m–o) task, f(x): {x_i, x_{i+1}, …} → y_j; and (3) a one-to-many (o–m) task, f(x): x_i → {y_j, y_{j+1}, …}. The first and second are assigned to feedforward nets, while the third, due to its complexity, is assigned to a recurrent neural net.

Throughout the empirical work, higher-order generalization is validated with reference to the ability of a net to perform symmetrically related or isomorphic functions generated using symmetric transformations (STs) of a net's weights. The transformed weights of a base net (BN) are inherited by a derived net (DN). The inheritance is viewed as the reuse of information. The overall framework is also considered in the light of its alignment to neural models; for example, which order (or level) of generalization can be performed by which specific type of neuron model.

The complete framework may not be applicable to all neural models; in fact, some orders may be special cases which apply only to specific neuron models. This is, indeed, shown to be the case. Lower-order generalization is viewed as a general case and is applicable to all neuron models, whereas higher-order generalization is a particular or special case. This paper focuses on initial results; some of the aims have been demonstrated and amplified through the experimental work.
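
The idea of a derived net inheriting transformed weights from a base net can be illustrated with a minimal sketch. This is not the paper's procedure: the abstract does not specify which symmetric transformations are used, so the example assumes the simplest one, a reflection of the input axis, applied to a small one-input feedforward net with tanh hidden units. All names here (make_base_net, forward, derive_reflected_net) are hypothetical, and the base net's weights are random rather than trained.

```python
import math
import random


def make_base_net(n_hidden, seed=0):
    """Build a hypothetical base net (BN): 1 input, n_hidden tanh units, 1 output.

    Weights are random stand-ins for trained values (an assumption).
    """
    rng = random.Random(seed)
    return {
        "w_in": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_hid": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": rng.uniform(-1, 1),
    }


def forward(net, x):
    """Evaluate the net at a scalar input x."""
    hidden = [math.tanh(w * x + b) for w, b in zip(net["w_in"], net["b_hid"])]
    return sum(v * h for v, h in zip(net["w_out"], hidden)) + net["b_out"]


def derive_reflected_net(base):
    """Derived net (DN): inherit the BN's weights with input weights negated.

    Assumed transformation: reflection about the output axis, so that
    DN(x) equals BN(-x). No retraining is involved.
    """
    return {
        "w_in": [-w for w in base["w_in"]],
        "b_hid": list(base["b_hid"]),
        "w_out": list(base["w_out"]),
        "b_out": base["b_out"],
    }


base_net = make_base_net(4)
derived_net = derive_reflected_net(base_net)

# The derived net computes the mirror image of the base net's function.
for x in (-2.0, 0.5, 3.0):
    assert math.isclose(forward(derived_net, x), forward(base_net, -x))
```

The derived net reuses every weight of the base net and changes only the sign of the input weights, which is one way to read "inheritance as the reuse of information" under the stated assumptions.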

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence