Article ID: 404661
Journal: Neural Networks
Published Year: 2008
Pages: 11
File Type: PDF
Abstract

Multi-stage feed-forward neural network (NN) learning with sigmoid-shaped hidden-node functions is implicitly a constrained optimization problem featuring negative curvature. Our analyses of the Hessian matrix H of the sum-squared-error measure highlight the following intriguing findings: at an early stage of learning, H tends to be indefinite and much better conditioned than the Gauss–Newton Hessian J^T J; the NN structure influences the indefiniteness and rank of H; and exploiting negative curvature leads to effective learning. All of these findings can be confirmed numerically thanks to our stagewise second-order backpropagation; this systematic procedure exploits the NN’s “layered symmetry” to compute H efficiently, making exact Hessian evaluation feasible for fairly large practical problems.
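As an illustration of the kind of comparison the abstract describes, the sketch below uses JAX automatic differentiation (not the authors' stagewise second-order backpropagation) to form the exact Hessian H of the sum-squared-error for a tiny sigmoidal network and to compare its eigenvalue spectrum with that of the Gauss–Newton Hessian J^T J. The network size, toy data, and initialization are illustrative assumptions.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def forward(params, x):
    # One sigmoid-shaped hidden layer, linear output node.
    W1, b1, W2, b2 = params
    h = jax.nn.sigmoid(x @ W1 + b1)
    return h @ W2 + b2

# Toy problem (illustrative assumptions): 20 samples, 2 inputs, 5 hidden nodes.
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
x = jax.random.normal(k1, (20, 2))
y = jnp.sin(x[:, :1])
params = (0.1 * jax.random.normal(k2, (2, 5)), jnp.zeros(5),
          0.1 * jax.random.normal(k3, (5, 1)), jnp.zeros(1))
flat, unravel = ravel_pytree(params)  # flatten all weights into one vector

def sse(w):
    # Sum-squared-error measure E = 0.5 * sum(r^2).
    r = forward(unravel(w), x) - y
    return 0.5 * jnp.sum(r ** 2)

def residuals(w):
    return (forward(unravel(w), x) - y).ravel()

H = jax.hessian(sse)(flat)            # exact Hessian of E (via autodiff here)
J = jax.jacobian(residuals)(flat)     # residual Jacobian
GN = J.T @ J                          # Gauss-Newton Hessian J^T J

eig_H = jnp.linalg.eigvalsh(H)        # eigenvalues in ascending order
eig_GN = jnp.linalg.eigvalsh(GN)
print("H     min/max eigenvalue:", float(eig_H[0]), float(eig_H[-1]))
print("J^T J min/max eigenvalue:", float(eig_GN[0]), float(eig_GN[-1]))
# A negative smallest eigenvalue of H indicates indefiniteness at this early
# (random-initialization) stage; J^T J is positive semidefinite by construction,
# so its smallest eigenvalue sits near zero and its conditioning is much worse.
```

At random initialization the residuals are essentially arbitrary, which is what tends to make the exact H indefinite while J^T J, which discards the residual-weighted curvature term, stays positive semidefinite and near-singular.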

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence