Article ID | Journal | Published Year | Pages
---|---|---|---
409061 | Neurocomputing | 2008 | 8
Recurrent neural networks (RNNs) unfolded in time are in theory able to map any open dynamical system. Nevertheless, they are frequently claimed to be unable to identify long-term dependencies in the data. In particular, it is often stated that RNNs unfolded in time and trained with backpropagation fail to learn inter-temporal influences more than ten time steps apart. This paper refutes this frequently cited claim by giving counter-examples. We show that basic time-delay RNNs unfolded in time and formulated as state space models are indeed capable of learning time lags of at least 100 time steps. We point out that they even possess a self-regularisation characteristic, which adapts the internal error backflow, and we analyse their optimal weight initialisation. In addition, we introduce the idea of inflation for modelling long- and short-term memory and demonstrate that this technique further improves the performance of RNNs.
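
To make the state space formulation concrete, below is a minimal sketch of a time-delay RNN unfolded in time and trained with backpropagation through time, using a common state space form s_{t+1} = tanh(A s_t + B x_t), y_t = C s_t. The hidden dimension, learning rate, activation function, and uniform initialisation range are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch (not the paper's exact configuration): a time-delay RNN in
# state space form, unfolded over T time steps and trained with
# backpropagation through time on a squared-error loss.
#   s_{t+1} = tanh(A s_t + B x_t),   y_t = C s_t
import numpy as np

rng = np.random.default_rng(0)
n_in, n_state, n_out, T = 1, 20, 1, 100   # 100-step unfolding (sizes are assumptions)

# Shared weights; the small uniform initialisation range is an assumption
A = rng.uniform(-0.2, 0.2, (n_state, n_state))
B = rng.uniform(-0.2, 0.2, (n_state, n_in))
C = rng.uniform(-0.2, 0.2, (n_out, n_state))

def forward(xs):
    """Unfold the network over the input sequence xs of shape (T, n_in)."""
    s = np.zeros(n_state)
    states, outputs = [s], []
    for x in xs:
        s = np.tanh(A @ s + B @ x)
        states.append(s)
        outputs.append(C @ s)
    return states, outputs

def bptt_step(xs, ys, lr=0.01):
    """One gradient step of backpropagation through time; ys has shape (T, n_out)."""
    states, outputs = forward(xs)
    dA, dB, dC = np.zeros_like(A), np.zeros_like(B), np.zeros_like(C)
    ds = np.zeros(n_state)                      # error flowing back through the state
    for t in reversed(range(len(xs))):
        dy = outputs[t] - ys[t]                 # output error at step t
        dC += np.outer(dy, states[t + 1])
        ds += C.T @ dy                          # inject output error into the state
        dpre = ds * (1.0 - states[t + 1] ** 2)  # backpropagate through tanh
        dA += np.outer(dpre, states[t])
        dB += np.outer(dpre, xs[t])
        ds = A.T @ dpre                         # pass error to the previous time step
    for W, dW in ((A, dA), (B, dB), (C, dC)):
        W -= lr * dW                            # in-place update of the shared weights
```

A toy experiment in this spirit would feed sequences whose target at step t depends on an input roughly 100 steps earlier, matching the time lags the paper reports; the weight initialisation range used here is exactly the kind of quantity whose optimal choice the paper analyses.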