Training restricted Boltzmann machines: An introduction

Article ID	Journal	Published Year	Pages	File Type
532102	Pattern Recognition	2014	15 Pages	PDF

Abstract

• We review the state-of-the-art in training restricted Boltzmann machines (RBMs) from the perspective of graphical models.
• Variants and extensions of RBMs are used in a wide range of pattern recognition tasks.
• The required background on graphical models and Markov chain Monte Carlo methods is provided.
• Theoretical and experimental results are presented.

Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be interpreted as stochastic neural networks. They have attracted much attention as building blocks for the multi-layer learning systems called deep belief networks, and variants and extensions of RBMs have found application in a wide range of pattern recognition tasks. This tutorial introduces RBMs from the viewpoint of Markov random fields, starting with the required concepts of undirected graphical models. Different learning algorithms for RBMs, including contrastive divergence learning and parallel tempering, are discussed. Because sampling from RBMs, and therefore most of their learning algorithms, is based on Markov chain Monte Carlo (MCMC) methods, an introduction to Markov chains and MCMC techniques is provided. Experiments demonstrate relevant aspects of RBM training.
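To make the connection between Gibbs sampling and contrastive divergence concrete, the following is a minimal sketch of one CD-1 update for a binary RBM, not the article's own implementation. It assumes the common energy convention E(v, h) = -v^T W h - b^T v - c^T h; the function name `cd1_update`, the learning rate, the toy data, and the layer sizes are all illustrative choices that do not come from the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, rng, lr=0.1):
    """One CD-1 step for a binary RBM (hypothetical helper).

    v0: batch of binary visible vectors, shape (n, n_visible)
    W:  weights, shape (n_visible, n_hidden)
    b:  visible biases, c: hidden biases
    """
    # Positive phase: hidden activation probabilities given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # One block Gibbs step: sample visibles given h0, then recompute hiddens.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)

    # CD-1 approximation of the log-likelihood gradient, averaged over the batch.
    n = v0.shape[0]
    W_new = W + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b_new = b + lr * (v0 - v1).mean(axis=0)
    c_new = c + lr * (ph0 - ph1).mean(axis=0)
    return W_new, b_new, c_new

# Toy usage: two repeated 6-bit patterns, 3 hidden units (assumed sizes).
rng = np.random.default_rng(0)
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]] * 10, dtype=float)
W = 0.01 * rng.standard_normal((6, 3))
b = np.zeros(6)
c = np.zeros(3)
for _ in range(200):
    W, b, c = cd1_update(data, W, b, c, rng)
```

The negative phase here uses a single Gibbs step started at the data, which is what distinguishes CD-1 from running the Markov chain to equilibrium; parallel tempering would instead maintain several chains at different temperatures and is not shown.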

Keywords

Restricted Boltzmann Machines; Parallel tempering; Markov random fields; Markov chains; Neural networks; Gibbs sampling