Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
408217 | 679008 | 2016 | 7-page PDF | Free download |
Deep neural networks have recently shown impressive performance on several machine learning tasks. An important approach to training deep networks, useful especially when labeled data is scarce, relies on unsupervised pretraining of hidden layers followed by supervised fine-tuning. One of the most widely used approaches to unsupervised pretraining is to train each layer with the Contrastive Divergence (CD) algorithm. In this work we present a modification to CD with the goal of learning more diverse sets of features in hidden layers. In particular, we extend the CD learning rule to penalize the cosines of the angles between weight vectors, which in turn encourages orthogonality between the learned features. We demonstrate experimentally that this extension to CD improves the performance of pretrained deep networks on image recognition and document retrieval tasks.
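The idea of penalizing cosines between weight vectors during CD training can be sketched as follows. This is a minimal illustration, not the paper's exact method: it assumes a binary RBM trained with CD-1 (biases omitted for brevity), and a hypothetical penalty of 0.5 · Σ_{i≠j} cos²(w_i, w_j) over pairs of hidden-unit weight columns, whose gradient is subtracted from the standard CD update. The function name, learning rate, and penalty weight `lam` are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step_with_cosine_penalty(W, v0, lr=0.01, lam=0.1):
    """One CD-1 update for a binary RBM with weights W of shape
    (n_visible, n_hidden), plus a hypothetical diversity penalty
    on cosines between the hidden units' weight columns.

    v0: batch of binary visible vectors, shape (batch, n_visible).
    """
    # Positive phase: hidden probabilities and a sampled hidden state.
    h0_prob = sigmoid(v0 @ W)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)

    # Negative phase: one Gibbs step (mean-field reconstruction).
    v1_prob = sigmoid(h0 @ W.T)
    h1_prob = sigmoid(v1_prob @ W)

    # Standard CD-1 gradient estimate: <v h>_data - <v h>_recon.
    grad_cd = (v0.T @ h0_prob - v1_prob.T @ h1_prob) / v0.shape[0]

    # Cosine penalty P = 0.5 * sum_{i != j} cos^2(w_i, w_j).
    norms = np.linalg.norm(W, axis=0, keepdims=True)  # (1, n_hidden)
    Wn = W / norms                                    # unit-norm columns
    C = Wn.T @ Wn                                     # pairwise cosines
    np.fill_diagonal(C, 0.0)                          # ignore self-pairs
    # dP/dw_i = (sum_j c_ij u_j - (sum_j c_ij^2) u_i) / ||w_i||
    grad_pen = (Wn @ C - Wn * (C**2).sum(axis=0, keepdims=True)) / norms

    # Gradient ascent on the CD objective, descent on the penalty.
    return W + lr * grad_cd - lr * lam * grad_pen
```

With `lam = 0` this reduces to plain CD-1; increasing `lam` trades reconstruction-driven learning against pushing the weight columns toward mutual orthogonality.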
Journal: Neurocomputing - Volume 202, 19 August 2016, Pages 84–90