Article ID | Journal | Published Year | Pages | File Type
---|---|---|---|---
10146098 | Pattern Recognition Letters | 2018 | 11 Pages |
Abstract
We independently reproduce the QRNN experiments of Bradbury et al. [1] and compare our DReLU-based QRNNs with the original tanh-based QRNNs and Long Short-Term Memory networks (LSTMs) on sentiment classification and word-level language modeling. Additionally, we evaluate on character-level language modeling, showing that we are able to stack up to eight QRNN layers with DReLUs, thus making it possible to improve the current state-of-the-art in character-level language modeling over shallow architectures based on LSTMs.
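To illustrate the activation the abstract refers to, here is a minimal sketch of a dual rectified linear unit (DReLU). It assumes the commonly cited form DReLU(a, b) = max(0, a) − max(0, b), which takes two pre-activations and, like tanh, can produce negative outputs without saturating; the function name and form here are illustrative assumptions, not code from the paper.

```python
import numpy as np

def drelu(a, b):
    """Dual Rectified Linear Unit (assumed form):
    DReLU(a, b) = max(0, a) - max(0, b).
    Combines two pre-activations into one output that can be
    negative (like tanh) but is unbounded, so it does not saturate."""
    return np.maximum(0.0, a) - np.maximum(0.0, b)

# Element-wise over two candidate pre-activation vectors
a = np.array([1.5, -0.3, 0.0])
b = np.array([0.2, 0.4, -1.0])
out = drelu(a, b)
```

Because the output is not squashed into (−1, 1), gradients do not vanish through the activation, which is one plausible reason deeper QRNN stacks (up to eight layers, per the abstract) remain trainable.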
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Vision and Pattern Recognition
Authors
Fréderic Godin, Jonas Degrave, Joni Dambre, Wesley De Neve