Article ID Journal Published Year Pages File Type
10359461 Image and Vision Computing 2014 48 Pages PDF
Abstract
This paper presents an unsupervised deep learning framework that derives spatio-temporal features for human-robot interaction. The respective models extract high-level features from low-level ones through a hierarchical network, viz. the Hierarchical Temporal Memory (HTM), providing at the same time a solution to the curse of dimensionality in shallow techniques. The presented work incorporates the tensor-based framework within the operation of the nodes and, thus, enhances the feature derivation procedure. This is due to the fact that tensors allow the preservation of the initial data format and their respective correlation and, moreover, attain more compact representations. The computational nodes form spatial and temporal groups by exploiting the multilinear algebra and subsequently express the samples according to those groups in terms of proximity. This generic framework may be applied in a diverse of visual data, while it has been examined on sequences of color and depth images, exhibiting remarkable performance.
Related Topics
Physical Sciences and Engineering Computer Science Computer Vision and Pattern Recognition
Authors
, ,