Article ID: 536160
Journal: Pattern Recognition Letters
Published Year: 2016
Pages: 8
File Type: PDF
Abstract

• An original scenario for a face-to-face collaborative task.
• A data-driven approach to modeling multimodal human behavior.
• An original learned structure for a dynamic Bayesian network.
• Modeling of the complex relationships between the different modalities of human behavior.
• A new evaluation of the models’ ability to reproduce how a human coordinates his behavior.

The goal of this paper is to model the coverbal behavior of a subject involved in face-to-face social interactions. To this end, we present a multimodal behavioral model based on a dynamic Bayesian network (DBN). The model was inferred from multimodal data of interacting dyads in a specific scenario designed to foster mutual attention and multimodal deixis of objects and places in a collaborative task. The challenge for this behavioral model is to generate coverbal actions (gaze, hand gestures) for the subject given his verbal productions, the current phase of the interaction, and the perceived actions of the partner. In our work, the structure of the DBN was learned from data, which revealed an interesting causality graph describing precisely how verbal and coverbal human behaviors are coordinated during the studied interactions. Using this structure, the DBN outperforms classical baseline models such as hidden Markov models (HMMs) and hidden semi-Markov models (HSMMs) on both measures of performance, i.e. interaction unit recognition and behavior generation. The DBN also reproduces the coordination patterns between modalities observed in the ground truth more faithfully than the baseline models do.
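To give a concrete sense of what one time slice of such a DBN computes, the sketch below shows a single conditional node: the subject's gaze target conditioned on the interaction phase and the partner's observed action. This is a minimal illustrative example only; the variable names, states, and probabilities are hypothetical and do not come from the paper's learned structure or data.

```python
import random

# Hypothetical conditional probability table (CPT) for the subject's gaze
# target, conditioned on the interaction phase and the partner's action.
# All states and probabilities are illustrative, not the learned model.
GAZE_CPT = {
    ("deixis", "points"): {"object": 0.7, "partner_face": 0.2, "elsewhere": 0.1},
    ("deixis", "idle"):   {"object": 0.4, "partner_face": 0.4, "elsewhere": 0.2},
    ("dialog", "points"): {"object": 0.5, "partner_face": 0.4, "elsewhere": 0.1},
    ("dialog", "idle"):   {"object": 0.1, "partner_face": 0.7, "elsewhere": 0.2},
}

def sample_gaze(phase, partner_action, rng=random.random):
    """Draw the subject's gaze target for one DBN time slice."""
    dist = GAZE_CPT[(phase, partner_action)]
    r, acc = rng(), 0.0
    for state, p in dist.items():
        acc += p
        if r < acc:
            return state
    return state  # numerical safety net for rounding at the tail

def most_likely_gaze(phase, partner_action):
    """MAP estimate of the gaze target given the parent variables."""
    dist = GAZE_CPT[(phase, partner_action)]
    return max(dist, key=dist.get)
```

In the full model, nodes like this would also condition on the previous time slice (e.g. the subject's prior gaze), which is what makes the network dynamic rather than a set of independent classifiers.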

Related Topics
Physical Sciences and Engineering › Computer Science › Computer Vision and Pattern Recognition