Sequential Interval Network for parsing complex structured activity

Article ID	Journal	Published Year	Pages	File Type
527294	Computer Vision and Image Understanding	2016	12 Pages	PDF

Abstract

• SIN parses structured activity sequences (or time series data in general).
• SIN is not a time-sliced graphical model.
• SIN is equivalent to a left-right segmental model (HSMM) and allows exact inference.

We propose a new graphical model, called the Sequential Interval Network (SIN), for parsing complex, structured activities whose composition can be represented by a stochastic grammar. By exploiting the grammar, the generated network captures an activity’s global temporal structure while avoiding a time-sliced model. In this network, the hidden variables are the start and end times of the component actions, which allows reasoning about duration and observations at the interval (segment) level. Exact inference is possible and yields the posterior probabilities of the timing variables as well as each frame’s component label. Importantly, by using an uninformative expected value for future observations, the network can predict the probability distribution of the timing of future component actions. We demonstrate this framework on vision tasks such as recognition and temporal segmentation of action sequences, and online parsing and prediction of future actions in streaming mode while observing an assembly task.
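To make the idea of inference over start and end times concrete, here is a minimal sketch of exact forward-backward inference in a generic left-right segmental model. It is not the authors' implementation. The function names (`segment_likelihood`, `parse_left_right`), the fixed action order, the frame-independent segment likelihood, and the toy numbers are all assumptions made for illustration.

```python
def segment_likelihood(obs, k, s, e):
    """Likelihood of frames s..e-1 under action k (assumes independent frames)."""
    p = 1.0
    for t in range(s, e):
        p *= obs[k][t]
    return p


def parse_left_right(obs, dur):
    """Exact forward-backward over segment boundaries (hypothetical helper).

    obs[k][t]: likelihood of frame t under action k (K actions, T frames).
    dur[k][d]: prior probability that action k lasts d frames (d = 0..T).
    Returns (end_post, label_post):
      end_post[k][e]   = P(action k ends just before frame e | all frames)
      label_post[t][k] = P(frame t belongs to action k | all frames)
    """
    K, T = len(obs), len(obs[0])
    # alpha[k][e]: joint probability of frames 0..e-1 and the first k actions ending at e
    alpha = [[0.0] * (T + 1) for _ in range(K + 1)]
    alpha[0][0] = 1.0
    for k in range(1, K + 1):
        for e in range(1, T + 1):
            alpha[k][e] = sum(
                alpha[k - 1][s] * dur[k - 1][e - s]
                * segment_likelihood(obs, k - 1, s, e)
                for s in range(e))
    # beta[k][s]: probability of frames s..T-1 given the first k actions ended at s
    beta = [[0.0] * (T + 1) for _ in range(K + 1)]
    beta[K][T] = 1.0
    for k in range(K - 1, -1, -1):
        for s in range(T):
            beta[k][s] = sum(
                dur[k][e - s] * segment_likelihood(obs, k, s, e) * beta[k + 1][e]
                for e in range(s + 1, T + 1))
    Z = alpha[K][T]  # total likelihood of the whole sequence
    end_post = [[alpha[k][e] * beta[k][e] / Z for e in range(T + 1)]
                for k in range(1, K + 1)]
    # Accumulate the posterior weight of every (action, start, end) interval per frame
    label_post = [[0.0] * K for _ in range(T)]
    for k in range(K):
        for s in range(T):
            for e in range(s + 1, T + 1):
                w = (alpha[k][s] * dur[k][e - s]
                     * segment_likelihood(obs, k, s, e) * beta[k + 1][e] / Z)
                for t in range(s, e):
                    label_post[t][k] += w
    return end_post, label_post


# Toy example (made-up numbers): two actions over four frames
obs = [[0.9, 0.9, 0.1, 0.1],   # action 0 fits the first two frames
       [0.1, 0.1, 0.9, 0.9]]   # action 1 fits the last two frames
dur = [[0.0, 0.25, 0.25, 0.25, 0.25]] * 2  # uniform duration prior over 1..4 frames
end_post, label_post = parse_left_right(obs, dur)
```

In this toy example the most probable end of the first action is the boundary before frame 2. The sketch recomputes segment likelihoods naively, so it favours readability over speed. Within this sketch, replacing the likelihoods of unobserved frames with the same constant for every action leaves the duration prior and the observed frames to determine the timing posterior, which is one simple way to mimic prediction with uninformative future observations.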

Keywords

Stochastic context-free grammar; Action recognition; Activity prediction