Article ID: 4956460
Journal: Journal of Systems and Software
Published Year: 2017
Pages: 16
File Type: PDF
Abstract

• A symbolic time series representation using repeating shapes as symbols is proposed.
• Lower-bounding properties of the similarity measure are evaluated on UCR datasets.
• An electricity consumption dataset is used for evaluation on very long, seasonal time series.
• Reconstruction error and dimensionality reduction ability are comparable to PAA.

Over the past years, many time series representations have been proposed, mainly for dimensionality reduction and as support for various algorithms in time series data processing. However, most transformation algorithms are iterative in nature and are therefore directly applicable only to static collections of data, not to data streams. In this work we propose a symbolic representation of time series, along with a method for transforming time series data into this representation. A basic requirement for a usable representation is a distance measure that accurately reflects the true shape of the data, so we also propose a distance measure that operates on the proposed representation and lower-bounds the Euclidean distance on the original data. We evaluate the properties of the proposed representation and the distance measure on the UCR collection of datasets. Because we focus on stream data processing, we also evaluate the properties and limitations of the proposed representation on very long time series from the domain of electricity consumption monitoring, simulating the processing of a potentially unbounded data stream.
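To illustrate what "lower bounding the Euclidean distance" means in practice, the following is a minimal sketch using PAA (Piecewise Aggregate Approximation), the baseline named in the highlights, rather than the symbolic representation proposed in the paper. The function names `paa`, `euclidean` and `paa_distance` and the sample series are illustrative assumptions, and the sketch assumes that the series length is divisible by the number of segments.

```python
import math


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length segment."""
    n = len(series)
    if n % segments != 0:
        # Simplifying assumption of this sketch, not a limitation of PAA itself.
        raise ValueError("len(series) must be divisible by segments")
    width = n // segments
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(segments)]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def paa_distance(paa_a, paa_b, original_length):
    """Distance on PAA vectors, scaled by sqrt(n / w) so that it never
    exceeds the Euclidean distance between the original series."""
    scale = math.sqrt(original_length / len(paa_a))
    return scale * euclidean(paa_a, paa_b)


# Hypothetical sample data, chosen only for illustration.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]
y = [1.0, 1.0, 0.0, 2.0, 5.0, 5.0, 1.0, 0.0]

lower = paa_distance(paa(x, 4), paa(y, 4), len(x))
true = euclidean(x, y)
assert lower <= true  # the reduced-space distance is a lower bound
```

A distance with this property lets a search discard candidates using only the reduced representation without risking false dismissals, which is why the property matters for any representation meant to support similarity search.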
