Article ID | Journal | Published Year | Pages | File Type
396899 | Information Systems | 2013 | 24 Pages | PDF
Abstract

Wireless sensor networks are becoming increasingly popular for a variety of applications. Users are frequently faced with the surprising discovery that readings produced by the sensing elements of their motes are often contaminated with outliers. Outlier readings can severely affect applications that rely on timely and reliable sensory data in order to provide the desired functionality. As a consequence, there is a recent trend to explore how techniques that identify outlier values based on their similarity to other readings in the network can be applied to sensory data cleaning. Unfortunately, most of these approaches incur an overwhelming communication overhead, which limits their practicality. In this paper we introduce an in-network outlier detection framework, based on locality sensitive hashing, extended with a novel boosting process as well as efficient load balancing and comparison pruning mechanisms. Our method trades off bandwidth for accuracy in a straightforward manner and supports many intuitive similarity metrics. Our experiments demonstrate that our framework can reliably identify outlier readings using a fraction of the bandwidth and energy that would otherwise be required.

► We present TACO, an outlier detection framework that trades bandwidth for accuracy in a straightforward manner and supports various similarity measures.
► We present an extensive theoretical study of the tradeoffs between bandwidth and accuracy during TACO's operation.
► We devise a boosting process that improves TACO's accuracy at no additional communication cost.
► We devise novel load balancing and comparison pruning mechanisms, which alleviate processing and communication load.
► We present a detailed experimental analysis of our techniques for a variety of data sets and parameter settings.
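
The idea of comparing compact hashes instead of raw readings can be illustrated with a small sketch. It is not TACO's actual protocol: the abstract does not say which hashing scheme or similarity metric is used, so the sketch assumes random hyperplane hashing for cosine similarity, and it leaves out the boosting, load balancing and comparison pruning steps. All names (`make_hyperplanes`, `lsh_signature`, `estimated_cosine`, `flag_outliers`, the mote labels, the thresholds) are hypothetical. Each mote would transmit an `n_bits` signature rather than its window of readings, so `n_bits` is the knob that trades bandwidth for accuracy.

```python
import math
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes; every node must use the same ones."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(readings, hyperplanes):
    """One bit per hyperplane: 1 if the reading vector lies on its non-negative side."""
    return [
        1 if sum(h_i * x_i for h_i, x_i in zip(h, readings)) >= 0 else 0
        for h in hyperplanes
    ]


def estimated_cosine(sig_a, sig_b):
    """Estimate cosine similarity from the fraction of differing signature bits."""
    hamming = sum(a != b for a, b in zip(sig_a, sig_b))
    angle = math.pi * hamming / len(sig_a)
    return math.cos(angle)


def flag_outliers(signatures, sim_threshold, min_support):
    """Flag nodes whose signature is similar to fewer than min_support other nodes."""
    flagged = []
    for name, sig in signatures.items():
        support = sum(
            1
            for other, other_sig in signatures.items()
            if other != name and estimated_cosine(sig, other_sig) >= sim_threshold
        )
        if support < min_support:
            flagged.append(name)
    return flagged


# Hypothetical mean-centered windows of 8 readings per mote (illustrative data only).
trend = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
noise = [0.3, -0.2, 0.1, 0.4, -0.3, 0.2, -0.1, 0.3]
windows = {
    "mote_a": trend,
    "mote_b": [2.0 * x for x in trend],             # same trend, larger amplitude
    "mote_c": [x + e for x, e in zip(trend, noise)],  # same trend, slightly noisy
    "mote_faulty": [-x for x in trend],             # reversed trend
}

# 256 bits per signature versus 8 floating-point readings per raw window.
planes = make_hyperplanes(dim=8, n_bits=256, seed=42)
signatures = {name: lsh_signature(w, planes) for name, w in windows.items()}
flagged = flag_outliers(signatures, sim_threshold=0.8, min_support=2)
print(flagged)
```

Fewer bits per signature mean less traffic but a noisier similarity estimate, so a borderline reading is more likely to be misclassified; more bits tighten the estimate at the cost of bandwidth.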

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence