Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4942941 | 1437615 | 2018 | 12-page PDF | Free download |
- Manifold learning techniques are used to construct improved image background models.
- Random uniform sampling of disjoint image neighborhoods yields a background sample.
- The detection statistic is the distance of the remaining neighborhoods from the background manifold.
- Performance versus parameters like kernel bandwidth and sampling density is tested.
- Kernel PCA beats diffusion map and benchmark RX on maritime anomaly detection task.
Appropriately identifying outlier data is a critical requirement in the decision-making process of many expert and intelligent systems deployed in a variety of fields including finance, medicine, and defense. Classical outlier detection schemes typically rely on the assumption that the normal/background data of interest are distributed according to an assumed statistical model, and they search for data that deviate from that assumption. However, performance is frequently reduced because the underlying distribution does not follow the assumed model. Manifold learning techniques offer improved performance by learning better models of the background, but they can be too computationally expensive because they require a distance measure to be calculated between all pairs of data points. Here, we study a general framework that allows manifold learning techniques to be used for unsupervised anomaly detection by reducing the computational expense via uniform random sampling of a small fraction of the data. A background manifold is learned from the sample, and an out-of-sample extension is then used to project the unsampled data into the learned manifold space and to construct an anomaly detection statistic based on the prediction error of the learned manifold. The method works well for unsupervised anomaly detection because, by definition, the ratio of anomalous to non-anomalous data points is small, so the sample will be dominated by background points. However, the framework introduces a variety of parameters that affect detection performance, so we use a low-dimensional toy problem to investigate their effect on the performance of four learning algorithms (kernel PCA, two versions of diffusion map, and the Parzen density estimator). We then apply the methods to the detection of watercraft in an ensemble of 22 infrared maritime scenes, where we find kernel PCA to be superior and show that it outperforms a commonly employed baseline algorithm.
The framework is not limited to the tested image processing example and can be used for any unsupervised anomaly detection task.
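The sample-then-extend idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian (RBF) kernel, takes the kernel PCA reconstruction error in feature space as the "prediction error" statistic, and uses a hypothetical 2-D toy data set (a noisy ring as background, a few points near the center as anomalies). The function names `rbf_kernel`, `fit_background`, and `anomaly_score`, and the values of `sigma`, `n_components`, and the 10% sampling fraction, are illustrative choices and do not come from the paper.

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def fit_background(sample, sigma, n_components):
    """Kernel PCA on the sampled points only (assumed background model)."""
    K = rbf_kernel(sample, sample, sigma)
    col_mean = K.mean(axis=0)
    total_mean = K.mean()
    # Center the kernel matrix in feature space.
    K_centered = K - col_mean[None, :] - col_mean[:, None] + total_mean
    eigvals, eigvecs = np.linalg.eigh(K_centered)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    # Scale eigenvectors so each principal axis has unit norm in feature space.
    alphas = eigvecs[:, top] / np.sqrt(eigvals[top])
    return {"sample": sample, "sigma": sigma, "col_mean": col_mean,
            "total_mean": total_mean, "alphas": alphas}


def anomaly_score(points, model):
    """Out-of-sample extension: feature-space reconstruction error per point."""
    Kz = rbf_kernel(points, model["sample"], model["sigma"])
    kz_mean = Kz.mean(axis=1)
    Kz_centered = Kz - kz_mean[:, None] - model["col_mean"][None, :] + model["total_mean"]
    projections = Kz_centered @ model["alphas"]
    # Squared norm of the centered feature vector; k(z, z) = 1 for the RBF kernel.
    feature_norm_sq = 1.0 - 2.0 * kz_mean + model["total_mean"]
    return feature_norm_sq - (projections**2).sum(axis=1)


# Hypothetical toy data: noisy unit ring (background) plus 10 central anomalies.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=2000)
background = np.column_stack([np.cos(angles), np.sin(angles)])
background += rng.normal(scale=0.05, size=background.shape)
anomalies = rng.uniform(-0.2, 0.2, size=(10, 2))
data = np.vstack([background, anomalies])
is_anomaly = np.arange(len(data)) >= len(background)

# Uniform random sample of 10% of the data; labels are never used for fitting.
sample_idx = rng.choice(len(data), size=len(data) // 10, replace=False)
rest = np.setdiff1d(np.arange(len(data)), sample_idx)

model = fit_background(data[sample_idx], sigma=0.3, n_components=10)
scores = anomaly_score(data[rest], model)  # statistic for the unsampled points
```

Only the sample-by-sample kernel matrix is decomposed, so the cost of the eigenproblem depends on the sample size rather than on the full data size; each remaining point then needs only its kernel values against the sample. Thresholding `scores` would give a detector, with the threshold, kernel bandwidth, and number of components left as tuning parameters.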
Journal: Expert Systems with Applications - Volume 91, January 2018, Pages 374-385