FORA: An OWO based framework for finding outliers in web usage mining

Article ID	Journal	Published Year	Pages	File Type
10151499	Information Fusion	2019	30 Pages	PDF

Abstract

Handling outliers is one of the primary concerns of today's data mining techniques. The concept of outliers, their handling, and their diagnosis are context specific and vary according to the field of application. Outliers are inevitable when mining web data because of the unique characteristics exhibited by a typical web user. Because the output of a regression algorithm always differs from the actual value, it is a challenge for knowledge workers and researchers to decide what counts as an outlier in such cases. In this paper, we develop the concept of an outlier with respect to regression analysis of any Web-based dataset. A framework for finding outliers in the output of a regression algorithm is formulated with the help of Ordered Weighted operators. The underlying idea is to find an error rectification value, ϵ, that works in association with the predicted value from the regression model to distinguish an outlier. In addition, it provides a possible range of deviation from the predicted output. A case study on a web dataset shows the usefulness of the proposed approach.
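
The abstract does not say how the error rectification value is computed, so the following is only a minimal sketch of the general idea under stated assumptions, not the FORA method itself. It assumes the value is obtained by applying an ordered weighted averaging (OWA) operator to the absolute residuals of a regression model, and that an observation is flagged when its residual exceeds that value. The function names `owa` and `flag_outliers`, the weight vector, and the sample numbers are all hypothetical.

```python
from typing import List, Sequence, Tuple


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: sort values in descending order,
    then take the weighted sum with position-based weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def flag_outliers(
    actual: Sequence[float],
    predicted: Sequence[float],
    weights: Sequence[float],
) -> Tuple[float, List[bool]]:
    """Assumed rule (not taken from the paper): epsilon is the OWA of the
    absolute residuals; a point is an outlier if its residual exceeds it."""
    residuals = [abs(a - p) for a, p in zip(actual, predicted)]
    epsilon = owa(residuals, weights)
    flags = [r > epsilon for r in residuals]
    return epsilon, flags


# Illustrative, made-up data: four observations and their predictions.
actual = [10.0, 12.0, 11.0, 30.0]
predicted = [11.0, 11.0, 12.0, 12.0]
uniform_weights = [0.25, 0.25, 0.25, 0.25]  # OWA reduces to the mean here

epsilon, flags = flag_outliers(actual, predicted, uniform_weights)
# Possible range of deviation around each prediction under this assumption.
ranges = [(p - epsilon, p + epsilon) for p in predicted]
```

With uniform weights the OWA operator reduces to the arithmetic mean of the residuals; shifting weight toward the first positions emphasizes the largest residuals and raises the threshold, while shifting it toward the last positions lowers it. Under this assumed rule, the interval from the predicted value minus the threshold to the predicted value plus the threshold plays the role of the range of deviation mentioned in the abstract.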

Keywords

Outliers; Regression; Fuzzy logic; Business intelligence; World Wide Web