A data mining framework for detecting subscription fraud in telecommunication

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
381483	1437477	2011	13 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Fraud detection - تشخیص تقلب Data mining - داده‌کاوی Decision tree - درخت تصمیم Neural networks - شبکه های عصبی Support vector machines - ماشین بردار پشتیبانی Ensembles - گروه ها

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی

پیش نمایش صفحه اول مقاله

A data mining framework for detecting subscription fraud in telecommunication

چکیده انگلیسی

Service providing companies including telecommunication companies often receive substantial damage from customers’ fraudulent behaviors. One of the common types of fraud is subscription fraud in which usage type is in contradiction with subscription type. This study aimed at identifying customers’ subscription fraud by employing data mining techniques and adopting knowledge discovery process. To this end, a hybrid approach consisting of preprocessing, clustering, and classification phases was applied, and appropriate tools were employed commensurate to each phase. Specifically, in the clustering phase SOM and K-means were combined, and in the classification phase decision tree (C4.5), neural networks, and support vector machines as single classifiers and bagging, boosting, stacking, majority and consensus voting as ensembles were examined. In addition to using clustering to identify outlier cases, it was also possible – by defining new features – to maintain the results of clustering phase for the classification phase. This, in turn, contributed to better classification results. A real dataset provided by Telecommunication Company of Tehran was applied to demonstrate the effectiveness of the proposed method. The efficient use of synergy among these techniques significantly increased prediction accuracy. The performance of all single and ensemble classifiers is evaluated based on various metrics and compared by statistical tests. The results showed that support vector machines among single classifiers and boosted trees among all classifiers have the best performance in terms of various metrics. The research findings show that the proposed model has a high accuracy, and the resulting outcomes are significant both theoretically and practically.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Engineering Applications of Artificial Intelligence - Volume 24, Issue 1, February 2011, Pages 182–194

نویسندگان

Hamid Farvaresh, Mohammad Mehdi Sepehri,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

A data mining framework for detecting subscription fraud in telecommunication

دسترسی سریع

ارتباط

English Website