Article code: 455953
Journal code: 695609
Publication year: 2013
English article: 17 pages, PDF
Article title (ISI)
Phishing detection and impersonated entity discovery using Conditional Random Field and Latent Dirichlet Allocation
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications
Abstract

Phishing is an attempt to steal users' personal and financial information, such as passwords, social security numbers, and credit card numbers, via electronic communication such as e-mail and other messaging services. Attackers pretend to be from a legitimate organization and direct users to a fake website that resembles a legitimate website, which is then used to collect users' personal information. In this paper, we propose a novel methodology to detect phishing attacks and to discover the entity/organization that the attackers impersonate during phishing attacks. The proposed multi-stage methodology employs natural language processing and machine learning. The methodology first discovers (i) named entities, which include names of people, organizations, and locations; and (ii) hidden topics, using (a) Conditional Random Field (CRF) and (b) Latent Dirichlet Allocation (LDA) operating on both phishing and non-phishing data. Utilizing topics and named entities as features, the next stage classifies each message as phishing or non-phishing using AdaBoost. For messages classified as phishing, the final stage discovers the impersonated entity using CRF. Experimental results show that the phishing classifier detects phishing attacks with no misclassification when the proportion of phishing emails is less than 20%. The F-measure obtained was 100%. Our approach also discovers the impersonated entity from messages that are classified as phishing, with a discovery rate of 88.1%. Automatic discovery of the impersonated entity from phishing messages helps the legitimate organization take down the offending phishing site. This protects its users from falling for phishing attacks, which in turn leads to satisfied customers. Automatic discovery of an impersonated entity also helps email service providers collaborate with each other to exchange attack information and protect their customers.
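
To make the reported F-measure concrete, the following minimal sketch shows how the score relates to classification errors. It assumes the abstract refers to the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_measure` and all counts are hypothetical and are not taken from the paper's experiments.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1) from confusion-matrix counts.

    tp: phishing messages correctly flagged as phishing
    fp: legitimate messages wrongly flagged as phishing
    fn: phishing messages wrongly passed as legitimate
    """
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
# No misclassification: precision and recall are both 1, so F1 is 1 (100%).
perfect = f_measure(tp=200, fp=0, fn=0)

# A few errors of either kind pull the score below 100%.
imperfect = f_measure(tp=190, fp=5, fn=10)

print(f"perfect:   {perfect:.1%}")
print(f"imperfect: {imperfect:.1%}")
```

Under this assumption, an F-measure of 100% is equivalent to zero false positives and zero false negatives, which matches the abstract's statement that no message was misclassified.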

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Security - Volume 34, May 2013, Pages 123–139