Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
552369 | Decision Support Systems | 2016 | 10 Pages |
• Exploit URL patterns based on the proposed N-gram model to extract identity keywords
• Attain robustness in detecting phishing webpages hosted in any language
• Offer long-term effectiveness by leveraging a permanent phishing characteristic
• Achieve higher accuracy in finding the target identity by using compromise programming
• Suppress false positives by exploiting indirect identity relationships
This paper proposes a phishing detection technique based on the difference between the target and actual identities of a webpage. The proposed approach, called PhishWHO, is divided into three phases. The first phase extracts identity keywords from the textual contents of the website, using a novel weighted URL token system based on the N-gram model. The second phase uses a search engine to find candidate target domain names and selects the target domain name based on identity-relevant features. The final phase applies a 3-tier identity matching system to determine the legitimacy of the query webpage. The overall experimental results suggest that the proposed system outperforms the conventional phishing detection methods considered.
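To make the first and last phases more concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract does not specify the weighting formula, the identity-relevant features, or the three matching tiers, so everything below is an assumption made for illustration: the function names (`url_tokens`, `char_ngrams`, `weighted_keywords`, `naive_domain`, `identities_differ`), the weight (frequency in the page text times N-gram length), and the single direct domain comparison standing in for the 3-tier system. The search-engine phase is omitted, and the target domain is simply passed in as a parameter.

```python
from __future__ import annotations

import re
from collections import Counter
from urllib.parse import urlparse


def url_tokens(url: str) -> list[str]:
    """Split a URL's hostname and path into lowercase alphanumeric tokens."""
    parsed = urlparse(url)
    raw = ((parsed.hostname or "") + parsed.path).lower()
    return [t for t in re.split(r"[^a-z0-9]+", raw) if t]


def char_ngrams(token: str, n_min: int = 3) -> set[str]:
    """Return every contiguous substring of the token with length >= n_min."""
    return {
        token[i:j]
        for i in range(len(token))
        for j in range(i + n_min, len(token) + 1)
    }


def weighted_keywords(url: str, page_text: str, top_k: int = 5) -> list[tuple[str, int]]:
    """Score URL N-grams that also occur as words in the page text.

    Assumed weight (not taken from the paper): word frequency * N-gram length.
    """
    words = Counter(re.findall(r"[a-z0-9]+", page_text.lower()))
    scores: Counter[str] = Counter()
    for token in url_tokens(url):
        for gram in char_ngrams(token):
            if gram in words:
                scores[gram] += words[gram] * len(gram)
    return scores.most_common(top_k)


def naive_domain(url: str) -> str:
    """Crude registered-domain guess: the last two hostname labels.

    This is wrong for suffixes such as co.uk; a real system would need a
    public-suffix list.
    """
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def identities_differ(query_url: str, target_domain: str) -> bool:
    """Flag the page when its actual domain differs from the target domain."""
    return naive_domain(query_url) != target_domain.lower()


if __name__ == "__main__":
    url = "http://paypa1-secure.example.com/paypal/login"
    text = "Welcome to PayPal. Log in to your PayPal account."
    print(weighted_keywords(url, text))
    print(identities_differ(url, "paypal.com"))
```

In this hypothetical example the URL token `paypal` also appears in the page text, so it surfaces as the strongest identity keyword, while the page's actual domain (`example.com`) does not match the assumed target domain (`paypal.com`), which is the kind of identity mismatch the abstract describes.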