Rizka Purwanto

Detecting Phishing Websites with AI

Abstract

Phishing is a key component of many cyber-attacks and is often used as the key step in advanced persistent threats. Despite the availability of public phishing detection toolbars and ongoing research on phishing detection systems, the number of attacks has increased in recent years because attackers continually change their attack models to bypass detection. Developing a phishing detection system that stays robust to the dynamic nature of phishing attacks therefore remains a challenge.

To tackle this issue, we propose a feature-free method for detecting phishing websites using the Normalised Compression Distance (NCD), which measures the similarity of websites by compressing them. This eliminates the need for feature extraction and for any reliance on a specific set of website features. The method examines the HTML source code of web pages and computes their similarity to known phishing websites. The study shows that content-based methods can detect phishing better than image-based methods because they are more robust to website design modifications. We propose using the Furthest Point First algorithm to extract phishing prototypes, that is, to select instances that are representative of a cluster of phishing web pages. We also introduce an incremental learning algorithm as a framework for continuous and adaptive detection that does not require new feature extraction when concept drift occurs. Finally, we develop a general architecture for a phishing detection system that can be deployed on an organisation’s email and web servers and on cloud services.
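
As a rough illustration of the two building blocks named in the abstract, the sketch below computes the NCD between two HTML documents and runs a greedy Furthest Point First selection over pairwise NCD values. It is a minimal sketch under stated assumptions, not the implementation presented in the talk: the choice of zlib as the compressor, the function names, and starting the selection from the first page are all assumptions made for illustration. It uses the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed size in bytes.

```python
import zlib
from typing import List, Sequence


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised Compression Distance between two byte strings.

    Values near 0 suggest similar content; values near 1 suggest
    unrelated content. With real compressors, ncd(x, x) is small
    but usually not exactly 0.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def furthest_point_first(pages: Sequence[bytes], k: int) -> List[int]:
    """Greedy Furthest Point First: return indices of up to k prototypes.

    Starts from index 0 (an arbitrary choice for this sketch) and
    repeatedly adds the page whose distance to its nearest
    already-chosen prototype is largest.
    """
    if not pages or k <= 0:
        return []
    chosen = [0]
    # nearest[i] = distance from page i to its closest chosen prototype
    nearest = [ncd(p, pages[0]) for p in pages]
    nearest[0] = float("-inf")  # never re-select a chosen page
    while len(chosen) < min(k, len(pages)):
        nxt = max(range(len(pages)), key=lambda i: nearest[i])
        chosen.append(nxt)
        for i, p in enumerate(pages):
            nearest[i] = min(nearest[i], ncd(p, pages[nxt]))
        nearest[nxt] = float("-inf")
    return chosen
```

Furthest Point First needs only pairwise distances, so it can work directly on NCD values without turning pages into feature vectors.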

Bio

Rizka Purwanto is currently a PhD student at the School of Computer Science and Engineering at the University of New South Wales (UNSW) and a recipient of the UNSW University International Postgraduate Award (UIPA) scholarship. Her research interests include cybersecurity and artificial intelligence, specifically how to make the Web more secure using machine learning and deep learning methods. Prior to her PhD studies, she worked as a software engineer at Sense Health and Life Trading Sydney.