Ensemble-Based Phishing Website Detection Using Extra Trees Classifier

Authors:
M. Arjun Raj, M. A. Thinesh, S. S. Mukhil Varmann, Avinash Reddy Pothu, P. Paramasivan

Addresses:
Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ramapuram, Chennai, Tamil Nadu, India, am0306@srmist.edu.in, tm9045@srmist.edu.in, sm7225@srmist.edu.in. Department of Research and Development, Ginger Labs, Texas, United States of America, Reddy0656@outlook.com. Department of Research and Development, Dhaanish Ahmed College of Engineering, Chennai, Tamil Nadu, India, paramasivanchem@gmail.com.

Abstract:

Phishing is an attack in which an adversary tricks victims into revealing usernames, passwords, credit card numbers, or other personal information, and it puts internet users at risk of data theft. Protecting against it requires effective detection. Machine learning applies data-driven algorithms to find the patterns and anomalies that characterize phishing attempts. This study investigates ensemble-based phishing website detection using the Extra Trees Classifier. The proposed model is trained and evaluated on a labelled dataset from Kaggle that contains 89 URL-based, content-based, network-based, and statistical attributes. These features help the model distinguish phishing websites from legitimate ones. The features are first explored and visualized to understand the data distribution and the relationships among features. After visualization, the dataset is split into training and testing sets, which are used to train and test the model. The Extra Trees Classifier achieves 96.68% accuracy, 97.65% precision, 95.58% recall, and a 96.6% F1 score. The project thus presents a machine-learning-based method for protecting online users from phishing. It was developed using Google Colab.
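
The four evaluation metrics quoted in the abstract follow standard definitions, which can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function names and the confusion-matrix counts in the usage lines are hypothetical, and only the precision and recall figures are taken from the abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives, with "positive"
    taken here to mean "phishing" (an assumption for this sketch).
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)   # share of flagged sites that really are phishing
    recall = tp / (tp + fn)      # share of phishing sites that were flagged
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1


def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to exercise the function.
example = classification_metrics(tp=90, fp=10, tn=85, fn=15)

# Consistency check on the figures reported in the abstract (in percent):
# the harmonic mean of 97.65 and 95.58 is about 96.60.
reported_f1 = f1_from(97.65, 95.58)
print(round(reported_f1, 2))
```

The last two lines show that the reported 96.6% F1 score is consistent with the reported precision and recall.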

Keywords: Machine Learning; URL-Based Features; Extra Trees Classifier; Ensemble Learning; Phishing Detection; Social Media; Training and Evaluating.

Received on: 17/02/2024, Revised on: 07/04/2024, Accepted on: 03/06/2024, Published on: 01/09/2024

AVE Trends in Intelligent Computing Systems, 2024 Vol. 1 No. 3, Pages: 142-156
