Authors:
D. Ujwal, Manjula Sanjay Koti, Rejwan Bin Sulaiman
Addresses:
Department of Computer Applications, Dayananda Sagar Academy of Technology and Management, Bangalore, Karnataka, India. Department of Computer Science and Technology, Northumbria University, Newcastle upon Tyne, England, United Kingdom.
Abstract:
Cyberbullying occurs frequently online and needs to be reliably identified and addressed. This research develops and tests a machine learning method for Cyberbullying Identification (CI) based on gradient boosting. The model was trained on Kaggle's "Social Media Cyberbullying Corpus," which contains thousands of labelled web postings, to distinguish cyberbullying from regular web activity. The main programming tools are the Python Scikit-learn library for building and running the model, Pandas for data manipulation, and NLTK for text processing. After extensive training and hyperparameter fine-tuning, the model achieved 80% accuracy. To make the model usable in practice, a web application built with the Flask framework detects cyberbullying in real time from text input. The paper's contribution is an empirically grounded, practical tool for detecting abusive online behaviour, enabling earlier intervention by parties such as social media companies and teachers. Future work will assess the model's scalability across platforms and languages and extend it to other online contexts, given the dynamic and multidimensional nature of cyberbullying.
Keywords: Cyberbullying Identification (CI); Gradient Booster; Hyperparameter Fine-Tuning; Flask Framework; Machine Learning; Cyberbullying Detection; Natural Language Toolkit (NLTK).
Received: 26/07/2024, Revised: 10/09/2024, Accepted: 22/10/2024, Published: 03/06/2025
DOI: 10.64091/ATICL.2025.000150
AVE Trends in Intelligent Computer Letters, 2025, Vol. 1, No. 2, Pages: 95-103