Authors:
J. Angelin Jeba, S. Rubin Bose, O. Jeba Singh, R. Regin, S. Suman Rajest, Aditya Rautaray
Addresses:
Department of Electronics and Communication Engineering, S.A. Engineering College, Chennai, Tamil Nadu, India. School of Computer Science and Engineering, SRM Institute of Science and Technology, Ramapuram, Chennai, Tamil Nadu, India. Centre for Academic Research, Alliance University, Bengaluru, Karnataka, India. Department of Research and Development, Dhaanish Ahmed College of Engineering, Chennai, Tamil Nadu, India. Department of Cloud Solutions Security, CVS Healthcare, Ashburn, Virginia, United States of America.
Abstract:
This study proposes a machine learning–based framework for predicting heart disease using the UCI Heart Disease dataset. The objective is to develop an efficient predictive model that helps healthcare professionals identify at-risk individuals early. The dataset, consisting of 303 patient records and 14 attributes, undergoes preprocessing steps, including normalization, categorical encoding, and feature selection, to enhance model performance. Four machine learning algorithms (Logistic Regression, Random Forest, Support Vector Machines (SVMs), and Artificial Neural Networks) are employed to classify patients into risk categories. The proposed methodology incorporates cross-validation and hyperparameter tuning to ensure robustness and reduce overfitting. Experimental evaluation is conducted using performance metrics such as accuracy, precision, recall, F1-score, and ROC-AUC. The results demonstrate that Random Forest achieves the highest overall accuracy of 89%, outperforming traditional classifiers, while Neural Networks provide strong generalization capabilities with competitive results. A comparative analysis highlights the effectiveness of ensemble learning methods for handling nonlinear relationships in clinical datasets. This research underscores the potential of integrating machine learning models into clinical decision support systems to improve the reliability and speed of diagnosis. Future work includes incorporating larger, real-time datasets and extending deep learning architectures to improve predictive accuracy.
Keywords: Heart Disease; Logistic Regression; Artificial Neural Network; Support Vector Machine; Random Forests; Cardiovascular Disorders; Traditional Classifiers.
Received on: 09/04/2025, Revised on: 25/05/2025, Accepted on: 27/07/2025, Published on: 01/03/2026
DOI: 10.64091/ATIIR.2026.000278
Ave Trends in Intelligent Informatics Reports, 2026, Vol. 1, No. 1, Pages: 24–36
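
The evaluation metrics named in the abstract (accuracy, precision, recall, F1-score, and ROC-AUC) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, labels are assumed to be binary (1 = disease present, 0 = absent), and ROC-AUC is computed here by direct pairwise comparison of scores, which is only practical for small datasets of a few hundred records.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; tied pairs count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up labels and scores for illustration only; not data from the study.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas ROC-AUC is computed from the raw scores and is threshold-independent, which is one reason the two kinds of metric are commonly reported together.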