An Optimized Machine Learning Framework for Financial Fraud Detection Using SMOTE and Feature Selection

Dalia Abdulrahim Mokheef Aljabri

Authors

Dalia Abdulrahim Mokheef Aljabri, Department of Mathematics, College of Basic Education, University of Babylon, Babil, Iraq.

Keywords:

Machine Learning, Credit Card Fraud Detection, Random Forest, SMOTE, Feature Selection

Abstract

Financial fraud detection remains a challenging task because fraudulent transactions represent only a very small portion of financial data, making traditional classification approaches less reliable. This study presents an optimized machine learning framework that combines feature scaling, feature selection, and the Synthetic Minority Oversampling Technique (SMOTE) within a unified experimental pipeline to address class imbalance and improve fraud detection performance. Three supervised learning models, namely Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN), were evaluated using a publicly available credit card transaction dataset containing 284,807 records, including 492 fraudulent transactions (0.17%). Model performance was assessed using Accuracy, Precision, Recall, F1-score, and ROC-AUC. The experimental results showed that the Random Forest model achieved the best overall performance, with an accuracy of 99.96%, precision of 97%, recall of 91%, F1-score of 94%, and a ROC-AUC value of 0.98. The results also indicate that combining SMOTE with feature selection improves the ability of the models to identify minority fraudulent transactions while reducing the tendency toward majority-class predictions. These findings highlight the practical value of integrating preprocessing optimization techniques with machine learning methods to support more reliable fraud detection systems in modern financial environments.
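
The figures quoted in the abstract can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch, not code from the study: the function and constant names are illustrative assumptions, and the "all-legitimate" baseline is a hypothetical classifier used only to show why accuracy alone is a weak measure on data this imbalanced.

```python
# Figures quoted in the abstract.
TOTAL_TRANSACTIONS = 284_807
FRAUD_TRANSACTIONS = 492
REPORTED_PRECISION = 0.97
REPORTED_RECALL = 0.91


def fraud_rate(fraud: int, total: int) -> float:
    """Share of all transactions that are fraudulent."""
    return fraud / total


def majority_baseline_accuracy(fraud: int, total: int) -> float:
    """Accuracy of a hypothetical classifier that labels every
    transaction as legitimate (it is wrong only on the fraud cases)."""
    return (total - fraud) / total


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


rate = fraud_rate(FRAUD_TRANSACTIONS, TOTAL_TRANSACTIONS)
baseline = majority_baseline_accuracy(FRAUD_TRANSACTIONS, TOTAL_TRANSACTIONS)
f1 = f1_score(REPORTED_PRECISION, REPORTED_RECALL)

print(f"Fraud rate: {rate:.2%}")                          # 0.17%
print(f"All-legitimate baseline accuracy: {baseline:.2%}")  # 99.83%
print(f"F1 from reported precision/recall: {f1:.2f}")     # 0.94
```

The hypothetical baseline already reaches about 99.83% accuracy while detecting no fraud at all, which is why Precision, Recall, F1-score, and ROC-AUC are reported alongside Accuracy. The F1 value recomputed from the reported precision and recall rounds to 0.94, in agreement with the abstract.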