Abstract:
This study evaluates how well machine learning algorithms predict default among non-financial firms listed in Pakistan, with a focus on selecting efficient financial predictors. Using a dataset of 71 financial ratios for 396 firms over a 24-year period (2000–2023), it compares ten classification models, including Logistic Regression, K-Nearest Neighbors, Naive Bayes, Decision Trees, Support Vector Machines, Artificial Neural Networks, XGBoost, LightGBM, and CatBoost. Profitability and liquidity ratios emerge as the most influential indicators of default risk. LightGBM achieves the highest accuracy (89.70%), followed closely by CatBoost (89.49%) and XGBoost (89.22%). These findings are relevant to financial professionals and policymakers, underscoring the importance of targeted financial metrics and advanced analytics in credit risk assessment. The study contributes to the literature by comparing a diverse set of algorithms in an emerging-market context, thereby strengthening the basis for financial decision-making and risk mitigation.
Keywords: Feature Selection, Machine Learning, Default Prediction, Risk Assessment, Binary Classification
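
To make the comparative setup concrete, the following is a minimal sketch of how classifiers can be ranked by held-out accuracy on a binary default label. It is not the study's pipeline: the data are synthetic, the two features (`roa`, `current_ratio`) are hypothetical stand-ins for a profitability and a liquidity ratio, and only two of the simpler model families named in the abstract (K-Nearest Neighbors and Naive Bayes) are implemented, using the Python standard library alone.

```python
import math
import random


def make_synthetic_firms(n, seed=0):
    """Generate illustrative (features, label) data; label 1 means default.

    The feature names and distributions are assumptions for this sketch,
    not values taken from the study's dataset.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 4 == 0 else 0  # arbitrary 25% default share
        if label == 1:
            roa = rng.gauss(-0.05, 0.05)           # hypothetical profitability ratio
            current_ratio = rng.gauss(0.9, 0.3)    # hypothetical liquidity ratio
        else:
            roa = rng.gauss(0.10, 0.05)
            current_ratio = rng.gauss(1.8, 0.3)
        X.append((roa, current_ratio))
        y.append(label)
    return X, y


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    params = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        stats = []
        for j in range(len(X[0])):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        params[c] = (len(rows) / len(X), stats)
    return params


def nb_predict(params, x):
    """Pick the class with the larger Gaussian log-posterior."""
    def log_posterior(c):
        prior, stats = params[c]
        total = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total
    return max(params, key=log_posterior)


# Simple holdout split: first 600 synthetic firms for training, last 200 for testing.
X, y = make_synthetic_firms(800)
train_X, train_y = X[:600], y[:600]
test_X, test_y = X[600:], y[600:]

nb_params = fit_gaussian_nb(train_X, train_y)
results = {
    "K-Nearest Neighbors": accuracy(test_y, [knn_predict(train_X, train_y, x) for x in test_X]),
    "Naive Bayes": accuracy(test_y, [nb_predict(nb_params, x) for x in test_X]),
}

# Rank the models by held-out accuracy, highest first.
for name, acc in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {acc:.2%}")
```

The same ranking loop extends to any number of models, provided each is evaluated on the same held-out observations. The accuracies printed here reflect only the synthetic data and say nothing about the figures reported in the abstract.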