Predictive Data Analysis in Forecasting Patient Health Outcomes Using Machine Learning Algorithms
Ademola Oluwaseun Salako*
Sam Houston State University, 1905 University Ave, Huntsville, TX 77340, United States of America.
*Author to whom correspondence should be addressed.
Abstract
This study investigates the effectiveness of machine learning algorithms, particularly an optimized eXtreme Gradient Boosting (XGBoost) model, in predicting 30-day ICU readmissions using the MIMIC-III dataset, which comprises over 40,000 critical care records. Using a quantitative approach, supervised learning models were developed, tuned via grid search with five-fold cross-validation, evaluated, and compared. SHapley Additive exPlanations (SHAP) were used to interpret the models and identify key predictors. The optimized XGBoost model achieved an AUC-ROC of 0.892 and an F1-score of 0.826, outperforming Logistic Regression and Random Forest. Integration simulations yielded a latency of 3.2 seconds, an 89% success rate, and a workflow disruption index of 0.18, supporting the model's potential for real-time deployment. Recommendations include enforcing interpretability standards, enhancing EHR interoperability, promoting clinician algorithm literacy, and ensuring dataset representativeness for predictive equity. These findings highlight the pivotal role of interpretable AI in supporting proactive, equitable, and data-driven clinical decision-making.
Keywords: Predictive analytics, ICU readmission, XGBoost, SHAP, MIMIC-III
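The tuning procedure named in the abstract, grid search with five-fold cross-validation, can be illustrated with a minimal sketch. This is not the study's implementation: a trivial threshold classifier stands in for XGBoost, the data are a toy example, the folds are contiguous and unshuffled, and every name below (`k_fold_indices`, `grid_search_cv`, `fit_threshold`, `predict_threshold`) is hypothetical. The mean F1-score across folds is used as the selection criterion purely as an assumption.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, val
        start = stop


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def grid_search_cv(x, y, param_grid, fit, predict, k=5):
    """Return the parameter combination with the highest mean validation F1."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for train, val in k_fold_indices(len(x), k):
            model = fit([x[i] for i in train], [y[i] for i in train], **params)
            preds = predict(model, [x[i] for i in val])
            fold_scores.append(f1_score([y[i] for i in val], preds))
        score = mean(fold_scores)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Stand-in model (assumption): predicts 1 when the feature reaches a threshold.
def fit_threshold(x_train, y_train, threshold):
    return threshold  # nothing is learned; the hyperparameter is the model


def predict_threshold(model, x_val):
    return [1 if value >= model else 0 for value in x_val]


# Toy data (assumption): one integer risk score per record, label 1 = readmitted.
x = list(range(10))
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

best_params, best_score = grid_search_cv(
    x, y, {"threshold": [3, 5, 7]}, fit_threshold, predict_threshold, k=5
)
print(best_params, round(best_score, 3))
```

In practice the stand-in `fit` and `predict` functions would be replaced by a real learner, and the folds would normally be shuffled and stratified so that each validation fold contains both classes.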