Machine Learning Models for Predicting Parkinson’s Disease Progression Using Longitudinal Data: A Systematic Review
Oluwabamise J. Adeniyi *
Computer Science Department, Babcock University, Ilishan-Remo, Ogun State, Nigeria.
Folasade Y. Ayankoya
Computer Science Department, Babcock University, Ilishan-Remo, Ogun State, Nigeria.
Shade O. Kuyoro
Computer Science Department, Babcock University, Ilishan-Remo, Ogun State, Nigeria.
Bright G. Akwaronwu
Computer Science Department, Babcock University, Ilishan-Remo, Ogun State, Nigeria.
Ayodeji G. Abiodun
Computer Science Department, Babcock University, Ilishan-Remo, Ogun State, Nigeria.
*Author to whom correspondence should be addressed.
Abstract
This systematic review evaluates the effectiveness of machine learning (ML) models in predicting Parkinson’s disease (PD) progression from longitudinal data. Despite the increasing use of ML in PD research, gaps remain in understanding how longitudinal data affect prediction accuracy and model generalizability. The review addresses these gaps by examining how multimodal data sources, including clinical, genetic, and imaging datasets, contribute to predictive performance, with a focus on the types of models used, data sources, performance metrics, and the potential to improve personalized treatment and clinical decision-making. A comprehensive literature search was conducted across Scopus, PubMed, Google Scholar, and ResearchGate to identify relevant studies published from January 2010 to February 2024. The inclusion criteria targeted studies employing ML techniques to analyze longitudinal PD data, yielding 14 eligible studies. Data were extracted on the ML models used, dataset characteristics, performance metrics, and the integration of multimodal data sources. The findings were synthesized to assess model performance and generalizability. Long Short-Term Memory (LSTM) networks and ensemble methods such as Random Forest and Light Gradient Boosting Machine (LGBM) were effective in capturing disease progression. LSTM models achieved accuracies of up to 90% and AUC scores of 93.79%, while LGBM models achieved an accuracy of 90.73% and an AUC of 94.57%. In longitudinal studies, Matthews Correlation Coefficient (MCC) scores increased over time, and Mean Absolute Error (MAE) also improved. Integrating multimodal data further improved model reliability and generalizability. ML models, particularly those incorporating longitudinal and multimodal data, show promise in predicting PD progression. Future research should prioritize dataset diversity, enhance model interpretability, and leverage real-world wearable data for improved clinical applicability.
Keywords: Parkinson’s disease, machine learning, longitudinal data, disease progression, predictive modeling