Explainable AI for Breast Cancer Diagnosis: Comparative Analysis of ML Models Using Random Forest Feature Selection and SHAP Interpretability

John Kamwele Mutinda; Tecla Mutave Kyalo; Joyce Akhalakwa Mukolwe; Jackson Ndoto Munyao; Millicent Auma Omondi; Wycliffe Nzoli Nzomo; Titus Mutua Kioko; David Chepkonga; Samuel Kipsang Kaptum; Erick Munala Sifuna; Amos Kipkorir Langat

doi:10.9734/ajrcos/2025/v18i10762

Published: 2025-10-03

Page: 30-46

Issue: 2025 - Volume 18 [Issue 10]

John Kamwele Mutinda *

African Institute for Mathematical Sciences, Senegal.

Tecla Mutave Kyalo

African Institute for Mathematical Sciences, Cameroon.

Joyce Akhalakwa Mukolwe

African Institute for Mathematical Sciences, Cameroon.

Jackson Ndoto Munyao

African Institute for Mathematical Sciences, Cameroon.

Millicent Auma Omondi

African Institute for Mathematical Sciences, Cameroon.

Wycliffe Nzoli Nzomo

African Institute for Mathematical Sciences, Senegal.

Titus Mutua Kioko

University of Embu, Kenya.

David Chepkonga

Jomo Kenyatta University of Agriculture and Technology, Kenya.

Samuel Kipsang Kaptum

Jomo Kenyatta University of Agriculture and Technology, Kenya.

Erick Munala Sifuna

Jomo Kenyatta University of Agriculture and Technology, Kenya.

Amos Kipkorir Langat

Jomo Kenyatta University of Agriculture and Technology, Kenya.

*Author to whom correspondence should be addressed.

Abstract

Breast cancer diagnosis is critical for improving patient outcomes, yet traditional diagnostic methods are limited by invasiveness and human error. This study presents an explainable AI framework for breast cancer classification that compares six machine learning (ML) models: logistic regression (LR), naive Bayes (NB), k-nearest neighbors (KNN), random forest (RF), support vector classifier (SVC), and decision tree (DT). The Synthetic Minority Over-sampling Technique (SMOTE) addresses class imbalance, while RF-based feature selection reduces dimensionality from 30 to 19 features. SHapley Additive exPlanations (SHAP) are integrated to provide clinical insight into feature contributions, enhancing trust in model predictions. The SVC model trained on the RF-selected features performs best, with an accuracy of 0.9930 and a recall of 1.0000, and the analysis highlights the importance of features such as smoothness mean. The framework balances accuracy, efficiency, and transparency, offering a foundation for clinical deployment and guiding future work on external validation and broader adoption of explainable ML in breast cancer care.
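
The abstract names SMOTE as the remedy for class imbalance but does not describe its configuration. As a rough illustration of the underlying idea only, the following minimal Python sketch creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The function name `smote_sketch`, its parameters, and the toy data are hypothetical and are not taken from the study, which presumably relied on a library implementation.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built SMOTE-style from minority samples.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the connecting segment.
        t = rng.random()
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data (not from the study): four minority samples, two features.
minority_toy = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(minority_toy, n_new=6, k=3)
print(len(new_points))
```

Because each synthetic point lies on a segment between two real minority samples, it stays inside the region the minority class already occupies. In practice, oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the evaluation data; the abstract does not state how the study handled this.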

Keywords: Breast cancer diagnosis, machine learning, random forest, SHAP, feature selection, explainable AI

How to Cite

Mutinda, John Kamwele, Tecla Mutave Kyalo, Joyce Akhalakwa Mukolwe, Jackson Ndoto Munyao, Millicent Auma Omondi, Wycliffe Nzoli Nzomo, Titus Mutua Kioko, et al. 2025. “Explainable AI for Breast Cancer Diagnosis: Comparative Analysis of ML Models Using Random Forest Feature Selection and SHAP Interpretability”. Asian Journal of Research in Computer Science 18 (10):30-46. https://doi.org/10.9734/ajrcos/2025/v18i10762.