Video Game Sales Success Using Random Forest Based Machine Learning Techniques

Marwa Al-Hadi *

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Hiba ALMarwi

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Abdulrahman Alsabri

Department of Information Systems, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Idrees Hajar

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Ahmed Al-Kataby

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Hussam Al-Maswari

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Ali Almansor

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Ashraf Alshujaa

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Ibrahim Al-Zubaidi

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Osama Dammag

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Ahmed Alghawri

Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.

Abdullah Amer

Department of Computer Science, Faculty of Computer Science and Information Technology, Aden University, Aden, Yemen.

*Author to whom correspondence should be addressed.


Abstract

Background: Accurately predicting video game sales remains a challenging task due to the complex interaction of multiple factors, including game genre, platform, publisher reputation, release timing, and market competition. Traditional forecasting approaches often rely on historical averages or manual analysis, which may fail to capture nonlinear relationships within large and heterogeneous datasets.

Aims: This study develops a machine learning-based model for predicting video game sales success and evaluates its effectiveness in classifying games into high-sales and low-sales categories.

Study Design: An experimental study based on supervised machine learning classification techniques.

Methodology: A publicly available dataset containing video game attributes such as genre, platform, publisher, and global sales was utilized. Data preprocessing included data cleaning, removal of irrelevant features, categorical encoding, and normalization. Class imbalance was addressed using oversampling techniques. Feature selection was performed using the chi-square test to identify the most relevant predictors. The problem was formulated as a binary classification task by defining a target variable representing high and low sales categories. The proposed model was evaluated and compared with baseline classifiers under the same experimental conditions.
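The preprocessing steps named in the Methodology (binary target definition, oversampling, and chi-square feature scoring) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sales threshold of 1.0, the toy rows, and the use of simple random oversampling and a contingency-table chi-square statistic are all assumptions made for illustration, since the abstract does not specify these details.

```python
import random
from collections import Counter


def label_high_sales(global_sales, threshold):
    """Binary target: 1 = high sales, 0 = low sales.

    The threshold is a hypothetical cut-off; the abstract does not state one.
    """
    return [1 if s >= threshold else 0 for s in global_sales]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes
    have as many rows as the largest class (assumed oversampling variant)."""
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for y, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(y)
    return out_rows, out_labels


def chi_square(feature, labels):
    """Pearson chi-square statistic for a categorical feature vs. the label.

    Larger values suggest a stronger association with the target.
    """
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f in feature_totals:
        for y in label_totals:
            expected = feature_totals[f] * label_totals[y] / n
            stat += (observed[(f, y)] - expected) ** 2 / expected
    return stat


# Toy data (hypothetical, not taken from the study's dataset).
rows = [("Sports", "Wii"), ("Action", "PS2"), ("Sports", "PS2"),
        ("Puzzle", "DS"), ("Action", "Wii"), ("Puzzle", "PC")]
sales = [8.0, 0.4, 0.3, 0.2, 2.5, 0.1]

labels = label_high_sales(sales, threshold=1.0)
balanced_rows, balanced_labels = random_oversample(rows, labels, seed=0)

# Score each categorical column against the balanced target.
genre_score = chi_square([r[0] for r in balanced_rows], balanced_labels)
platform_score = chi_square([r[1] for r in balanced_rows], balanced_labels)
print(Counter(balanced_labels), genre_score, platform_score)
```

In a real experiment, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the test set.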

Results: The proposed machine learning model achieved an accuracy of 78%, with balanced precision and recall values. Comparative evaluation showed that the proposed approach outperformed baseline models, including Support Vector Machine and Logistic Regression, across all performance metrics. The model demonstrated strong capability in identifying high-sales games with improved classification reliability.
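For reference, the accuracy, precision, and recall reported above follow the standard confusion-matrix definitions. The sketch below shows how they are computed for a binary high/low sales task; the function name and the toy label vectors are hypothetical and do not reproduce the study's data or its 78% figure.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels (hypothetical): 1 = high sales, 0 = low sales.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
# {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75}
```

Applying the same function to each classifier's predictions on a shared test split is one way to compare models under identical conditions.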

Conclusion: Machine learning techniques provide an effective approach for predicting video game sales success and can support data-driven decision-making in the gaming industry. The proposed framework improves classification performance through structured preprocessing and feature selection; however, further studies incorporating additional external factors may enhance prediction accuracy.

Keywords: Machine learning, random forest, video game sales prediction, classification, data preprocessing, imbalanced data


How to Cite

Al-Hadi, Marwa, Hiba ALMarwi, Abdulrahman Alsabri, Idrees Hajar, Ahmed Al-Kataby, Hussam Al-Maswari, Ali Almansor, et al. 2026. “Video Game Sales Success Using Random Forest Based Machine Learning Techniques”. Asian Journal of Research in Computer Science 19 (4):133-45. https://doi.org/10.9734/ajrcos/2026/v19i4854.
