Automated Data Cleaning in Large Databases Using Machine Learning Methods

Hajar Maseeh Yasin *

IT Department, College of Informatics, Akre University for Applied Sciences, Iraq.

Aso Kareem Khorsheed

IT Department, College of Informatics, Akre University for Applied Sciences, Iraq.

*Author to whom correspondence should be addressed.


Abstract

The growing volume and complexity of data make effective data cleaning essential to the accuracy and reliability of datasets used in machine learning and big data analytics. Traditional manual cleaning methods are often inefficient and error-prone, which compromises data quality. This paper explores automated techniques that use machine learning, particularly the integration of supervised and unsupervised learning algorithms, to make data preparation more efficient. The study shows that these methods can significantly improve data quality, reduce preparation time, and support better decision-making. Ultimately, it emphasizes the importance of robust data cleansing frameworks for harnessing the potential of big data and improving model performance across applications.

Keywords: Data cleaning, machine learning, big data, data quality, automation, supervised learning, unsupervised learning, efficiency, decision-making, data integration


How to Cite

Yasin, Hajar Maseeh, and Aso Kareem Khorsheed. 2025. “Automated Data Cleaning in Large Databases Using Machine Learning Methods”. Asian Journal of Research in Computer Science 18 (5):364-86. https://doi.org/10.9734/ajrcos/2025/v18i5661.
