A Survey of Data Mining Activities in Distributed Systems
Waleed A. Mohammad *
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Hajar Maseeh Yasin
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Azar Abid Salih
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Adel AL-Zebari
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Naaman Omar
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Karwan Jameel Merceedi
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Abdulraheem Jamil Ahmed
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Nareen O. M. Salim
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Sheren Sadiq Hasan
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Shakir Fattah Kak
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
Ibrahim Mahmood Ibrahim
Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq.
*Author to whom correspondence should be addressed.
Abstract
The rapid growth of resource sharing has driven the development of distributed systems, which can be used to carry out computations. Data mining, which has a wide range of real-world applications, provides important techniques for extracting meaningful and usable information from massive amounts of data. Traditional data mining methods, however, assume that the data is gathered centrally, held in memory, and static. Managing massive amounts of data and processing them with limited resources is difficult. Large volumes of data, for instance, are rapidly generated and stored in many locations, and it becomes increasingly costly to centralize them at a single site. Furthermore, traditional data mining methods typically face several limitations, such as memory restrictions, limited processing power, and insufficient disk space. To overcome these issues, distributed data mining has emerged as a beneficial option in several applications. Drawing on the work of several authors, this paper surveys state-of-the-art distributed data mining methods, including distributed frequent item-set mining, distributed frequent sequence mining, distributed clustering, and privacy-preserving distributed data mining, along with the technical difficulties of distributed systems. Furthermore, each work is evaluated and compared with the others.
Keywords: Distributed Systems, Data Mining, Parallel Platform, Distributed Clustering