On the Solutions of the Big Data Timeliness Problem
Ramy Ebeid *
Department of Information System, Madina Academy, Cairo, Egypt.
Ahmed Salem
Arab Academy for Science, Technology and Maritime Transport, Cairo, Egypt.
M. B. Senousy
College of Computing and Information Technology, Sadat Academy for Management Sciences, Technology, Cairo, Egypt.
*Author to whom correspondence should be addressed.
Abstract
Big Data is now used almost everywhere, both online and offline, and its use is not confined to computing. It has introduced a new trend in decision making: analyzing such data with clustering algorithms makes it possible to predict outcomes from the knowledge extracted. Response time is a major challenge when clustering data at this scale. The K-means and big K-means algorithms address this problem. In this paper, the researchers first determine the best value of K using the elbow method, then apply the K-means and big K-means algorithms on shared memory in two modes, sequential processing and parallel processing, and compare them to determine which performs best at different data sizes. The analysis was performed in the RStudio environment.
Keywords: Big data, K-means, big K-means, sequential processing, parallel processing
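The elbow method named in the abstract can be illustrated with a minimal sketch. It is written in plain Python rather than the R environment used in the paper, and it is not the authors' implementation. The synthetic three-blob data set, the helper names (`kmeans_wcss`, `elbow_k`, `make_blobs`), and the rule of picking the elbow at the largest second difference of the within-cluster sum of squares (WCSS) are all assumptions made for illustration; in practice the elbow is often read from a plot.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))


def kmeans_wcss(points, k, iters=100, seed=0):
    """Run sequential Lloyd-style K-means and return the final WCSS."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: random initial centers
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        updated = [centroid(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow_k(wcss):
    """Hypothetical elbow rule: the k with the largest second difference
    of the WCSS curve. Expects a dict keyed by consecutive k values."""
    inner = sorted(wcss)[1:-1]
    return max(inner, key=lambda k: wcss[k - 1] - 2 * wcss[k] + wcss[k + 1])


def make_blobs(seed=42, per_blob=50):
    """Synthetic 2-D data: three Gaussian blobs (illustrative only)."""
    rng = random.Random(seed)
    blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
            for cx, cy in blob_centers
            for _ in range(per_blob)]


points = make_blobs()
wcss = {k: kmeans_wcss(points, k) for k in range(1, 7)}
best_k = elbow_k(wcss)

for k in sorted(wcss):
    print(f"k={k}  WCSS={wcss[k]:.1f}")
print("elbow estimate:", best_k)
```

The assignment step is the natural place to parallelize, since each point's nearest center can be computed independently. This sketch runs it sequentially, which corresponds to the first of the two processing modes compared in the paper.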