The Security Challenges of Big Data Analytics: A Systematic Literature Review

Tewodrose Tilahun *

Department of Information Technology, Institute of Technology, Faculty of Informatics, Hawassa University, Hawassa, Ethiopia.

Solomon Tsegaye

Department of Information Technology, Institute of Technology, Faculty of Informatics, Hawassa University, Hawassa, Ethiopia.

*Author to whom correspondence should be addressed.


Abstract

The volume of data generated from heterogeneous sources such as social networking sites, healthcare applications, and sensor networks is increasing rapidly. Big data is described as extremely large datasets that have grown beyond the capability of traditional database processing tools to manage and analyze them. Big data analytics is the use of advanced analytical techniques on very large, heterogeneous datasets that include structured, semi-structured, and unstructured data from different sources. A large quantity of data is not advantageous in itself unless it is analyzed to produce valuable information. This deluge of data creates operational risks, which arise from the storage devices and from the security of the tools and technologies used to analyze the data. In this paper, we perform a systematic literature review to give a comprehensive account of the security challenges and risks related to big data analytics. Security mechanisms, both cryptographic and non-cryptographic, are used to secure big data during analytics. The security of big data at rest and in transit has received considerable attention, while few studies have addressed securing data at the processing stage. Even though a number of techniques have been proposed for big data security, they still suffer from performance issues. This article explores the security issues involved in preserving confidentiality, integrity, and availability (the CIA triad), non-repudiation, and access control in the context of big data analytics. Finally, we identify open research directions for the security of big data analytics. This paper can also serve as a reference for the development of modern security-preserving techniques that address the security and privacy challenges of big data analytics.

Keywords: Big data analytics, homomorphic encryption, verifiable encryption, differential privacy


How to Cite

Tilahun, T., & Tsegaye, S. (2022). The Security Challenges of Big Data Analytics: A Systematic Literature Review. Asian Journal of Research in Computer Science, 14(4), 184–197. https://doi.org/10.9734/ajrcos/2022/v14i4303
