Corpus-based Approaches for Sentiment Analysis: A Review
Mahmood Umar *
Department of Computer Science, Sokoto State University, Sokoto, Nigeria.
Hauwa Ibrahim Binji
Department of Computer Science, Sokoto State University, Sokoto, Nigeria.
Anas Tukur Balarabe
Department of Computer Science, Sokoto State University, Sokoto, Nigeria.
*Author to whom correspondence should be addressed.
Abstract
This review surveys the state of the art in corpus-based approaches to sentiment analysis, detailing their methodologies, evaluation metrics, limitations, and future directions. It emphasizes the importance of sentiment analysis in fields such as marketing, customer feedback analysis, social media monitoring, financial analysis, and political science. The methodology for corpus-based sentiment analysis comprises four key steps: data collection, preprocessing, feature extraction, and sentiment classification. Lexicon-based approaches include the corpus-based, or bag-of-words (BOW), approach and the dictionary-based approach (also called the opinion lexicon). Evaluation of corpus-based sentiment analysis is addressed through performance metrics such as accuracy, precision, recall, and F1-score, and through comparative analysis with other approaches, including hybrid and rule-based systems. Limitations such as data sparsity and domain adaptation are acknowledged, alongside potential enhancements and research directions including ensemble learning, deep learning architectures, and multimodal data integration. The review concludes that corpus-based sentiment analysis is versatile and scalable, and that ongoing research aims to address its limitations and extend its applicability across diverse domains.
Keywords: Corpus-based approach, sentiment analysis, NLP, feature extraction
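As an illustration only, the four steps named in the abstract (data collection, preprocessing, feature extraction, sentiment classification) and the evaluation metrics it lists can be sketched in a few lines of Python. The toy corpus, the function names (tokenize, train, classify, evaluate), and the count-difference scoring rule are assumptions made for this sketch; they are not taken from any system covered by the review.

```python
import re
from collections import Counter

# Data collection: a tiny hand-made labeled corpus (hypothetical data).
TRAIN = [
    ("great product, works well", "pos"),
    ("excellent service and great price", "pos"),
    ("terrible quality, broke quickly", "neg"),
    ("poor service and terrible support", "neg"),
]

def tokenize(text):
    """Preprocessing: lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(corpus):
    """Feature extraction: per-class bag-of-words counts."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in corpus:
        counts[label].update(tokenize(text))
    return counts

def classify(text, counts):
    """Sentiment classification: sum per-token (pos count - neg count)."""
    score = sum(counts["pos"][t] - counts["neg"][t] for t in tokenize(text))
    return "pos" if score >= 0 else "neg"

def evaluate(gold, pred, positive="pos"):
    """Accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

if __name__ == "__main__":
    model = train(TRAIN)
    test = [("great service", "pos"), ("terrible product", "neg")]
    predictions = [classify(text, model) for text, _ in test]
    print(evaluate([label for _, label in test], predictions))
```

A realistic corpus-based system would replace the raw count difference with a statistical association measure or a trained classifier, and would be evaluated on held-out data far larger than this toy set.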