Corpus-based Approaches for Sentiment Analysis: A Review

Mahmood Umar *

Department of Computer Science, Sokoto State University, Sokoto, Nigeria.

Hauwa Ibrahim Binji

Department of Computer Science, Sokoto State University, Sokoto, Nigeria.

Anas Tukur Balarabe

Department of Computer Science, Sokoto State University, Sokoto, Nigeria.

*Author to whom correspondence should be addressed.


This review examines the state of the art in corpus-based approaches to sentiment analysis, detailing their methodologies, evaluation metrics, limitations, and future directions. It emphasizes the importance of sentiment analysis in fields such as marketing, customer feedback analysis, social media monitoring, financial analysis, and political science. The methodology for corpus-based sentiment analysis comprises four key steps: data collection, preprocessing, feature extraction, and sentiment classification. Lexicon-based approaches include the corpus-based, or bag-of-words (BOW), approach and the dictionary-based approach (also called the opinion lexicon). Evaluation of the corpus-based approach is addressed through performance metrics such as accuracy, precision, recall, and F1-score, and through comparative analysis with other approaches, including hybrid and rule-based systems. Limitations of corpus-based sentiment analysis, such as data sparsity and domain adaptation, are acknowledged, alongside potential enhancements and research directions including ensemble learning, deep learning architectures, and multimodal data integration. The conclusion emphasizes the versatility and scalability of corpus-based sentiment analysis and notes that ongoing research aims to address its limitations and extend its applicability to diverse domains.

Keywords: Corpus-based approach, sentiment analysis, NLP, feature extraction
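
The four steps named in the abstract (data collection, preprocessing, feature extraction, sentiment classification) and the listed evaluation metrics can be illustrated with a minimal sketch. It is not the method of any work covered by the review: the toy corpus, the function names (`preprocess`, `train_naive_bayes`, `classify`, `evaluate`), and the choice of a multinomial Naive Bayes classifier with add-one smoothing are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Step 1, data collection: a toy labelled corpus (hypothetical data).
TRAIN = [
    ("I love this phone, great battery", "pos"),
    ("excellent service and great food", "pos"),
    ("terrible battery, I hate it", "neg"),
    ("awful service and bad food", "neg"),
]

def preprocess(text):
    """Step 2, preprocessing: lowercase and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train_naive_bayes(examples):
    """Step 3, feature extraction: per-class bag-of-words counts."""
    word_counts, doc_counts, vocab = {}, Counter(), set()
    for text, label in examples:
        tokens = preprocess(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Step 4, classification: multinomial Naive Bayes, add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        score = math.log(doc_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for tok in preprocess(text):
            if tok in vocab:  # words unseen in training are ignored
                likelihood = (word_counts[label][tok] + 1) / (total_words + len(vocab))
                score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def evaluate(gold, predicted, positive="pos"):
    """Accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

model = train_naive_bayes(TRAIN)
test_texts = ["great food", "bad battery"]
predictions = [classify(t, model) for t in test_texts]
scores = evaluate(["pos", "neg"], predictions)
```

With only four training sentences the counts are far too sparse to generalize, which is a small-scale instance of the data sparsity limitation the abstract mentions.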

How to Cite

Umar, Mahmood, Hauwa Ibrahim Binji, and Anas Tukur Balarabe. 2024. “Corpus-Based Approaches for Sentiment Analysis: A Review”. Asian Journal of Research in Computer Science 17 (7):95-102.

