Semantic Search for Data on a Given Topic in Social Networks: A Comparative Study of Keyword-based and BERT-based Methods

Yerassyl Ussen *

Astana IT University, Kazakhstan.

Zuleykha Anvarovna

Astana IT University, Kazakhstan.

*Author to whom correspondence should be addressed.


Abstract

Semantic search has emerged as a powerful alternative to traditional keyword-based retrieval, particularly in the context of unstructured social media data. This study presents a comparative analysis of a semantic search system based on Sentence-BERT (SBERT) and a conventional keyword-based pipeline implemented with Elasticsearch, using a large Reddit dataset as a case study. The primary contribution lies in integrating state-of-the-art semantic modeling with scalable search infrastructure and empirically evaluating its effectiveness on real-world social media content. The experimental workflow includes six stages: dataset selection, preprocessing, embedding generation, indexing, query processing, and performance evaluation. Results show that the SBERT-based semantic search system consistently outperforms the keyword-based approach across all metrics, particularly in capturing user intent, handling informal language, and retrieving semantically relevant content despite lexical variations. Nonetheless, the semantic approach incurs higher computational costs and exhibits occasional overgeneralization.
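The contrast the abstract draws between keyword matching and embedding-based retrieval can be illustrated with a minimal sketch. It is not the authors' implementation. The posts, the three-dimensional vectors, and the function names are hypothetical stand-ins: a real pipeline would obtain embeddings from an SBERT model and delegate indexing and ranking to Elasticsearch.

```python
import math
import re

# Toy corpus standing in for Reddit posts (illustrative, not from the study).
POSTS = [
    "my laptop battery dies so fast, any fix?",
    "how do i make my notebook last longer on a charge",
    "best pizza places downtown",
]

# Hand-made 3-d vectors standing in for SBERT sentence embeddings (assumption).
# A real system would compute these with an embedding model.
TOY_EMBEDDINGS = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 0.9],
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(query, posts):
    """Rank posts by shared-token count; posts with no overlap are dropped."""
    q = tokenize(query)
    scored = [(len(q & tokenize(p)), i) for i, p in enumerate(posts)]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [i for score, i in ranked if score > 0]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vec, embeddings, top_k=2):
    """Return indices of the top_k embeddings most similar to query_vec."""
    scored = [(cosine(query_vec, e), i) for i, e in enumerate(embeddings)]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [i for _, i in ranked[:top_k]]


# The query shares words with the first post only, so keyword search
# misses the second post even though it describes the same problem.
keyword_hits = keyword_search("laptop battery drains quickly", POSTS)

# Hypothetical embedding of the same query, close to both battery posts.
semantic_hits = semantic_search([0.85, 0.15, 0.05], TOY_EMBEDDINGS)
```

In this toy setup the keyword ranker returns only the post that shares literal tokens with the query, while the vector ranker also surfaces the paraphrased post. That is the kind of lexical-variation case the abstract refers to.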

Keywords: Sentence-BERT, semantic modeling, social media, conversational language


How to Cite

Ussen, Yerassyl, and Zuleykha Anvarovna. 2025. “Semantic Search for Data on a Given Topic in Social Networks: A Comparative Study of Keyword-Based and BERT-Based Methods”. Asian Journal of Research in Computer Science 18 (6):315-24. https://doi.org/10.9734/ajrcos/2025/v18i6701.
