WordNet-Based Semantically Improved Frequency of Terms for Arabic Information Reclamation
Hiba ALMarwi *
Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.
Marwa Al-Hadi
Department of Computer Science, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.
Mohammed Mohammed Zayed
Department of Information System, Faculty of Computer and IT, Sana’a University, Sana’a, Yemen.
*Author to whom correspondence should be addressed.
Abstract
Information retrieval systems often fail to retrieve relevant documents because of term mismatch and the brevity of user queries. Query expansion (QE) has been widely used to address these problems by augmenting the original query with additional terms. In this study, a novel query expansion approach is proposed that integrates pseudo-relevance feedback with WordNet, a lexical semantic resource used in NLP and information retrieval to support semantic understanding and query expansion beyond keyword matching. To mitigate the term mismatch issue inherent in relevance feedback, WordNet is used to improve semantic similarity measurements between terms. However, the expanded terms may introduce noise into the retrieval process. To address this, the Crow Search Algorithm (CSA) is applied as a filtering mechanism that selects semantically relevant terms that better reflect user intent. Performance is evaluated using the Mean Average Precision (MAP) metric and compared against the MAP values of a baseline standard search system. Experimental evaluation on a real-world dataset demonstrates the effectiveness of the proposed approach.
Keywords: Query expansion, term frequency, Crow Search Algorithm, information retrieval.
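As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of how Mean Average Precision is commonly computed. It is not the authors' evaluation code. The function names, the document identifiers, and the assumption that each ranked list contains unique document IDs are hypothetical and chosen only for this example.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision for one query.

    Sums precision@k at every rank k where a relevant document appears,
    then divides by the total number of relevant documents.
    Assumes `ranked` holds unique document IDs (hypothetical input format).
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Mean of the per-query average precision over all queries."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two illustrative queries with made-up document IDs.
example_runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
    (["d5", "d6"], {"d6"}),
]
map_score = mean_average_precision(example_runs)
```

Under this definition, comparing a baseline system with an expanded-query system amounts to calling `mean_average_precision` once per system on the same set of queries and relevance judgments.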