Developing Statistical Machine Translation System for English and Nigerian Languages

Ignatius Ikechukwu Ayogu *

Department of Computer Science, Federal University of Technology, Akure, Ondo State, Nigeria

Adebayo Olusola Adetunmbi

Department of Computer Science, Federal University of Technology, Akure, Ondo State, Nigeria

Bolanle Adefowoke Ojokoh

Department of Computer Science, Federal University of Technology, Akure, Ondo State, Nigeria

*Author to whom correspondence should be addressed.


Abstract

The global demand for translation and translation tools currently surpasses the capacity of available solutions. Moreover, no single off-the-shelf solution fits all languages. Thus, the need and urgency to scale up research on the development of translation tools and devices continue to grow, especially for languages under pressure from globalisation. This paper discusses our experiments on translation systems between English and two Nigerian languages: Igbo and Yorùbá. The study was set up to build parallel corpora and to train and evaluate English-to-Igbo, English-to-Yorùbá and Igbo-to-Yorùbá phrase-based statistical machine translation (SMT) systems. The systems were trained on parallel corpora that were created for each language pair in the course of this research, using text from the religious domain. BLEU scores of 30.04, 29.01 and 18.72 were recorded for the English-to-Igbo, English-to-Yorùbá and Igbo-to-Yorùbá MT systems, respectively. An error analysis of the systems’ outputs, conducted using a linguistically motivated MT error analysis approach, showed that errors occurred mostly at the lexical, grammatical and semantic levels. While the study reveals the potential of our corpora, it also shows that the size of the corpora remains an issue that requires further attention. Thus, an important target for the immediate future is to increase the quantity and quality of the data.

 

Keywords: Machine translation, Igbo language, Yorùbá language, parallel corpora, SMT


How to Cite

Ikechukwu Ayogu, Ignatius, Adebayo Olusola Adetunmbi, and Bolanle Adefowoke Ojokoh. 2018. “Developing Statistical Machine Translation System for English and Nigerian Languages”. Asian Journal of Research in Computer Science 1 (4):1-8. https://doi.org/10.9734/ajrcos/2018/v1i424761.
