Abstract:
Machine translation (MT) is the automatic translation of text from one natural language to another
by a computer, without human involvement. The difficulty of accurately translating complex and
context-dependent sentences between Afaan Oromoo and Amharic poses a significant
barrier to communication and information exchange between the two communities. The purpose
of this study is to develop a bidirectional Amharic-Afaan Oromoo machine translation system
using a hybrid approach.
To conduct the study, the corpus for both languages was collected from an online source, GitHub.
Corpus preparation included dividing the corpus into a training set, a tuning set, and a test
set. A total of 11,457 sentences were collected, of which 1,146 were used for testing and 1,146
for tuning. The experiments were conducted using the machine translation tool Moses for Mere
Mortals, GIZA++ for alignment, the IRSTLM language modelling toolkit, OpenNMT-py for developing
the neural (NMT) and hybrid (HMT) machine translation models, Google Colab for training the
models, and Bilingual Evaluation Understudy (BLEU) for evaluating the translation quality of our models.
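The corpus split described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the abstract does not say whether sentences were shuffled or taken in order, so the random shuffle, the seed, the function name `split_corpus`, and the placeholder sentence pairs are all assumptions. The training-set size of 9,165 is derived here by subtraction, assuming the three sets are disjoint.

```python
import random


def split_corpus(pairs, test_size, tune_size, seed=0):
    """Shuffle sentence pairs and split them into train, tune, and test sets.

    Assumption: the three sets are disjoint and the split is random.
    """
    if test_size + tune_size > len(pairs):
        raise ValueError("test and tune sets are larger than the corpus")
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    test = shuffled[:test_size]
    tune = shuffled[test_size:test_size + tune_size]
    train = shuffled[test_size + tune_size:]
    return train, tune, test


# Placeholder data standing in for the 11,457 collected sentence pairs.
pairs = [(f"source sentence {i}", f"target sentence {i}") for i in range(11457)]
train, tune, test = split_corpus(pairs, test_size=1146, tune_size=1146)
print(len(train), len(tune), len(test))
```

With the sizes reported above, the tuning and test sets each hold roughly 10% of the corpus and the remainder is available for training.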
We observed that the hybrid approach has a reasonable training rate and better translation quality
than the statistical machine translation (SMT) and NMT models built in this work: it achieved
BLEU scores of 17.2275 for Afaan Oromoo to Amharic and 21.3589 for Amharic to Afaan Oromoo.
By comparison, SMT achieved BLEU scores of 10.10 and 19.82, and NMT achieved BLEU scores of
15.3315 and 18.5179, for Afaan Oromoo to Amharic and Amharic to Afaan Oromoo respectively.
The hybrid approach improved on the SMT results by 7.1275 BLEU points from Afaan Oromoo to
Amharic and by 1.5389 BLEU points from Amharic to Afaan Oromoo, a 41.37% and 7.21%
improvement respectively. It improved on the NMT results by 1.896 BLEU points from Afaan
Oromoo to Amharic and by 2.841 BLEU points from Amharic to Afaan Oromoo, an 11.01% and
13.30% improvement respectively. Further research with a larger, cleaner corpus may improve
on the results reported in this work.
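The improvement figures above can be checked with a few lines of arithmetic. The sketch below uses only the BLEU scores reported in this abstract; the function and variable names are illustrative. It rests on one assumption: the reported percentages appear to express each gain as a share of the hybrid score, which reproduces them to within rounding. The gain relative to the baseline score, a common alternative convention, is computed alongside for comparison and is not a figure reported in this work.

```python
# BLEU scores as reported in the abstract, per translation direction.
BLEU = {
    "Afaan Oromoo -> Amharic": {"SMT": 10.10, "NMT": 15.3315, "Hybrid": 17.2275},
    "Amharic -> Afaan Oromoo": {"SMT": 19.82, "NMT": 18.5179, "Hybrid": 21.3589},
}


def improvement(hybrid, baseline):
    """Return (absolute gain, gain as % of hybrid score, gain as % of baseline)."""
    gain = hybrid - baseline
    return gain, 100.0 * gain / hybrid, 100.0 * gain / baseline


def summarize(scores=BLEU):
    """Build one row per (direction, baseline) pair with rounded figures."""
    rows = []
    for direction, s in scores.items():
        for base in ("SMT", "NMT"):
            gain, pct_of_hybrid, pct_of_base = improvement(s["Hybrid"], s[base])
            rows.append((direction, base, round(gain, 4),
                         round(pct_of_hybrid, 2), round(pct_of_base, 2)))
    return rows


for direction, base, gain, pct_of_hybrid, pct_of_base in summarize():
    print(f"{direction}: hybrid vs {base}: +{gain} BLEU, "
          f"{pct_of_hybrid}% of hybrid score, {pct_of_base}% over {base}")
```

Which denominator is used changes the percentage noticeably when the baseline is low, so it helps to state the convention next to the figure.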