Abstract:
A grammar checker is a basic NLP application that checks whether sentences are grammatically
correct. To address grammatical errors in Afaan Oromo text, this study proposes an Afaan Oromo
grammar checker based on a hybrid approach in which the statistical and rule-based approaches
each act as a separate module. The statistical grammar checker module checks Afaan Oromo
sentences for word order errors, while the rule-based grammar checker module checks for
morphological agreement errors. The rule-based module is applied after the statistical module
because, once the word order of a sentence is correct, the grammar rules of the language can be
used to resolve morphological agreement errors. In the statistical approach, the bigram technique
checks the grammatical correctness of a sentence by calculating the probability of a bigram
sequence of tags in both the training and test datasets. If a sentence is found to be free of word
order errors, the rule-based module is run to check the sentence for morphological agreement
errors and to make suggestions if any are found. For the experiment, the POS tagset from Emiru
(2016) was used. The POS tagger corpus was manually prepared from 2000 sentences collected from
OBN. The HMM Viterbi tagger was trained on 85% of this corpus and tested on the remaining 15%. A tag
sequence corpus of 570 sentences was manually prepared and used by the statistical bigram
model. To handle agreement errors, 150 rules were manually created and used by the rule-based
grammar checker. The system was implemented in Python 3 using Jupyter Notebook. The prepared and
collected data were first preprocessed so that they could be used by each module of the grammar
checker. For the evaluation, the researcher manually prepared 255 test sentences and measured
the average performance of the hybrid Afaan Oromo grammar checker. On these test sentences, the
system achieved 82% precision, 76% recall, and an F-measure of 79%. These results suggest that a
hybrid approach is effective for Afaan Oromo grammar checking. Future work will use a
higher-quality Afaan Oromo corpus and a deep learning approach to improve performance.
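
The bigram check on tag sequences described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the toy tag sequences, the tag labels (which are not taken from the Emiru 2016 tagset), the sentence boundary markers, and the zero-probability decision threshold are all assumptions made for illustration, and no smoothing is applied.

```python
from collections import Counter

# Hypothetical sentence boundary markers (an assumption, not from the study).
START, END = "<s>", "</s>"


def train_bigram_model(tag_sequences):
    """Count tag contexts and tag bigrams over a corpus of tag sequences."""
    contexts = Counter()
    bigrams = Counter()
    for tags in tag_sequences:
        padded = [START] + list(tags) + [END]
        contexts.update(padded[:-1])             # every tag that has a successor
        bigrams.update(zip(padded, padded[1:]))  # adjacent tag pairs
    return contexts, bigrams


def sequence_probability(tags, contexts, bigrams):
    """Unsmoothed maximum-likelihood bigram probability of a tag sequence."""
    padded = [START] + list(tags) + [END]
    prob = 1.0
    for prev, cur in zip(padded, padded[1:]):
        if contexts[prev] == 0:
            return 0.0  # context tag never seen in training
        prob *= bigrams[(prev, cur)] / contexts[prev]
    return prob


def has_word_order_error(tags, contexts, bigrams, threshold=0.0):
    """Flag a sequence whose probability does not exceed the assumed threshold."""
    return sequence_probability(tags, contexts, bigrams) <= threshold


# Toy training data with illustrative tags only (verb-final sequences).
training = [
    ["NN", "NN", "VB"],
    ["NN", "JJ", "VB"],
    ["PR", "NN", "VB"],
]
contexts, bigrams = train_bigram_model(training)

print(has_word_order_error(["NN", "NN", "VB"], contexts, bigrams))
print(has_word_order_error(["VB", "NN", "NN"], contexts, bigrams))
```

In this sketch a sentence whose tag sequence contains an unseen bigram receives probability zero and is flagged. A sentence that passes would then be handed to the rule-based module for agreement checking.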