Abstract:
Information retrieval makes it possible to search a large corpus for documents relevant to
users' information needs. Query expansion is a widely used technique for improving information
retrieval effectiveness. Afaan Oromo is a Cushitic language spoken today by about 40 million
people in Ethiopia. One of the major problems of Afaan Oromo text retrieval is its limited effectiveness
in identifying documents that are relevant to a user's query and satisfy the user's information need. The main
objective of this study is to integrate query expansion in order to enhance the effectiveness of an Afaan
Oromo text retrieval system. The designed query expansion for the Afaan Oromo information
retrieval system relies on a WordNet-style lexical resource constructed as a reference for
identifying the senses and meanings of the terms in the user's query through word sense disambiguation
based on a semantic similarity measure. Following the idea of the original Lesk algorithm, word sense
disambiguation is performed with a gloss-to-gloss similarity measure that compares the information
associated with each word's synonyms and gloss definitions with reference to the Afaan Oromo WordNet. The
word senses identified from WordNet during word sense disambiguation are
then used for query reformulation. Finally, the query expansion module is integrated with the Afaan
Oromo IR system to improve the system's retrieval performance once query expansion is
applied. The experimental results show that integrating query expansion achieves an F-measure of 56%,
an improvement of 5% over the original query. The main challenges in
this study were the absence of a standard, well-crafted WordNet, an effective stemming algorithm, and
a corpus for performance evaluation. The researcher's main recommendation is therefore that
future researchers work in this direction.
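
The disambiguation and reformulation steps summarized above can be illustrated with a minimal sketch of a Lesk-style gloss overlap. It is not the study's implementation. The lexicon `TOY_WORDNET`, its English placeholder entries, the stopword list, and the function names `disambiguate` and `expand_query` are all hypothetical and stand in for the Afaan Oromo WordNet and the system's own modules.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical toy lexicon with English placeholder entries; the study
# itself uses an Afaan Oromo WordNet, which is not reproduced here.
TOY_WORDNET: Dict[str, List[dict]] = {
    "bank": [
        {"id": "bank.finance", "synonyms": ["depository"],
         "gloss": "an institution that accepts money deposits and gives loans"},
        {"id": "bank.river", "synonyms": ["shore", "riverside"],
         "gloss": "sloping land beside a river or stream"},
    ],
    "money": [
        {"id": "money.currency", "synonyms": ["cash", "currency"],
         "gloss": "coins and notes accepted as payment for loans and deposits"},
    ],
}

# Assumed stopword list, chosen only for this illustration.
STOPWORDS = {"a", "an", "and", "as", "for", "of", "or", "that", "the", "to"}


def tokens(text: str) -> Set[str]:
    """Lowercase word tokens of a text, minus stopwords."""
    return {t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS}


def signature(sense: dict) -> Set[str]:
    """Words describing a sense: its gloss tokens plus its synonyms."""
    return tokens(sense["gloss"]) | set(sense["synonyms"])


def disambiguate(word: str, query_words: List[str]) -> Optional[dict]:
    """Pick the sense of `word` whose signature overlaps most with the
    other query words and the signatures of their senses."""
    context: Set[str] = set()
    for other in query_words:
        if other == word:
            continue
        context.add(other)
        for sense in TOY_WORDNET.get(other, []):
            context |= signature(sense)

    best, best_score = None, 0
    for sense in TOY_WORDNET.get(word, []):
        score = len(signature(sense) & context)
        if score > best_score:  # senses with zero overlap are never chosen
            best, best_score = sense, score
    return best


def expand_query(query: str) -> List[str]:
    """Append the synonyms of each disambiguated sense to the query terms."""
    words = re.findall(r"\w+", query.lower())
    expanded = list(words)
    for word in words:
        sense = disambiguate(word, words)
        if sense is None:
            continue
        for synonym in sense["synonyms"]:
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


print(expand_query("bank money loans"))
print(expand_query("river bank stream"))
```

In this sketch a word is expanded only when at least one of its senses overlaps with the query context, which is one simple way to limit drift from wrongly chosen senses. The original system may handle that case differently.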