Abstract:
In daily decision-making activities, individuals or organizations regularly take other people’s
sentiments or opinions as one source of information. To capture those opinions more precisely,
it is crucial to consider the strength of the sentiment polarity. Nowadays the
proliferation of internet platforms such as websites, blogs, social networks, online portals, and
content-sharing services contributes an enormous amount of user-generated Afaan Oromo text.
Despite the rising usage of the Afaan Oromo language on the internet, there is no adequate method
for sorting and cataloging sentiment in the language. Therefore, multi-scale sentiment analysis,
which automatically extracts sentiments while taking the strength of each sentiment into account,
is desirable in various applications. This work can play a significant role in meeting that
need. In this study, a multi-scale sentiment analysis model for Afaan Oromo text is proposed
using bag-of-words feature representation with three supervised machine-learning algorithms:
Naïve Bayes (NB), Support Vector Machine (SVM), and Maximum Entropy (MaxEnt). The Afaan
Oromo multi-scale sentiment analysis process categorizes a sentence into one of five
predefined classes: strong positive (+2), strong negative (-2), positive (+1), negative (-1),
and neutral (0). The proposed system comprises several components: data collection,
preprocessing (tokenization, normalization, stop word removal), morphological analysis
(part-of-speech tagging, stemming), sentiment annotation, feature extraction/selection, training
of the machine learning algorithms, classification, and evaluation of the results using metrics
such as accuracy, precision, recall, and F-measure. For the experiments, 1000 Afaan Oromo
sentences expressing sentiment were collected from different sources. In addition, a list of 350
stop words, 254 suffixes, a gazetteer of 740 Afaan Oromo adjectives, and 125 intensifiers (adverbs)
were prepared with the assistance of Afaan Oromo language experts. The experimental results show
that the performance of the model is encouraging: the NB classifier achieved an accuracy of 74.6%
using 1000 sentences, and the SVM classifier achieved an accuracy of 83% using 1200 sentences.
However, further research on named entity recognition, word position and negation features,
explicit and comparative sentiment analysis, standard corpus preparation, co-reference
resolution, and feature- or aspect-level sentiment analysis is needed to develop a full-fledged
and more efficient multi-scale sentiment analysis system for Afaan Oromo