Abstract:
Credit risk is the risk type that financial managers emphasize most in the loan disbursement process
because it is one of the major reasons that financial institutions fail. The possible application
of data mining to this problem needs further investigation. To this end, the present study
focuses on the application of data mining technology to support credit risk assessment in
Oromia and Dire Saving and Credit Institution. The aim of this research was to
develop a classification model that helps in the loan disbursement decision-making process of the
institution. The WEKA tool was selected for this research because of
its ability to support various methods for the different stages of the process. After
preparing the data, the last step was to build and evaluate the model. The major activities
undertaken in this step were selecting the actual modeling technique to be used, generating the test
design, and building the model. The PART, J48, and Naïve Bayes classification techniques were selected
for model building. These algorithms were selected because they are easy to
understand and their results are easy to interpret. To increase the neutrality of the
assessment, k-fold cross-validation with 10 folds was used as the test design for the algorithms.
After the algorithms were selected, the next step was running the models while changing the default
parameter values of the algorithms. The experiment was conducted in two phases: the first phase
was conducted before balancing the data and the second phase after
balancing the data. WEKA's Resample filter (weka.filters.supervised.instance.Resample)
was used to balance the data set by oversampling the minority class (BAD loan instances)
and undersampling the majority class (GOOD loan instances), and attribute selection was
applied in each phase. The attributes were selected in three ways: the first was
the attributes selected automatically by the system using the information gain evaluator, the
second was the best attributes identified in previous work, and the third was the best attributes
selected based on the opinion of the institution's domain experts.
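
As a rough illustration of the information gain criterion mentioned above, the following Python sketch ranks nominal attributes by how much they reduce the entropy of the loan class. It is a minimal sketch and not the WEKA implementation used in the study; the attribute names (has_collateral, gender) and the toy records are hypothetical and are not data from the institution.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Class entropy minus the expected class entropy after splitting on the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy loan records (assumed for illustration only).
loan_status = ["GOOD", "GOOD", "GOOD", "BAD", "BAD", "GOOD", "BAD", "GOOD"]
attributes = {
    "has_collateral": ["yes", "yes", "yes", "no", "no", "yes", "no", "yes"],
    "gender": ["f", "m", "f", "m", "f", "m", "f", "m"],
}

# Rank attributes from most to least informative about the loan class.
ranking = sorted(
    attributes,
    key=lambda name: information_gain(attributes[name], loan_status),
    reverse=True,
)
print(ranking)
```

In this toy data the hypothetical has_collateral attribute separates GOOD from BAD loans completely, so its gain equals the full class entropy and it ranks first; an attribute that barely changes the class distribution, such as gender here, ranks last.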