SKELETON-BASED ETHIOPIAN SIGN LANGUAGE RECOGNITION USING DEEP LEARNING

dc.contributor.author Deme Kuma
dc.contributor.author Million Meshesha (Ph.D.)
dc.contributor.author Kidane W/Maryam (M.Sc.)
dc.date.accessioned 2023-11-20T07:27:02Z
dc.date.available 2023-11-20T07:27:02Z
dc.date.issued 2023-11
dc.identifier.uri http://ir.haramaya.edu.et//hru/handle/123456789/6907
dc.description 118 pages en_US
dc.description.abstract Recent reports show that there are over 1.5 billion people around the globe with hearing impairment; in Ethiopia, their number is estimated at over 1.2 million. These people use Sign Language as a means of communication through manual and non-manual signs. However, Sign Language is understood only by the deaf community and some of their families, which creates a communication gap between them and the rest of the world. Although interpreters try to fill the gap, their number is insufficient for the communication demand. Hence, Automatic Sign Language Recognition (ASLR) is being studied for various Sign Languages around the world to bridge this gap. ASLR methods involve techniques ranging from traditional machine learning to modern deep learning. Regarding Ethiopian Sign Language (EthSL), a few attempts have been made to automate its recognition; however, they were found to be environment- and signer-dependent. These gaps hinder the journey toward commercializing fully automated Sign Language Recognition products. Consequently, this study proposes an environment- and signer-invariant Sign Language Recognition model. The model first extracts skeletal key-points from the signer using MediaPipe, Google's cross-platform pipeline framework for detecting and tracking human pose, face landmarks, and hands. After preprocessing the skeletal key-point information, feature extraction and learning are performed using deep learning architectures: a Convolutional Neural Network followed by Long Short-Term Memory (CNN-LSTM), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), and Gated Recurrent Units (GRU). In this study, the models were trained to classify twenty (20) isolated dynamic Ethiopian Sign Language signs. A total of 5600 video samples were collected from volunteer students at Haramaya University and used to train and test the deep learning-based models. First, all the models were trained and tested in signer-dependent mode, where GRU outperformed the other deep learning algorithms with 94% recognition accuracy. The best-performing GRU-based model was further tested in signer-independent mode and attained 73% recognition accuracy. The outcome of this study shows that Ethiopian Sign Language can be recognized in real time within dynamic environments, and it implies that signer-independence is achievable. This study advanced the signer-independence of ASLR models to some degree; however, further studies are required to recognize continuous signs in a fully open environment. Therefore, the key-point detection and tracking technique implemented in this study should be investigated further for recognizing continuous EthSL. en_US
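
For illustration only: a minimal sketch of per-frame skeletal key-point extraction with the legacy MediaPipe Holistic Python API, as described in the abstract. The helper names, the 30-frame sampling length, and the zero-filling of undetected landmarks are assumptions of this sketch, not details taken from the thesis.

import cv2
import mediapipe as mp
import numpy as np

mp_holistic = mp.solutions.holistic

def extract_keypoints(results):
    # Flatten pose (33 x 4, with visibility), face (468 x 3), and both hands
    # (21 x 3 each) into one 1662-value vector; zero-fill undetected parts.
    pose = (np.array([[p.x, p.y, p.z, p.visibility] for p in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    face = (np.array([[p.x, p.y, p.z] for p in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(468 * 3))
    lh = (np.array([[p.x, p.y, p.z] for p in results.left_hand_landmarks.landmark]).flatten()
          if results.left_hand_landmarks else np.zeros(21 * 3))
    rh = (np.array([[p.x, p.y, p.z] for p in results.right_hand_landmarks.landmark]).flatten()
          if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, face, lh, rh])

def keypoints_from_video(path, num_frames=30):
    # Run MediaPipe Holistic over the first num_frames frames of a sign video
    # and stack the per-frame key-point vectors into a (num_frames, 1662) array.
    cap = cv2.VideoCapture(path)
    frames = []
    with mp_holistic.Holistic(min_detection_confidence=0.5,
                              min_tracking_confidence=0.5) as holistic:
        while len(frames) < num_frames:
            ok, frame = cap.read()
            if not ok:
                break
            results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            frames.append(extract_keypoints(results))
    cap.release()
    return np.array(frames)

Because the classifier sees key-points rather than raw pixels, the background and the signer's appearance are largely factored out, which is what supports the environment- and signer-invariance the thesis targets.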
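The best-performing classifier in the study was GRU-based; below is a hedged sketch of such a sequence classifier in tf.keras. Only the use of GRUs and the 20-class output come from the abstract; the layer widths, the 30-frame input length, and the training configuration are illustrative assumptions.

import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES = 30                               # assumed frames sampled per sign video
NUM_FEATURES = 33 * 4 + 468 * 3 + 21 * 3 * 2  # 1662 key-point values per frame
NUM_CLASSES = 20                              # twenty isolated dynamic EthSL signs

# Stacked GRU layers summarize the key-point sequence; dense layers classify it.
model = models.Sequential([
    layers.Input(shape=(NUM_FRAMES, NUM_FEATURES)),
    layers.GRU(64, return_sequences=True),
    layers.GRU(128),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["categorical_accuracy"])

Training would then call model.fit on key-point tensors of shape (num_samples, 30, 1662) with one-hot sign labels; signer-independent evaluation holds out all videos of some signers for the test split.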
dc.description.sponsorship Haramaya University en_US
dc.language.iso en en_US
dc.publisher Haramaya University en_US
dc.subject Manual signs, Non-manual signs, Automatic Sign Language Recognition, Skeleton key-points, Deep Neural Networks, Signer-independence en_US
dc.title SKELETON-BASED ETHIOPIAN SIGN LANGUAGE RECOGNITION USING DEEP LEARNING en_US
dc.type Thesis en_US

