AHWR-Net: offline handwritten amharic word recognition using convolutional recurrent neural network
Abstract: Amharic is the official language of the Federal Government of Ethiopia, with more than 27 million speakers. It uses an Ethiopic script, which has 238 core and 27 labialized characters. It is a low-resourced language, and only a few attempts have been made so far for its handwritten text reco...
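The full abstract of this record (see the description field) names connectionist temporal classification (CTC) as the loss applied on top of the CNN and RNN stages. As a rough illustration of how the per-frame outputs of a CTC-trained recognizer are commonly turned into a character sequence, here is a minimal best-path (greedy) decoding sketch in Python. The blank index, the three-character toy alphabet, and the score matrix are hypothetical; this is not the authors' implementation, and the record does not say which decoder they used.

```python
from itertools import groupby
from typing import Dict, List, Sequence

BLANK = 0  # assumption: label 0 is reserved for the CTC blank


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Best-path decoding: argmax per frame, merge repeats, drop blanks."""
    # Most likely label at every time step.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    # Collapse runs of the same label into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove the blank label that separates characters.
    return [label for label in collapsed if label != blank]


# Hypothetical toy alphabet (three Ethiopic characters) for illustration only.
ALPHABET: Dict[int, str] = {1: "ሰ", 2: "ላ", 3: "ም"}

# Hypothetical per-frame scores over [blank, ሰ, ላ, ም].
scores = [
    [0.10, 0.80, 0.05, 0.05],  # ሰ
    [0.10, 0.70, 0.10, 0.10],  # ሰ again, merged with the previous frame
    [0.90, 0.03, 0.03, 0.04],  # blank
    [0.10, 0.10, 0.70, 0.10],  # ላ
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.10, 0.70],  # ም
]

decoded = ctc_greedy_decode(scores)
word = "".join(ALPHABET[label] for label in decoded)
print(decoded, word)
```

Greedy decoding is the simplest choice; beam search or lexicon-constrained decoding are common alternatives when a word list is available.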
Main Authors: | Fetulhak Abdurahman, Eyob Sisay, Kinde Anlay Fante |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2021-07-01 |
Series: | SN Applied Sciences |
Subjects: | Amharic, CNN, CTC, Handwritten, LSTM, Recognition |
Online Access: | https://doi.org/10.1007/s42452-021-04742-x |
id |
doaj-420633850f0a4762ba0b9a38bea24acf |
---|---|
record_format |
Article |
spelling |
doaj-420633850f0a4762ba0b9a38bea24acf; 2021-08-01T11:14:24Z; eng; Springer; SN Applied Sciences; 2523-3963, 2523-3971; 2021-07-01; 3; 8; 1; 11; 10.1007/s42452-021-04742-x; AHWR-Net: offline handwritten amharic word recognition using convolutional recurrent neural network; Fetulhak Abdurahman (0), Eyob Sisay (1), Kinde Anlay Fante (2); (0) Faculty of Electrical and Computer Engineering, JiT, Jimma University; (1) Faculty of Electrical and Computer Engineering, JiT, Jimma University; (2) Faculty of Electrical and Computer Engineering, JiT, Jimma University; Abstract: Amharic is the official language of the Federal Government of Ethiopia, with more than 27 million speakers. It uses an Ethiopic script, which has 238 core and 27 labialized characters. It is a low-resourced language, and only a few attempts have been made so far at its handwritten text recognition. However, Amharic handwritten text recognition is challenging due to the very high similarity between characters. This paper presents a convolutional recurrent neural network-based system for offline handwritten Amharic word recognition. The proposed framework comprises convolutional neural networks (CNNs) for feature extraction from input word images, recurrent neural networks (RNNs) for sequence encoding, and connectionist temporal classification (CTC) as the loss function. We designed a custom CNN model for robust feature extraction from handwritten Amharic word images and compared its performance with three state-of-the-art CNN models, DenseNet-121, ResNet-50, and VGG-19, after modifying their architectures to fit our problem domain. We conducted detailed experiments with different CNN and RNN architectures and input word image sizes, and applied data augmentation techniques to enhance the performance of the proposed models. We prepared a handwritten Amharic word dataset, HARD-I, which is publicly available to researchers. In experiments with various recognition models on our dataset, our best-performing model achieved a word error rate (WER) of 5.24% and a character error rate (CER) of 1.15%. The proposed models achieve competitive performance compared with existing models for offline handwritten Amharic word recognition.; https://doi.org/10.1007/s42452-021-04742-x; Amharic, CNN, CTC, Handwritten, LSTM, Recognition |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Fetulhak Abdurahman, Eyob Sisay, Kinde Anlay Fante |
author_sort |
Fetulhak Abdurahman |
title |
AHWR-Net: offline handwritten amharic word recognition using convolutional recurrent neural network |
publisher |
Springer |
series |
SN Applied Sciences |
issn |
2523-3963, 2523-3971 |
publishDate |
2021-07-01 |
description |
Abstract: Amharic is the official language of the Federal Government of Ethiopia, with more than 27 million speakers. It uses an Ethiopic script, which has 238 core and 27 labialized characters. It is a low-resourced language, and only a few attempts have been made so far at its handwritten text recognition. However, Amharic handwritten text recognition is challenging due to the very high similarity between characters. This paper presents a convolutional recurrent neural network-based system for offline handwritten Amharic word recognition. The proposed framework comprises convolutional neural networks (CNNs) for feature extraction from input word images, recurrent neural networks (RNNs) for sequence encoding, and connectionist temporal classification (CTC) as the loss function. We designed a custom CNN model for robust feature extraction from handwritten Amharic word images and compared its performance with three state-of-the-art CNN models, DenseNet-121, ResNet-50, and VGG-19, after modifying their architectures to fit our problem domain. We conducted detailed experiments with different CNN and RNN architectures and input word image sizes, and applied data augmentation techniques to enhance the performance of the proposed models. We prepared a handwritten Amharic word dataset, HARD-I, which is publicly available to researchers. In experiments with various recognition models on our dataset, our best-performing model achieved a word error rate (WER) of 5.24% and a character error rate (CER) of 1.15%. The proposed models achieve competitive performance compared with existing models for offline handwritten Amharic word recognition. |
topic |
Amharic, CNN, CTC, Handwritten, LSTM, Recognition |
url |
https://doi.org/10.1007/s42452-021-04742-x |
_version_ |
1721246084946198528 |
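
The abstract in this record reports results as WER and CER. For readers unfamiliar with these metrics, the sketch below shows one common way to compute them for isolated-word recognition: CER as total character edit distance divided by total reference characters, and WER as the share of words not recognized exactly. These definitions are an assumption here, since the record does not state the exact formulas the authors used, and the Latin-letter word pairs are made-up placeholders, not data from HARD-I.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(pairs: List[Tuple[str, str]]) -> float:
    """CER in percent over (reference, hypothesis) word pairs."""
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    return 100.0 * errors / total_chars


def word_error_rate(pairs: List[Tuple[str, str]]) -> float:
    """WER in percent: share of words that are not an exact match."""
    wrong = sum(1 for ref, hyp in pairs if ref != hyp)
    return 100.0 * wrong / len(pairs)


# Hypothetical (reference, hypothesis) pairs for illustration only.
pairs = [
    ("selam", "selam"),
    ("amharic", "amharik"),
    ("word", "ward"),
    ("net", "net"),
]

cer = character_error_rate(pairs)
wer = word_error_rate(pairs)
print(f"CER = {cer:.2f}%  WER = {wer:.2f}%")
```

Because a single wrong character makes the whole word count as an error, WER is typically higher than CER on the same predictions, which matches the pattern of the figures reported in the abstract.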