Search | VHL Regional Portal

A comparison of deep transfer learning backbone architecture techniques for printed text detection of different font styles from unstructured documents.

Mahadevkar, Supriya; Patil, Shruti; Kotecha, Ketan; Abraham, Ajith.

PeerJ Comput Sci ; 10: e1769, 2024.

Article in English | MEDLINE | ID: mdl-38686011

ABSTRACT

Object detection methods based on deep learning have been used in a variety of sectors, including banking, healthcare, e-governance, and academia. In recent years, considerable research attention has been paid to text detection and recognition from different scenes or images in unstructured document processing. The article's novelty lies in the detailed discussion and implementation of various transfer learning-based backbone architectures for printed text recognition. In this research article, the authors compared the ResNet50, ResNet50V2, ResNet152V2, Inception, Xception, and VGG19 backbone architectures, with preprocessing techniques such as data resizing, normalization, and noise removal, on a standard OCR Kaggle dataset. The top three backbone architectures were then selected based on the accuracy achieved, and hyperparameter tuning was performed to achieve more accurate results. Xception performed well compared with the ResNet, Inception, VGG19, and MobileNet architectures, achieving high evaluation scores with accuracy (98.90%) and minimum loss (0.19). According to existing research in this domain, transfer learning-based backbone architectures applied to printed or handwritten data recognition are not yet well represented in the literature. We split the total dataset into 80 percent for training and 20 percent for testing, fed it into the different backbone architecture models with the same number of epochs, and found that the Xception architecture achieved higher accuracy than the others. In addition, the ResNet50V2 model gave us higher accuracy (96.92%) than the ResNet152V2 model (96.34%).
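The evaluation protocol described in this abstract (an 80/20 split, then ranking backbones by accuracy) might be sketched as follows. This is a minimal illustration and not the authors' code: the function names `split_80_20` and `top_k_backbones`, the shuffling seed, and the placeholder dataset of 1000 items are all assumptions. Only the three accuracy figures come from the abstract.

```python
import random


def split_80_20(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of `samples` and split it into train and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed (assumed) for repeatability
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def top_k_backbones(accuracy_by_backbone, k=3):
    """Return the k backbone names with the highest accuracy, best first."""
    ranked = sorted(
        accuracy_by_backbone.items(), key=lambda item: item[1], reverse=True
    )
    return [name for name, _ in ranked[:k]]


# Accuracies (percent) reported in the abstract; scores for the other
# backbones are not given there, so they are left out rather than guessed.
reported = {"ResNet152V2": 96.34, "Xception": 98.90, "ResNet50V2": 96.92}

# Placeholder dataset of 1000 item indices (hypothetical size).
train, test = split_80_20(range(1000))
best = top_k_backbones(reported, k=1)
```

In an actual experiment, each backbone would be trained on `train` for the same number of epochs and scored on `test` before ranking; the sketch only shows the bookkeeping around that step.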

Enhancement of handwritten text recognition using AI-based hybrid approach.

Mahadevkar, Supriya; Patil, Shruti; Kotecha, Ketan.

MethodsX ; 12: 102654, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38510932

ABSTRACT

Handwritten text recognition (HTR) within computer vision and image processing stands as a prominent and challenging research domain, holding significant implications for diverse applications. Among these, it finds usefulness in reading bank checks, prescriptions, and deciphering characters on various forms. Optical character recognition (OCR) technology, specifically tailored for handwritten documents, plays a pivotal role in translating characters from a range of file formats, encompassing both word and image documents. Challenges in HTR encompass intricate layout designs, varied handwriting styles, limited datasets, and low achieved accuracy. Recent advancements in Deep Learning and Machine Learning algorithms, coupled with the vast repositories of unprocessed data, have propelled researchers to achieve remarkable progress in HTR. This paper aims to address the challenges in handwritten text recognition by proposing a hybrid approach. The primary objective is to enhance the accuracy of recognizing handwritten text from images. Through the integration of Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) with a Connectionist Temporal Classification (CTC) decoder, the results indicate substantial improvement. The proposed hybrid model achieved an impressive 98.50% and 98.80% accuracy on the IAM and RIMES datasets, respectively. This underscores the potential and efficacy of the consecutive use of these advanced neural network architectures in enhancing handwritten text recognition accuracy.

• The proposed method introduces a hybrid approach for handwritten text recognition, employing CNN and BiLSTM with a CTC decoder.

• Results showcase a remarkable accuracy of 98.50% and 98.80% on the IAM and RIMES datasets, emphasizing the potential of this model for enhanced accuracy in recognizing handwritten text from images.
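To illustrate the role of the CTC decoder at the end of a CNN and BiLSTM pipeline, here is a minimal sketch of greedy (best-path) CTC decoding. The abstract does not say which decoding strategy the authors used, so greedy decoding is an assumption, as are the function name `ctc_greedy_decode`, the convention that class index 0 is the blank symbol, and the toy alphabet and scores.

```python
BLANK = 0  # assumed convention: class index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet):
    """Greedy CTC decoding of per-frame class scores.

    frame_scores: one list of scores per time step, with index 0 for the
                  blank and index i (i >= 1) for alphabet[i - 1].
    """
    # 1. Pick the highest-scoring class at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # 2. Collapse consecutive repeats, then drop blanks.
    decoded = []
    previous = BLANK
    for index in best_path:
        if index != BLANK and index != previous:
            decoded.append(alphabet[index - 1])
        previous = index
    return "".join(decoded)


# Toy scores over (blank, "a", "b", "c"); the best path is 1, 1, 0, 1, 2.
frames = [
    [0.10, 0.70, 0.10, 0.10],
    [0.20, 0.60, 0.10, 0.10],
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.10, 0.10, 0.70, 0.10],
]
text = ctc_greedy_decode(frames, "abc")  # → "aab"
```

The blank between the two runs of "a" is what lets CTC emit a doubled letter; without it the repeated frames would collapse into a single character.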
