Results 1 - 20 of 117
1.
New Microbes New Infect ; 62: 101469, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39282140

ABSTRACT

Background: Collecting and standardizing clinical research data is a tedious task. This study aims to develop an intelligent data collection tool, named CHB-EDC, for real-world cohort studies of chronic hepatitis B (CHB) that can assist in standardized and efficient data collection. Methods: CHB-EDC is capable of automatically processing data in various formats, including raw data in image format, using internationally recognized data standards, OCR, and NLP models. It can automatically populate the data into eCRFs designed in the REDCap system and supports the integration of patient data from electronic medical record systems through commonly used web application interfaces. The tool enables intelligent extraction and aggregation of data, as well as secure and anonymous data sharing. Results: For non-electronic data collection, the average accuracy of manual collection was 98.65 %, with an average time of 63.64 min to collect information for one patient. The average accuracy of CHB-EDC was 98.66 %, with an average time of 3.57 min per patient. In the same data collection task, CHB-EDC achieved an average accuracy comparable to manual collection but required significantly less time (p < 0.05), substantially reducing collection time and cost while ensuring accuracy. Conclusion: The tool significantly improves the efficiency of data collection while ensuring accuracy, enabling standardized collection of real-world data.
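
The abstract describes an OCR-plus-NLP pipeline that populates REDCap eCRFs through its web API. As a rough, hedged sketch of that flow (not the authors' implementation), the snippet below OCRs a scanned report, extracts one value with a regex in place of the paper's NLP models, and imports the record via REDCap's standard record-import API; the endpoint URL, API token, and field names are illustrative assumptions.

```python
import json
import re

import pytesseract
import requests
from PIL import Image

# OCR a scanned lab report (image path and field names are hypothetical).
text = pytesseract.image_to_string(Image.open("chb_lab_report.png"), lang="eng")

# Pull one value with a regex; a production tool would use NLP/NER models instead.
alt_match = re.search(r"ALT[^\d]*(\d+(?:\.\d+)?)", text)
record = {
    "record_id": "P0001",
    "alt_u_l": alt_match.group(1) if alt_match else "",
}

# Import the record into a REDCap project through its standard API.
response = requests.post(
    "https://redcap.example.org/api/",  # assumed endpoint
    data={
        "token": "YOUR_API_TOKEN",      # project-specific token
        "content": "record",
        "format": "json",
        "type": "flat",
        "data": json.dumps([record]),
    },
    timeout=30,
)
print(response.status_code, response.text)
```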

2.
Data Brief ; 56: 110813, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39252777

ABSTRACT

Deep learning-based Optical Character Recognition (OCR) is an active area of research in which models based on deep neural networks are trained to extract the text contained in an image. Although many advances are being made in this area in general, the Arabic OCR domain notably lacks a dataset for ancient manuscripts. Here, we fill this gap by providing both the images and the textual ground truth for a collection of ancient Arabic manuscripts. The dataset was collected from the central library of the Islamic University of Madinah and encompasses rich text spanning different geographies across centuries. Specifically, it contains eight ancient books with a total of forty pages, provided as both images and expert-transcribed text. This dataset holds significant value because such data is not otherwise publicly available, and it supports the development, augmentation, validation, testing, and generalization of deep learning models by researchers and practitioners, both for Arabic OCR and for Arabic text correction.

3.
Heliyon ; 10(16): e35959, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39229500

ABSTRACT

The Pegon script is an Arabic-based writing system used for Javanese, Sundanese, Madurese, and Indonesian languages. Due to various reasons, this script is now mainly found among collectors and private Islamic boarding schools (pesantren), creating a need for its preservation. One preservation method is digitization through transcription into machine-encoded text, known as OCR (Optical Character Recognition). No published literature exists on OCR systems for this specific script. This research explores the OCR of Pegon typed manuscripts, introducing novel synthesized and real annotated datasets for this task. These datasets evaluate proposed OCR methods, especially those adapted from existing Arabic OCR systems. Results show that deep learning techniques outperform conventional ones, which fail to detect Pegon text. The proposed system uses YOLOv5 for line segmentation and a CTC-CRNN architecture for line text recognition, achieving an F1-score of 0.94 for segmentation and a CER of 0.03 for recognition.
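
The abstract names a YOLOv5 line segmenter followed by a CTC-trained CRNN for line recognition. The PyTorch sketch below shows the general shape of such a CTC-CRNN recognizer only; the layer sizes, class count, and image dimensions are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """Minimal CRNN: CNN feature extractor -> BiLSTM -> per-timestep classifier."""
    def __init__(self, num_classes: int, img_height: int = 32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
        )
        feat_h = img_height // 4
        self.rnn = nn.LSTM(128 * feat_h, 256, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(512, num_classes)  # num_classes includes the CTC blank

    def forward(self, x):                       # x: (N, 1, H, W)
        f = self.cnn(x)                         # (N, C, H/4, W/4)
        n, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(n, w, c * h)  # (N, T, C*H)
        out, _ = self.rnn(f)
        return self.fc(out).log_softmax(2)      # (N, T, num_classes)

model = CRNN(num_classes=80)                    # class count is an assumption
ctc = nn.CTCLoss(blank=0, zero_infinity=True)
images = torch.randn(4, 1, 32, 256)             # dummy batch of line images
log_probs = model(images).permute(1, 0, 2)      # CTCLoss expects (T, N, C)
targets = torch.randint(1, 80, (4, 20))         # dummy label sequences (no blanks)
input_lengths = torch.full((4,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((4,), 20, dtype=torch.long)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```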

4.
Data Brief ; 56: 110783, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39252768

ABSTRACT

This dataset presents a comprehensive collection of handwritten Grantha characters, comprising numbers and vowels, gathered from participants spanning diverse age groups. Participants were instructed to handwrite Grantha characters on standard A4 sheets. The Grantha script encompasses 10 numbers and 34 vowels, so the dataset covers 44 distinct characters. For each number and each vowel, 133 handwritten samples were collected. These samples underwent digitization and preprocessing steps, including segmentation, resizing, and grayscale conversion. The final dataset consists of 5852 images: 1330 samples for numbers and 4522 samples for vowels. The data is provided in both image and CSV formats, accompanied by corresponding labels, facilitating its use in machine learning model development. With few datasets available for the Grantha script, this contribution addresses a significant gap by providing a benchmark dataset for Grantha numeral and vowel recognition. Moreover, this novel dataset serves as a fundamental resource for machine learning research on Indian languages with historical connections to the Grantha script.
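
A hedged sketch of the preprocessing and export steps the abstract lists (grayscale conversion, resizing, and CSV export with labels). The directory layout, output resolution, and file names are assumptions for illustration, not the dataset's actual parameters.

```python
import csv
import pathlib

import cv2

SIZE = 32  # output resolution is an assumption; the dataset may use a different size
pathlib.Path("grantha_processed").mkdir(exist_ok=True)

rows = []
# Assumed layout: grantha_raw/<class_name>/<sample>.png
for path in sorted(pathlib.Path("grantha_raw").glob("*/*.png")):
    label = path.parent.name
    img = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)   # grayscale conversion
    img = cv2.resize(img, (SIZE, SIZE))                  # resizing
    cv2.imwrite(f"grantha_processed/{label}_{path.stem}.png", img)
    rows.append([label] + img.flatten().tolist())        # label + pixel values

# CSV export: one row per sample, first column is the class label.
with open("grantha_characters.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```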

5.
PeerJ Comput Sci ; 10: e2124, 2024.
Article in English | MEDLINE | ID: mdl-39145239

ABSTRACT

Pashtu is one of the most widely spoken languages in South and Central Asia. Recognition of Pashtu numerals poses challenges due to the script's cursive nature. Nevertheless, a machine learning-based optical character recognition (OCR) model can be an effective way to tackle this problem. The main aim of the study is to propose an optimized machine learning model that can efficiently identify the Pashtu numerals 0-9. The methodology includes organizing the data into directories, each representing a label. The data are then preprocessed: images are resized to 32 × 32 pixels, normalized by dividing pixel values by 255, and reshaped for model input. The dataset was split in an 80:20 ratio. Optimized hyperparameters were then selected for the LSTM and CNN models by trial and error. The models were evaluated using accuracy and loss graphs, classification reports, and confusion matrices. The results indicate that the proposed LSTM model slightly outperforms the proposed CNN model, with a macro-averaged precision of 0.9877, recall of 0.9876, and F1 score of 0.9876. Both models demonstrate strong performance in recognizing Pashtu numerals, achieving an accuracy of nearly 98%, with the LSTM model holding a marginal advantage over the CNN model.
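
A hedged sketch of the preprocessing and split described above (resize to 32 × 32, divide pixel values by 255, reshape, 80:20 split). The directory layout and the Keras/scikit-learn utilities are assumptions; the paper's own loading code is not shown.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Assumed layout: pashtu_digits/<label>/<sample>.png, one folder per digit 0-9.
ds = keras.utils.image_dataset_from_directory(
    "pashtu_digits", image_size=(32, 32), color_mode="grayscale", batch_size=None
)
images = np.stack([x.numpy() for x, _ in ds])
labels = np.array([int(y.numpy()) for _, y in ds])

images = images.astype("float32") / 255.0   # normalize pixel values to [0, 1]
images = images.reshape(-1, 32, 32, 1)      # reshape for model input

# 80:20 train/test split, as described in the abstract.
x_train, x_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, random_state=42, stratify=labels
)
print(x_train.shape, x_test.shape)
```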

6.
Sci Rep ; 14(1): 14389, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38909147

ABSTRACT

Vehicle identification systems are vital components that enable many aspects of contemporary life, such as safety, trade, transit, and law enforcement. They improve community and individual well-being by increasing vehicle management, security, and transparency. These tasks entail locating and extracting license plates from images or video frames using computer vision and machine learning techniques, followed by recognizing the letters or digits on the plates. This paper proposes a new license plate detection and recognition method based on the deep learning YOLOv8 model, image processing techniques, and OCR for text recognition. The first step was dataset creation, gathering 270 images from the internet. The dataset was then annotated using CVAT (Computer Vision Annotation Tool), an open-source platform for annotating and labeling images and videos. The newly released YOLOv8 was then employed to detect the number-plate region in the input image. After extracting the plate, the k-means clustering algorithm, thresholding techniques, and the opening morphological operation were used to enhance the image and make the license-plate characters clearer before OCR was applied to extract the characters. Finally, a text file containing the characters that identify the vehicle's country is generated. To assess the efficiency of the proposed approach, several metrics were employed, namely precision, recall, F1-score, and CLA, and the proposed method was compared with existing techniques in the literature. The suggested method obtained convincing results in both detection and recognition, achieving an accuracy of 99% in detection and 98% in character recognition.
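
As a rough, hedged sketch of the pipeline described above (YOLOv8 plate detection, thresholding and morphological opening, then OCR), the snippet below uses the ultralytics and pytesseract packages. The fine-tuned weights file and input image are hypothetical, and the k-means step is omitted, so this is not the authors' exact method.

```python
import cv2
import pytesseract
from ultralytics import YOLO

# YOLOv8 model fine-tuned for plate detection (weights path is assumed).
model = YOLO("plate_yolov8.pt")
frame = cv2.imread("car.jpg")

results = model(frame)
for box in results[0].boxes.xyxy:                        # each detected plate region
    x1, y1, x2, y2 = map(int, box.tolist())
    plate = frame[y1:y2, x1:x2]

    # Enhancement steps named in the abstract: thresholding + opening.
    gray = cv2.cvtColor(plate, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    cleaned = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)

    # OCR the cleaned plate crop as a single text line.
    text = pytesseract.image_to_string(cleaned, config="--psm 7")
    print("Plate:", text.strip())
```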

7.
Data Brief ; 54: 110473, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38774242

ABSTRACT

About 26 million people worldwide use the Saraiki language [1]. It is extensively spoken in the southern parts of Punjab and Sindh, and one of the most important Saraiki cultural hubs is Dera Ghazi Khan, where over 90 % of the population speaks Saraiki. Calligraphers use a sophisticated script to write this language. Despite the vast body of Optical Character Recognition (OCR) literature and research dedicated to other languages, a fully functional OCR system is still needed for the Saraiki language [2,3]. This work presents a genuine dataset of Saraiki handwritten characters, consisting of 50,000 scanned images, and makes it publicly accessible. All of the images contain handwritten text contributed by teachers and students of Pak-Austria Fachhochschule for Applied Sciences and Technology, Pakistan. Around 1000 people, roughly half men and half women, contributed to writing this text. The dataset will be made accessible to the general public for scientific research.

8.
PeerJ Comput Sci ; 10: e1925, 2024.
Article in English | MEDLINE | ID: mdl-38660206

ABSTRACT

This article introduces a recognition system for handwritten text in the Pashto language, representing the first attempt to establish a baseline system using the Pashto Handwritten Text Imagebase (PHTI) dataset. Initially, the PHTI dataset was pre-processed to eliminate unwanted characters; the dataset was then divided into training (70%), validation (15%), and test (15%) sets. The proposed recognition system is based on multi-dimensional long short-term memory (MD-LSTM) networks. A comprehensive empirical analysis was conducted to determine the optimal parameters for the proposed MD-LSTM architecture, and comparative experiments were used to evaluate the proposed system against state-of-the-art models on the PHTI dataset. The novelty of our proposed model, compared to other state-of-the-art models, lies in its hidden layer sizes (10, 20, 80) and its Tanh layer sizes (20, 40). The system achieves a Character Error Rate (CER) of 20.77% as a baseline on the test set. The top 20 confusions are reported to illustrate the performance and limitations of the proposed model. The results highlight the challenges and future prospects of the Pashto language's digital transition.
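
Character Error Rate, the metric reported above, is the edit distance between the recognized and reference strings divided by the reference length. A small self-contained sketch of that computation (not tied to the paper's code) is given below.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance between two strings via dynamic programming."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def cer(references, hypotheses) -> float:
    """Character Error Rate: total edit distance / total reference characters."""
    edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    chars = sum(len(r) for r in references)
    return edits / chars

print(cer(["hello"], ["helo"]))  # one deletion over five characters = 0.2
```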

9.
Front Neurosci ; 18: 1362567, 2024.
Article in English | MEDLINE | ID: mdl-38680450

ABSTRACT

Handwritten character recognition is one of the classical problems in the field of image classification. Supervised learning techniques using deep learning models are highly effective for handwritten character recognition, but they require a large dataset of labeled samples to achieve good accuracy. Recent supervised learning techniques for Kannada handwritten character recognition achieve state-of-the-art accuracy and perform well over a wide range of input variations. In this work, a framework is proposed for the Kannada language that incorporates techniques from semi-supervised learning. The framework extracts features with a convolutional neural network backbone, uses regularization to improve the learned features, and applies label propagation to classify previously unseen characters. An episodic learning setup is used to validate the framework: twenty-four classes are used for pre-training, 12 classes for testing, and 11 classes for validation. Fine-tuning is tested using one example per unseen class and five examples per unseen class. The components of the network are implemented in Python using the PyTorch library. The obtained accuracy of 99.13% makes this framework competitive with currently available supervised learning counterparts, despite the large reduction in the number of labeled samples available for the novel classes.
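
The combination of CNN feature extraction and label propagation can be illustrated with off-the-shelf components. The sketch below uses a frozen torchvision ResNet-18 as the feature extractor and scikit-learn's LabelPropagation; the backbone choice, kernel settings, and file paths are assumptions and do not reproduce the paper's framework.

```python
import numpy as np
import torch
from PIL import Image
from sklearn.semi_supervised import LabelPropagation
from torchvision import models, transforms

# Frozen CNN backbone used purely as a feature extractor (backbone choice assumed).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # drop the classifier head -> 512-d features
backbone.eval()

prep = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

def embed(paths):
    """Return one feature vector per image path."""
    with torch.no_grad():
        batch = torch.stack([prep(Image.open(p).convert("RGB")) for p in paths])
        return backbone(batch).numpy()

# Hypothetical sample paths; label -1 marks unlabeled samples.
image_paths = ["kannada/sample_000.png", "kannada/sample_001.png"]
labels = np.array([3, -1])

features = embed(image_paths)
model = LabelPropagation(kernel="rbf", gamma=0.25)
model.fit(features, labels)
print(model.transduction_)          # propagated labels for every sample
```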

10.
Sensors (Basel) ; 24(7)2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38610392

ABSTRACT

The decipherment of ancient Chinese scripts, such as oracle bone and bronze inscriptions, holds immense significance for understanding ancient Chinese history, culture, and civilization. Despite substantial progress in recognizing oracle bone script, research on the overall recognition of ancient Chinese characters remains somewhat lacking. To tackle this issue, we constructed a large-scale image dataset comprising 9233 distinct ancient Chinese characters sourced from images obtained through archaeological excavations, and we propose the first model for recognizing common ancient Chinese characters. The model consists of four stages with Linear Embedding and Swin-Transformer blocks, each supplemented by a CoT block to enhance local feature extraction. We also propose an enhancement strategy with two steps: first, adaptive data enhancement on the original data, and second, random resampling of the data. The experimental results, with a top-one accuracy of 87.25% and a top-five accuracy of 95.81%, demonstrate that the proposed method achieves strong performance. Furthermore, visualization of model attention shows that the proposed model, trained on a large number of images, is able to capture the morphological characteristics of ancient Chinese characters to a certain extent.
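
The paper's Swin-plus-CoT architecture is not reproduced here; as a hedged baseline sketch, the snippet below fine-tunes a stock Swin Transformer from the timm library for the 9233-class task, with the optimizer, learning rate, and dummy batch chosen arbitrarily.

```python
import timm
import torch

# Stock Swin-Transformer backbone; the paper's CoT-augmented variant is not reproduced.
model = timm.create_model(
    "swin_tiny_patch4_window7_224", pretrained=True, num_classes=9233
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)            # dummy batch of character images
targets = torch.randint(0, 9233, (8,))          # dummy class indices

logits = model(images)                          # one training step
loss = criterion(logits, targets)
loss.backward()
optimizer.step()
```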

11.
PeerJ Comput Sci ; 10: e1769, 2024.
Article in English | MEDLINE | ID: mdl-38686011

ABSTRACT

Object detection methods based on deep learning have been used in a variety of sectors including banking, healthcare, e-governance, and academia. In recent years, much attention has been paid to research on text detection and recognition from different scenes or images in unstructured document processing. The article's novelty lies in the detailed discussion and implementation of various transfer learning-based backbone architectures for printed text recognition. In this research article, the authors compared the ResNet50, ResNet50V2, ResNet152V2, Inception, Xception, and VGG19 backbone architectures, with preprocessing techniques such as data resizing, normalization, and noise removal, on a standard OCR Kaggle dataset. The top three backbone architectures were then selected based on the accuracy achieved, and hyperparameter tuning was performed to obtain more accurate results. Xception performed best among the ResNet, Inception, VGG19, and MobileNet architectures, achieving high evaluation scores with 98.90% accuracy and a minimum loss of 0.19. Transfer learning-based backbone architectures for printed or handwritten text recognition have so far not been well represented in the literature. We split the dataset into 80 percent for training and 20 percent for testing, trained the different backbone architectures for the same number of epochs, and found that the Xception architecture achieved higher accuracy than the others. In addition, the ResNet50V2 model gave higher accuracy (96.92%) than the ResNet152V2 model (96.34%).
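
Transfer learning with an Xception backbone, as compared above, typically means freezing ImageNet-pretrained convolutional weights and training a small classification head. The Keras sketch below shows that pattern; the input size, head layers, and class count are assumptions rather than the article's configuration.

```python
from tensorflow import keras

NUM_CLASSES = 36  # e.g. digits plus uppercase letters; the real class count is assumed

# Xception backbone with ImageNet weights, used as a frozen feature extractor.
base = keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(96, 96, 3), pooling="avg"
)
base.trainable = False

# Small trainable classification head on top of the frozen backbone.
model = keras.Sequential([
    base,
    keras.layers.Dense(256, activation="relu"),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```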

12.
Carbohydr Polym ; 332: 121932, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38431422

ABSTRACT

Conductive hydrogel-based sensors offer diverse applications in artificial intelligence, wearable electronic devices and character recognition management. However, it remains a significant challenge to maintain their satisfactory performances under extreme climatic conditions. Herein, a stretchable, self-adhesive, self-healing and environmentally stable conductive hydrogel was developed through free radical polymerization of hydroxyethyl acrylate (HEA) and poly(ethylene glycol) methacrylate (PEG) as the skeleton, followed by the incorporation of polyaniline-coated cellulose nanocrystal (CNC@PANI) as the conductive and reinforced nanofiller. Encouragingly, the as-prepared hydrogel (CHP) exhibited decent mechanical strength, satisfactory self-adhesion, prominent self-healing property (95.04 % after 60 s), excellent anti-freezing performance (below -60 °C) and outstanding moisture retention. The assembled sensor derived from CHP hydrogel possessed a low detection limit (0.5 % strain), high strain sensitivity (GF = 1.68) and fast response time (96 ms). Remarkably, even in harsh environmental temperatures from -60 °C to 80 °C, it reliably detected subtle and large-scale human motion for a long-term process (>10,000 cycles), manifesting its exceptional environmental tolerance. More interestingly, this hydrogel-based sensor could be assembled into a "writing board" for accurate handwritten numeral recognition. Therefore, the as-obtained multifunctional hydrogel could be a promising material applied in human motion detection and character recognition platforms even in harsh surroundings.

13.
Data Brief ; 52: 110038, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38333425

ABSTRACT

A dataset was created by collecting handwritten samples of distinct Kurdish characters. The dataset consists primarily of 58 characters, and approximately 3800 adult volunteers who are native Kurdish speakers participated in the collection process. Each participant was requested to fill in two rows of a character form printed on A4 landscape paper. These forms were divided into sets of four pages, with 18 columns and 10 rows of characters on each page, except for the fourth page in each set, which had 40 cells. To ensure a comprehensive dataset, over 760 sets were prepared and distributed across various universities and institutions. The collected samples underwent scanning, cropping, and preprocessing procedures following the characteristics established by the EMNIST project. The purpose of these procedures was to standardize the dataset and ensure uniformity in the representation of all characters.

14.
Data Brief ; 52: 109953, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38186736

ABSTRACT

This article focuses on the construction of a dataset for multilingual character recognition in Moroccan official documents. The dataset covers languages such as Arabic, French, and Tamazight and is built programmatically to ensure data diversity. It consists of sub-datasets such as the uppercase alphabet (26 classes), the lowercase alphabet (26 classes), digits (9 classes), Arabic letters (28 classes), Tifinagh letters (33 classes), symbols (14 classes), and French special characters (16 classes). The dataset construction process involves collecting representative fonts and generating multiple character images using a Python script, producing the comprehensive variety essential for robust recognition models. Moreover, this dataset contributes to the digitization of these diverse official documents and archival papers, which is essential for preserving cultural heritage and enabling advanced text recognition technologies. The need for this work arises from the advancements in character recognition techniques and the significance of large-scale annotated datasets. The proposed dataset contributes to the development of robust character recognition models for practical applications.
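
Generating character images programmatically from fonts, as described above, can be done with Pillow. The sketch below renders a few characters per script per font; the font files, their glyph coverage, the character lists, and the output layout are all illustrative assumptions.

```python
from pathlib import Path

from PIL import Image, ImageDraw, ImageFont

FONTS = ["fonts/Amiri-Regular.ttf", "fonts/DejaVuSans.ttf"]  # font files are assumed
CHARACTERS = {"arabic": "ابتثج", "tifinagh": "ⴰⴱⴳⴷⴹ", "digits": "0123456789"}
OUT = Path("generated_dataset")

for font_path in FONTS:
    font = ImageFont.truetype(font_path, size=48)
    for subset, chars in CHARACTERS.items():
        for ch in chars:
            img = Image.new("L", (64, 64), color=255)    # white background
            draw = ImageDraw.Draw(img)
            draw.text((32, 32), ch, fill=0, font=font, anchor="mm")  # centered glyph
            target = OUT / subset / f"{ord(ch):05x}"     # one folder per character
            target.mkdir(parents=True, exist_ok=True)
            img.save(target / f"{Path(font_path).stem}.png")
```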

15.
Behav Res Methods ; 56(6): 5732-5753, 2024 09.
Article in English | MEDLINE | ID: mdl-38114882

ABSTRACT

We present a psycholinguistic study investigating lexical effects on simplified Chinese character recognition by deaf readers. Prior research suggests that deaf readers exhibit efficient orthographic processing and decreased reliance on speech-based phonology in word recognition compared to hearing readers. In this large-scale character decision study (25 participants, each evaluating 2500 real characters and 2500 pseudo-characters), we analyzed various factors influencing character recognition accuracy and speed in deaf readers. Deaf participants demonstrated greater accuracy and faster recognition when characters were more frequent, were acquired earlier, had more strokes, displayed higher orthographic complexity, were more imageable in reference, or were less concrete in reference. Comparison with a previous study of hearing readers revealed that the facilitative effect of frequency on character decision accuracy was stronger for deaf readers than hearing readers. The effect of orthographic-phonological regularity differed significantly for the two groups, indicating that deaf readers rely more on orthographic structure and less on phonological information during character recognition. Notably, increased stroke counts (i.e., higher orthographic complexity) hindered hearing readers but facilitated recognition processes in deaf readers, suggesting that deaf readers excel at recognizing characters based on orthographic structure. The database generated from this large-scale character decision study offers a valuable resource for further research and practical applications in deaf education and literacy.


Subjects
Deafness, Reading, Humans, Male, Female, Deafness/physiopathology, Adult, Young Adult, Psycholinguistics/methods, China, Decision Making/physiology, Persons With Hearing Impairments/psychology, Language
16.
IEEE J Transl Eng Health Med ; 11: 523-535, 2023.
Article in English | MEDLINE | ID: mdl-38059065

ABSTRACT

OBJECTIVE: People with blindness and low vision face substantial challenges when navigating both indoor and outdoor environments. While various solutions are available to facilitate travel to and from public transit hubs, there is a notable absence of solutions for navigating within transit hubs, often referred to as the "middle mile". Although research pilots have explored the middle mile journey, no solutions exist at scale, leaving a critical gap for commuters with disabilities. In this paper, we propose a novel mobile application, Commute Booster, that offers full trip planning and real-time guidance inside the station. METHODS AND PROCEDURES: Our system consists of two key components: the general transit feed specification (GTFS) and optical character recognition (OCR). From the GTFS dataset, the system generates a comprehensive list of the wayfinding signage within subway stations that users will encounter during their intended journey. The OCR functionality enables users to identify relevant navigation signs in their immediate surroundings. By seamlessly integrating these two components, Commute Booster provides real-time feedback to users regarding the presence or absence of relevant navigation signs within the field of view of their phone camera during their journey. RESULTS: As part of our technical validation process, we conducted tests at three subway stations in New York City. The sign detection achieved an overall accuracy rate of 0.97. Additionally, the system exhibited a maximum detection range of 11 meters and supported an oblique angle of approximately 110 degrees for field-of-view detection. CONCLUSION: The Commute Booster mobile application relies on computer vision technology and does not require additional sensors or infrastructure. It holds tremendous promise in assisting individuals with blindness and low vision during their daily commutes. Clinical and Translational Impact Statement: Commute Booster translates the combination of OCR and GTFS into an assistive tool, which holds great promise for assisting people with blindness and low vision in their daily commute.
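
The core matching step, comparing OCR output from the phone camera against the signage list derived from GTFS, can be sketched as below. The GTFS file path, camera frame, and fuzzy-matching threshold are assumptions; the app's actual logic is more involved.

```python
import difflib

import pandas as pd
import pytesseract
from PIL import Image

# Stop names from the transit agency's GTFS feed (file path is assumed).
stops = pd.read_csv("gtfs/stops.txt")
expected_signs = stops["stop_name"].dropna().unique().tolist()

# OCR the current camera frame and compare each line against expected signage.
frame_text = pytesseract.image_to_string(Image.open("camera_frame.jpg"))
for line in filter(None, (l.strip() for l in frame_text.splitlines())):
    match = difflib.get_close_matches(line, expected_signs, n=1, cutoff=0.8)
    if match:
        print(f"Detected sign '{line}' matches expected waypoint '{match[0]}'")
```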


Subjects
Mobile Applications, Assistive Technology, Low Vision, Humans, Transportation, Blindness
17.
Front Neurosci ; 17: 1257611, 2023.
Article in English | MEDLINE | ID: mdl-38094002

ABSTRACT

Alternative paradigms to the von Neumann computing scheme are currently attracting considerable interest. Oscillatory neural networks (ONNs) using emerging phase-change materials like VO2 constitute an energy-efficient, massively parallel, brain-inspired, in-memory computing approach. Encoding information in the phase pattern of frequency-locked, weakly coupled oscillators makes it possible to exploit their rich non-linear dynamics and their synchronization phenomena for computing. A single fully connected ONN layer can implement an auto-associative memory comparable to that of a Hopfield network, so the Hebbian learning rule is the most widely adopted method for configuring ONNs for such applications, despite its well-known limitations. An extensive body of literature is available on learning in Hopfield networks, covering many learning algorithms that perform better than the Hebbian rule. However, not all of these algorithms are useful for ONN training due to the constraints imposed by their physical implementation. This paper evaluates different learning methods with respect to their suitability for ONNs and proposes a new approach, which is compared against previous works. The proposed method produces competitive pattern recognition accuracy with reduced precision in synaptic weights and is suitable for online learning.
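
The Hebbian rule mentioned above stores bipolar patterns as a sum of outer products and retrieves them by iterated sign updates. A minimal NumPy sketch of that rule for a conventional Hopfield memory (not an ONN implementation) follows.

```python
import numpy as np

def hebbian_weights(patterns: np.ndarray) -> np.ndarray:
    """Hebbian rule: W = (1/N) * sum_p x_p x_p^T, with zeroed diagonal."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W: np.ndarray, probe: np.ndarray, steps: int = 10) -> np.ndarray:
    """Synchronous sign-update retrieval from a noisy probe."""
    state = probe.copy()
    for _ in range(steps):
        h = W @ state
        state = np.where(h == 0, state, np.sign(h))  # keep state on a zero field
    return state

# Two bipolar (+1/-1) patterns stored in a 6-neuron network.
patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, -1, -1, 1, 1]], dtype=float)
W = hebbian_weights(patterns)
noisy = np.array([1, -1, 1, -1, -1, -1], dtype=float)   # pattern 0 with one flipped bit
print(recall(W, noisy))                                  # recovers pattern 0
```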

18.
Sensors (Basel) ; 23(24)2023 Dec 10.
Article in English | MEDLINE | ID: mdl-38139588

ABSTRACT

Smart parking is an artificial intelligence-based solution to the challenges of inefficient utilization of parking slots, wasted time, congestion that produces high CO2 emission levels, inflexible payment methods, and the protection of parked vehicles from theft and vandalism. Nothing is worse than parking congestion caused by drivers looking for open spaces. This is common in large parking lots, underground garages, and multi-story car parks, where visibility is limited and signage can be confusing or difficult to read, so drivers have no idea where available parking spaces are. In this paper, a smart real-time parking management system is introduced. The developed system deals with the aforementioned challenges by providing dynamic allocation of parking slots while taking the overall parking situation into consideration, providing a mechanism for booking a specific parking slot using our Artificial Intelligence (AI)-based application, and providing a mechanism to ensure that each car is parked in its correct place. To provide cost flexibility, we offer two technical solutions with differing costs: the first is based on a motion sensor and the second on a range-finder sensor. A plate detection and recognition system is used to detect the vehicle's license plate from an image captured by an IoT device; the system recognizes the extracted English letters and Hindu-Arabic numerals. The proposed solution was built and field-tested to prove the applicability of the proposed smart parking solution. We measured and analyzed key data such as vehicle plate detection accuracy, vehicle plate recognition accuracy, transmission delay time, and processing delay time.

20.
BMC Med Inform Decis Mak ; 23(1): 251, 2023 11 06.
Article in English | MEDLINE | ID: mdl-37932733

ABSTRACT

BACKGROUND: In the healthcare domain today, despite the substantial adoption of electronic health information systems, a significant proportion of medical reports still exist in paper-based formats. As a result, there is significant demand for the digitization of information from these paper-based reports. However, digitizing paper-based laboratory reports into a structured data format can be challenging due to their non-standard layouts, which include various data types such as text, numeric values, reference ranges, and units. Therefore, it is crucial to develop a highly scalable and lightweight technique that can effectively identify and extract information from laboratory test reports and convert them into a structured data format for downstream tasks. METHODS: We developed an end-to-end Natural Language Processing (NLP)-based pipeline for extracting information from paper-based laboratory test reports. Our pipeline consists of two main modules: an optical character recognition (OCR) module and an information extraction (IE) module. The OCR module locates and identifies text in scanned laboratory test reports using state-of-the-art OCR algorithms. The IE module then extracts meaningful information from the OCR results to form digitized tables of the test reports. The IE module consists of five sub-modules: time detection, headline position, line normalization, Named Entity Recognition (NER) with a Conditional Random Fields (CRF)-based method, and step detection for multi-column layouts. Finally, we evaluated the performance of the proposed pipeline on 153 laboratory test reports collected from Peking University First Hospital (PKU1). RESULTS: For the OCR module, we evaluated the accuracy of text detection and recognition at three different levels and achieved an average accuracy of 0.93. For the IE module, we extracted four laboratory test entities: test item name, test result, test unit, and reference value range. The overall F1 score is 0.86 on the 153 laboratory test reports collected from PKU1. With a single CPU, the average inference time per report is only 0.78 s. CONCLUSION: In this study, we developed a practical, lightweight pipeline to digitize and extract information from paper-based laboratory test reports of diverse types and with different layouts, which can be adopted in real clinical environments with minimal computing resource requirements. The high evaluation performance on the real-world hospital dataset validated the feasibility of the proposed pipeline.
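
The NER sub-module above uses a Conditional Random Field over OCR tokens. As a hedged, minimal sketch of that pattern (not the authors' feature set), the snippet below trains sklearn-crfsuite on a single illustrative report line with BIO labels for the four entity types named in the abstract.

```python
import sklearn_crfsuite

def token_features(tokens, i):
    """Simple surface features per token; real pipelines would use richer features."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_digit": tok.replace(".", "", 1).isdigit(),
        "has_dash": "-" in tok,                                  # ranges like 9-50
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

# One OCR'd report line with BIO labels (line and labels are illustrative only).
lines = [["ALT", "55", "U/L", "9-50"]]
labels = [["B-ITEM", "B-RESULT", "B-UNIT", "B-RANGE"]]

X = [[token_features(line, i) for i in range(len(line))] for line in lines]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X, labels)
print(crf.predict(X))
```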


Subjects
Algorithms, Natural Language Processing, Humans, Information Storage and Retrieval, University Hospitals, Electronic Health Records