Pesquisa | Portal Regional da BVS

Consecutive multiscale feature learning-based image classification model.

Olimov, Bekhzod; Subramanian, Barathi; Ugli, Rakhmonov Akhrorjon Akhmadjon; Kim, Jea-Soo; Kim, Jeonghong.

Sci Rep ; 13(1): 3595, 2023 Mar 03.

Artigo em Inglês | MEDLINE | ID: mdl-36869132

RESUMO

Extracting useful features at multiple scales is a crucial task in computer vision. The emergence of deep-learning techniques and the advancements in convolutional neural networks (CNNs) have facilitated effective multiscale feature extraction that results in stable performance improvements in numerous real-life applications. However, currently available state-of-the-art methods primarily rely on a parallel multiscale feature extraction approach, and despite exhibiting competitive accuracy, the models lead to poor results in efficient computation and low generalization on small-scale images. Moreover, efficient and lightweight networks cannot appropriately learn useful features, and this causes underfitting when training with small-scale images or datasets with a limited number of samples. To address these problems, we propose a novel image classification system based on elaborate data preprocessing steps and a carefully designed CNN model architecture. Specifically, we present a consecutive multiscale feature-learning network (CMSFL-Net) that employs a consecutive feature-learning approach based on the usage of various feature maps with different receptive fields to achieve faster training/inference and higher accuracy. In the conducted experiments using six real-life image classification datasets, including small-scale, large-scale, and limited data, the CMSFL-Net exhibits an accuracy comparable with those of existing state-of-the-art efficient networks. Moreover, the proposed system outperforms them in terms of efficiency and speed and achieves the best results in accuracy-efficiency trade-off.

An integrated mediapipe-optimized GRU model for Indian sign language recognition.

Subramanian, Barathi; Olimov, Bekhzod; Naik, Shraddha M; Kim, Sangchul; Park, Kil-Houm; Kim, Jeonghong.

Sci Rep ; 12(1): 11964, 2022 07 13.

Artigo em Inglês | MEDLINE | ID: mdl-35831393

RESUMO

Sign language recognition is challenged by problems, such as accurate tracking of hand gestures, occlusion of hands, and high computational cost. Recently, it has benefited from advancements in deep learning techniques. However, these larger complex approaches cannot manage long-term sequential data and they are characterized by poor information processing and learning efficiency in capturing useful information. To overcome these challenges, we propose an integrated MediaPipe-optimized gated recurrent unit (MOPGRU) model for Indian sign language recognition. Specifically, we improved the update gate of the standard GRU cell by multiplying it by the reset gate to discard the redundant information from the past in one screening. By obtaining feedback from the resultant of the reset gate, additional attention is shown to the present input. Additionally, we replace the hyperbolic tangent activation in standard GRUs with exponential linear unit activation and SoftMax with Softsign activation in the output layer of the GRU cell. Thus, our proposed MOPGRU model achieved better prediction accuracy, high learning efficiency, information processing capability, and faster convergence than other sequential models.

Assuntos

Gestos , Língua de Sinais , Humanos , Idioma , Reconhecimento Psicológico

RESUMO

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA