Pesquisa | Portal Regional da BVS (teste)

1.

Transforming Poultry Farming: A Pyramid Vision Transformer Approach for Accurate Chicken Counting in Smart Farm Environments.

Khanal, Ridip; Choi, Yoochan; Lee, Joonwhoan.

Sensors (Basel) ; 24(10)2024 May 08.

Artigo em Inglês | MEDLINE | ID: mdl-38793832

RESUMO

Smart farm environments, equipped with cutting-edge technology, require proficient techniques for managing poultry. This research investigates automated chicken counting, an essential part of optimizing livestock conditions. By integrating artificial intelligence and computer vision, it introduces a transformer-based chicken-counting model to overcome challenges to precise counting, such as lighting changes, occlusions, cluttered backgrounds, continual chicken growth, and camera distortions. The model includes a pyramid vision transformer backbone and a multi-scale regression head to predict precise density maps of the crowded chicken enclosure. The customized loss function incorporates curriculum loss, allowing the model to learn progressively, and adapts to diverse challenges posed by varying densities, scales, and appearances. The proposed annotated dataset includes data on various lighting conditions, chicken sizes, densities, and placements. Augmentation strategies enhanced the dataset with brightness, contrast, shadow, blur, occlusion, cropping, and scaling variations. Evaluating the model on the proposed dataset indicated its robustness, with a validation mean absolute error of 27.8, a root mean squared error of 40.9, and a test average accuracy of 96.9%. A comparison with the few-shot object counting model SAFECount demonstrated the model's superior accuracy and resilience. The transformer-based approach was 7.7% more accurate than SAFECount. It demonstrated robustness in response to different challenges that may affect counting and offered a comprehensive and effective solution for automated chicken counting in smart farm environments.

Assuntos

Inteligência Artificial , Galinhas , Fazendas , Animais , Criação de Animais Domésticos/métodos , Algoritmos , Agricultura/métodos

2.

A Review on Recent Deep Learning-Based Semantic Segmentation for Urban Greenness Measurement.

Lee, Doo Hong; Park, Hye Yeon; Lee, Joonwhoan.

Sensors (Basel) ; 24(7)2024 Mar 31.

Artigo em Inglês | MEDLINE | ID: mdl-38610456

RESUMO

Accurate urban green space (UGS) measurement has become crucial for landscape analysis. This paper reviews the recent technological breakthroughs in deep learning (DL)-based semantic segmentation, emphasizing efficient landscape analysis, and integrating greenness measurements. It explores quantitative greenness measures applied through semantic segmentation, categorized into the plan view- and the perspective view-based methods, like the Land Class Classification (LCC) with green objects and the Green View Index (GVI) based on street photographs. This review navigates from traditional to modern DL-based semantic segmentation models, illuminating the evolution of the urban greenness measures and segmentation tasks for advanced landscape analysis. It also presents the typical performance metrics and explores public datasets for constructing these measures. The results show that accurate (semantic) segmentation is inevitable not only for fine-grained greenness measures but also for the qualitative evaluation of landscape analyses for planning amidst the incomplete explainability of the DL model. Also, the unsupervised domain adaptation (UDA) in aerial images is addressed to overcome the scale changes and lack of labeled data for fine-grained greenness measures. This review contributes to helping researchers understand the recent breakthroughs in DL-based segmentation technology for challenging topics in UGS research.

3.

Open set classification of sound event.

You, Jie; Wu, Wenqin; Lee, Joonwhoan.

Sci Rep ; 14(1): 1282, 2024 Jan 13.

Artigo em Inglês | MEDLINE | ID: mdl-38218958

RESUMO

Sound is one of the primary forms of sensory information that we use to perceive our surroundings. Usually, a sound event is a sequence of an audio clip obtained from an action. The action can be rhythm patterns, music genre, people speaking for a few seconds, etc. The sound event classification address distinguishes what kind of audio clip it is from the given audio sequence. Nowadays, it is a common issue to solve in the following pipeline: audio pre-processingâperceptual feature extractionâclassification algorithm. In this paper, we improve the traditional sound event classification algorithm to identify unknown sound events by using the deep learning method. The compact cluster structure in the feature space for known classes helps recognize unknown classes by allowing large room to locate unknown samples in the embedded feature space. Based on this concept, we applied center loss and supervised contrastive loss to optimize the model. The center loss tries to minimize the intra- class distance by pulling the embedded feature into the cluster center, while the contrastive loss disperses the inter-class features from one another. In addition, we explored the performance of self-supervised learning in detecting unknown sound events. The experimental results demonstrate that our proposed open-set sound event classification algorithm and self-supervised learning approach achieve sustained performance improvements in various datasets.

4.

Detection of unknown strawberry diseases based on OpenMatch and two-head network for continual learning.

Jiang, Kan; You, Jie; Dorj, Ulzii-Orshikh; Kim, Hyongsuk; Lee, Joonwhoan.

Front Plant Sci ; 13: 989086, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-36186017

RESUMO

For continual learning in the process of plant disease recognition it is necessary to first distinguish between unknown diseases from those of known diseases. This paper deals with two different but related deep learning techniques for the detection of unknown plant diseases; Open Set Recognition (OSR) and Out-of-Distribution (OoD) detection. Despite the significant progress in OSR, it is still premature to apply it to fine-grained recognition tasks without outlier exposure that a certain part of OoD data (also called known unknowns) are prepared for training. On the other hand, OoD detection requires intentionally prepared outlier data during training. This paper analyzes two-head network included in OoD detection models, and semi-supervised OpenMatch associated with OSR technology, which explicitly and implicitly assume outlier exposure, respectively. For the experiment, we built an image dataset of eight strawberry diseases. In general, a two-head network and OpenMatch cannot be compared due to different training settings. In our experiment, we changed their training procedures to make them similar for comparison and show that modified training procedures resulted in reasonable performance, including more than 90% accuracy for strawberry disease classification as well as detection of unknown diseases. Accurate detection of unknown diseases is an important prerequisite for continued learning.

5.

Deep Metric Learning-Based Strawberry Disease Detection With Unknowns.

You, Jie; Jiang, Kan; Lee, Joonwhoan.

Front Plant Sci ; 13: 891785, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-35860535

RESUMO

There has been substantial research that has achieved significant advancements in plant disease detection based on deep object detection models. However, with unknown diseases, it is difficult to find a practical solution for plant disease detection. This study proposes a simple but effective strawberry disease detection scheme with unknown diseases that can provide applicable performance in the real field. In the proposed scheme, the known strawberry diseases are detected with deep metric learning (DML)-based classifiers along with the unknown diseases that have certain symptoms. The pipeline of our proposed scheme consists of two stages: the first is object detection with known disease classes, while the second is a DML-based post-filtering stage. The second stage has two different types of classifiers: one is softmax classifiers that are only for known diseases and the K-nearest neighbor (K-NN) classifier for both known and unknown diseases. In the training of the first stage and the DML-based softmax classifier, we only use the known samples of the strawberry disease. Then, we include the known (a priori) and the known unknown training samples to construct the K-NN classifier. The final decisions regarding known diseases are made from the combined results of the two classifiers, while unknowns are detected from the K-NN classifier. The experimental results show that the DML-based post-filter is effective at improving the performance of known disease detection in terms of mAP. Furthermore, the separate DML-based K-NN classifier provides high recall and precision for known and unknown diseases and achieve 97.8% accuracy, meaning it could be exploited as a Region of Interest (ROI) classifier. For the real field data, the proposed scheme achieves a high mAP of 93.7% to detect known classes of strawberry disease, and it also achieves reasonable results for unknowns. This implies that the proposed scheme can be applied to identify disease-like symptoms caused by real known and unknown diseases or disorders for any kind of plant.

6.

Transformer-Based Weed Segmentation for Grass Management.

Jiang, Kan; Afzaal, Usman; Lee, Joonwhoan.

Sensors (Basel) ; 23(1)2022 Dec 21.

Artigo em Inglês | MEDLINE | ID: mdl-36616662

RESUMO

Weed control is among the most challenging issues for crop cultivation and turf grass management. In addition to hosting various insects and plant pathogens, weeds compete with crop for nutrients, water and sunlight. This results in problems such as the loss of crop yield, the contamination of food crops and disruption in the field aesthetics and practicality. Therefore, effective and efficient weed detection and mapping methods are indispensable. Deep learning (DL) techniques for the rapid recognition and localization of objects from images or videos have shown promising results in various areas of interest, including the agricultural sector. Attention-based Transformer models are a promising alternative to traditional constitutional neural networks (CNNs) and offer state-of-the-art results for multiple tasks in the natural language processing (NLP) domain. To this end, we exploited these models to address the aforementioned weed detection problem with potential applications in automated robots. Our weed dataset comprised of 1006 images for 10 weed classes, which allowed us to develop deep learning-based semantic segmentation models for the localization of these weed classes. The dataset was further augmented to cater for the need of a large sample set of the Transformer models. A study was conducted to evaluate the results of three types of Transformer architectures, which included Swin Transformer, SegFormer and Segmenter, on the dataset, with SegFormer achieving final Mean Accuracy (mAcc) and Mean Intersection of Union (mIoU) of 75.18% and 65.74%, while also being the least computationally expensive, with just 3.7 M parameters.

Assuntos

Poaceae , Controle de Plantas Daninhas , Controle de Plantas Daninhas/métodos , Plantas Daninhas , Redes Neurais de Computação , Agricultura/métodos

7.

An Instance Segmentation Model for Strawberry Diseases Based on Mask R-CNN.

Afzaal, Usman; Bhattarai, Bhuwan; Pandeya, Yagya Raj; Lee, Joonwhoan.

Sensors (Basel) ; 21(19)2021 Sep 30.

Artigo em Inglês | MEDLINE | ID: mdl-34640893

RESUMO

Plant diseases must be identified at the earliest stage for pursuing appropriate treatment procedures and reducing economic and quality losses. There is an indispensable need for low-cost and highly accurate approaches for diagnosing plant diseases. Deep neural networks have achieved state-of-the-art performance in numerous aspects of human life including the agriculture sector. The current state of the literature indicates that there are a limited number of datasets available for autonomous strawberry disease and pest detection that allow fine-grained instance segmentation. To this end, we introduce a novel dataset comprised of 2500 images of seven kinds of strawberry diseases, which allows developing deep learning-based autonomous detection systems to segment strawberry diseases under complex background conditions. As a baseline for future works, we propose a model based on the Mask R-CNN architecture that effectively performs instance segmentation for these seven diseases. We use a ResNet backbone along with following a systematic approach to data augmentation that allows for segmentation of the target diseases under complex environmental conditions, achieving a final mean average precision of 82.43%.

Assuntos

Fragaria , Processamento de Imagem Assistida por Computador , Humanos , Redes Neurais de Computação , Doenças das Plantas

8.

Music video emotion classification using slow-fast audio-video network and unsupervised feature representation.

Pandeya, Yagya Raj; Bhattarai, Bhuwan; Lee, Joonwhoan.

Sci Rep ; 11(1): 19834, 2021 10 06.

Artigo em Inglês | MEDLINE | ID: mdl-34615904

RESUMO

Affective computing has suffered by the precise annotation because the emotions are highly subjective and vague. The music video emotion is complex due to the diverse textual, acoustic, and visual information which can take the form of lyrics, singer voice, sounds from the different instruments, and visual representations. This can be one reason why there has been a limited study in this domain and no standard dataset has been produced before now. In this study, we proposed an unsupervised method for music video emotion analysis using music video contents on the Internet. We also produced a labelled dataset and compared the supervised and unsupervised methods for emotion classification. The music and video information are processed through a multimodal architecture with audio-video information exchange and boosting method. The general 2D and 3D convolution networks compared with the slow-fast network with filter and channel separable convolution in multimodal architecture. Several supervised and unsupervised networks were trained in an end-to-end manner and results were evaluated using various evaluation metrics. The proposed method used a large dataset for unsupervised emotion classification and interpreted the results quantitatively and qualitatively in the music video that had never been applied in the past. The result shows a large increment in classification score using unsupervised features and information sharing techniques on audio and video network. Our best classifier attained 77% accuracy, an f1-score of 0.77, and an area under the curve score of 0.94 with minimum computational cost.

Assuntos

Emoções , Aprendizado de Máquina , Modelos Teóricos , Música , Gravação em Vídeo/classificação , Bases de Dados Factuais , Curva ROC

9.

Deep-Learning-Based Multimodal Emotion Classification for Music Videos.

Pandeya, Yagya Raj; Bhattarai, Bhuwan; Lee, Joonwhoan.

Sensors (Basel) ; 21(14)2021 Jul 20.

Artigo em Inglês | MEDLINE | ID: mdl-34300666

RESUMO

Music videos contain a great deal of visual and acoustic information. Each information source within a music video influences the emotions conveyed through the audio and video, suggesting that only a multimodal approach is capable of achieving efficient affective computing. This paper presents an affective computing system that relies on music, video, and facial expression cues, making it useful for emotional analysis. We applied the audio-video information exchange and boosting methods to regularize the training process and reduced the computational costs by using a separable convolution strategy. In sum, our empirical findings are as follows: (1) Multimodal representations efficiently capture all acoustic and visual emotional clues included in each music video, (2) the computational cost of each neural network is significantly reduced by factorizing the standard 2D/3D convolution into separate channels and spatiotemporal interactions, and (3) information-sharing methods incorporated into multimodal representations are helpful in guiding individual information flow and boosting overall performance. We tested our findings across several unimodal and multimodal networks against various evaluation metrics and visual analyzers. Our best classifier attained 74% accuracy, an f1-score of 0.73, and an area under the curve score of 0.926.

Assuntos

Aprendizado Profundo , Música , Emoções , Expressão Facial , Redes Neurais de Computação

10.

Improved Vision-Based Detection of Strawberry Diseases Using a Deep Neural Network.

Kim, Byoungjun; Han, You-Kyoung; Park, Jong-Han; Lee, Joonwhoan.

Front Plant Sci ; 11: 559172, 2020.

Artigo em Inglês | MEDLINE | ID: mdl-33584739

RESUMO

Detecting plant diseases in the earliest stages, when remedial intervention is most effective, is critical if damage crop quality and farm productivity is to be contained. In this paper, we propose an improved vision-based method of detecting strawberry diseases using a deep neural network (DNN) capable of being incorporated into an automated robot system. In the proposed approach, a backbone feature extractor named PlantNet, pre-trained on the PlantCLEF plant dataset from the LifeCLEF 2017 challenge, is installed in a two-stage cascade disease detection model. PlantNet captures plant domain knowledge so well that it outperforms a pre-trained backbone using an ImageNet-type public dataset by at least 3.2% in mean Average Precision (mAP). The cascade detector also improves accuracy by up to 5.25% mAP. The results indicate that PlantNet is one way to overcome the lack-of-annotated-data problem by applying plant domain knowledge, and that the human-like cascade detection strategy effectively improves the accuracy of automated disease detection methods when applied to strawberry plants.

11.

Geometric feature-based facial expression recognition in image sequences using multi-class AdaBoost and support vector machines.

Ghimire, Deepak; Lee, Joonwhoan.

Sensors (Basel) ; 13(6): 7714-34, 2013 Jun 14.

Artigo em Inglês | MEDLINE | ID: mdl-23771158

RESUMO

Facial expressions are widely used in the behavioral interpretation of emotions, cognitive science, and social interactions. In this paper, we present a novel method for fully automatic facial expression recognition in facial image sequences. As the facial expression evolves over time facial landmarks are automatically tracked in consecutive video frames, using displacements based on elastic bunch graph matching displacement estimation. Feature vectors from individual landmarks, as well as pairs of landmarks tracking results are extracted, and normalized, with respect to the first frame in the sequence. The prototypical expression sequence for each class of facial expression is formed, by taking the median of the landmark tracking results from the training facial expression sequences. Multi-class AdaBoost with dynamic time warping similarity distance between the feature vector of input facial expression and prototypical facial expression, is used as a weak classifier to select the subset of discriminative feature vectors. Finally, two methods for facial expression recognition are presented, either by using multi-class AdaBoost with dynamic time warping, or by using support vector machine on the boosted feature vectors. The results on the Cohn-Kanade (CK+) facial expression database show a recognition accuracy of 95.17% and 97.35% using multi-class AdaBoost and support vector machines, respectively.

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA