Results 1 - 20 of 48
1.
J Opt Soc Am A Opt Image Sci Vis ; 41(5): 863-873, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38856573

ABSTRACT

Advanced driver assistance systems (ADAS) rely on lane departure warning (LDW) technology to enhance safety while driving. However, current LDW methods are limited to cameras with standard angles of view, such as mono cameras and black boxes (dashcams). Cameras with ultra-wide-angle lenses are increasingly being used to reduce cost and improve accuracy, but they introduce challenges such as correcting optical distortion, processing images quickly enough, and guaranteeing performance. To implement LDW effectively, we developed three technologies: (i) distortion correction using error functions based on the projection characteristics of optical lenses, (ii) automatic vanishing point estimation using geometric characteristics, and (iii) lane tracking and lane departure detection using constraints. The proposed technology improves system stability and convenience by automatically calculating and updating the parameters required for LDW operation. By performing automatic distortion correction and vanishing point estimation, it has also been shown that fusion with other ADAS systems, including front cameras, is possible. Existing systems that use vanishing point information do not consider lens distortion, and their vanishing point estimation is slow and inaccurate, degrading system performance. The proposed method enables fast and accurate vanishing point estimation, allowing adaptive responses to changes in the road environment.
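The abstract does not disclose the lens error functions used for distortion correction; as a generic, hedged illustration of radial undistortion, the sketch below uses the one-parameter division model. The coefficient `k` and the centering at the distortion center are assumptions of this sketch, not details from the paper.

```python
import numpy as np

def undistort_division(points, k):
    """Undistort normalized image points with the one-parameter
    division model r_u = r_d / (1 + k * r_d**2).

    `points` is an (N, 2) array of distorted coordinates relative to
    the distortion center; `k` is the radial coefficient."""
    pts = np.asarray(points, dtype=float)
    r_d = np.linalg.norm(pts, axis=1, keepdims=True)
    # Scale each point by r_u / r_d = 1 / (1 + k * r_d**2).
    return pts / (1.0 + k * r_d**2)
```

With `k = 0` the mapping is the identity; positive `k` pulls points toward the center, which is the qualitative behavior needed to straighten lines bent by an ultra-wide-angle lens.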

2.
Sci Rep ; 13(1): 21577, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38062184

ABSTRACT

In recent years, significant progress has been made in visual-linguistic multi-modality research, leading to advancements in visual comprehension and its applications in computer vision tasks. One fundamental task in visual-linguistic understanding is image captioning, which involves generating human-understandable textual descriptions for an input image. This paper introduces a referring expression image captioning model that incorporates the supervision of objects of interest. Our model uses user-specified object keywords as a prefix to generate captions specific to the target object. The model consists of three modules: (i) visual grounding, (ii) referring object selection, and (iii) image captioning. To evaluate its performance, we conducted experiments on the RefCOCO and COCO captioning datasets. The experimental results demonstrate that our proposed method effectively generates meaningful captions aligned with users' specific interests.

3.
Sci Rep ; 13(1): 18264, 2023 10 25.
Article in English | MEDLINE | ID: mdl-37880264

ABSTRACT

This paper introduces a real-time Driver Monitoring System (DMS) designed to monitor driver behavior while driving, employing facial landmark estimation-based behavior recognition. The system utilizes an infrared (IR) camera to capture and analyze video data. Through facial landmark estimation, crucial information about the driver's head posture and eye area is extracted from the detected facial region, obtained via face detection. The proposed method consists of two distinct modules, each focused on recognizing specific behaviors. The first module employs head pose analysis to detect instances of inattention. By monitoring the driver's head movements along the horizontal and vertical axes, this module assesses the driver's attention level. The second module implements an eye-closure recognition filter to identify instances of drowsiness. Depending on the continuity of eye closures, the system categorizes them as either occasional drowsiness or sustained drowsiness. The advantages of the proposed method lie in its efficiency and real-time capabilities, as it solely relies on IR camera video for computation and analysis. To assess its performance, the system underwent evaluation using IR-Datasets, demonstrating its effectiveness in monitoring and recognizing driver behavior accurately. The presented real-time Driver Monitoring System with facial landmark-based behavior recognition offers a practical and robust approach to enhance driver safety and alertness during their journeys.
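The paper does not publish its eye-closure filter in detail; one common proxy for eye closure is the eye aspect ratio (EAR) over six eye landmarks, sketched below together with a simple run-length rule separating occasional from sustained closure. The EAR formula, threshold, and frame counts are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """Eye aspect ratio (EAR) from six 2-D eye landmarks, ordered
    corner, upper x2, corner, lower x2. Small EAR means a closed eye."""
    eye = np.asarray(eye, dtype=float)
    v1 = np.linalg.norm(eye[1] - eye[5])   # vertical distance 1
    v2 = np.linalg.norm(eye[2] - eye[4])   # vertical distance 2
    h = np.linalg.norm(eye[0] - eye[3])    # horizontal distance
    return (v1 + v2) / (2.0 * h)

def classify_drowsiness(ear_series, thresh=0.2, sustained=15):
    """Label each frame: 0 awake, 1 occasional closure, 2 sustained
    closure, based on how many consecutive frames EAR stays low."""
    run, labels = 0, []
    for ear in ear_series:
        run = run + 1 if ear < thresh else 0
        labels.append(2 if run >= sustained else (1 if run > 0 else 0))
    return labels
```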


Subjects
Automobile Driving , Wakefulness , Attention , Recognition (Psychology) , Computer Systems
4.
Sci Rep ; 13(1): 9062, 2023 Jun 04.
Article in English | MEDLINE | ID: mdl-37271757

ABSTRACT

Recently, many existing visual trackers have made significant progress by incorporating either spatial information from multi-level convolution layers or temporal information for tracking. However, the complementary advantages of both spatial and temporal information cannot be leveraged when these two types of information are used separately. In this paper, we present a new approach for robust visual tracking using a transformer-based model that incorporates both spatial and temporal context information at multiple levels. To integrate the refined similarity maps through multi-level spatial and temporal encoders, we propose an aggregation encoder. Consequently, the output of the proposed aggregation encoder contains useful features that integrate the global multi-level spatial contexts with the temporal contexts. The proposed feature offers a contrasting yet complementary representation of multi-level spatial and temporal contexts. This characteristic is particularly beneficial in complex aerial scenarios, where tracking failures can occur due to occlusion, motion blur, small objects, and scale variations. Our tracker also utilizes a lightweight network backbone, ensuring fast and effective object tracking in aerial datasets. Additionally, the proposed architecture can achieve more robust object tracking against significant variations by updating the features of the latest object while retaining the initial template information. Extensive experiments on seven challenging short-term and long-term aerial tracking benchmarks demonstrate that the proposed tracker outperforms state-of-the-art tracking methods in terms of both real-time processing speed and performance.
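The exact encoder design is not reproduced here; the core operation that any transformer-based encoder shares is scaled dot-product attention, shown below as a plain NumPy sketch. The single-head form and 2-D shapes are assumptions for illustration only.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Generic scaled dot-product attention: softmax(q k^T / sqrt(d)) v.
    q is (Nq, d), k is (Nk, d), v is (Nk, dv)."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

When all keys are identical the weights are uniform, so the output is simply the mean of the values.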

5.
Sci Rep ; 13(1): 8088, 2023 05 19.
Article in English | MEDLINE | ID: mdl-37208448

ABSTRACT

To increase the accuracy of medical image analysis using supervised learning-based AI technology, a large amount of accurately labeled training data is required. However, the supervised learning approach may not be applicable to real-world medical imaging due to the lack of labeled data, patient privacy, and the cost of specialized knowledge. To handle these issues, we utilized Kronecker-factored decomposition, which enhances both the computational efficiency and the stability of the learning process. We combined this approach with a model-agnostic meta-learning framework for parameter optimization. Based on this method, we present a bidirectional meta-Kronecker factored optimizer (BM-KFO) framework to quickly optimize semantic segmentation tasks using just a few magnetic resonance imaging (MRI) images as input. This model-agnostic approach can be implemented without altering network components and is capable of learning the learning process and meta-initial points while training on previously unseen data. We also incorporated a combination of average Hausdorff distance loss (AHD-loss) and cross-entropy loss into our objective function to specifically target the morphology of organs or lesions in medical images. Through evaluation of the proposed method on the abdominal MRI dataset, we obtained an average performance of 78.07% in setting 1 and 79.85% in setting 2. Our experiments demonstrate that BM-KFO with AHD-loss is suitable for general medical image segmentation applications and achieves superior performance compared to the baseline method in few-shot learning tasks. To allow the proposed method to be replicated, our code is available at https://github.com/YeongjoonKim/BMKFO.git.
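The AHD term of the objective can be sketched directly over point sets. Definitions of the average Hausdorff distance vary slightly in the literature; this sketch averages the two directed mean nearest-neighbor distances, which may not match the paper's exact variant.

```python
import numpy as np

def average_hausdorff(a, b):
    """Average Hausdorff distance between two 2-D point sets, shapes
    (N, 2) and (M, 2). For segmentation, masks would first be converted
    to foreground or boundary coordinates."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise distance matrix, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Mean of the two directed mean nearest-neighbor distances.
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())
```

In the paper's objective this term would be combined with cross-entropy, e.g. `loss = ce + lam * ahd` for some weight `lam`.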


Assuntos
Conhecimento , Privacidade , Humanos , Entropia , Registros , Semântica , Processamento de Imagem Assistida por Computador , Imageamento por Ressonância Magnética
6.
Neuropsychiatr Dis Treat ; 19: 851-863, 2023.
Article in English | MEDLINE | ID: mdl-37077704

ABSTRACT

Purpose: Electroencephalogram (EEG) signals give detailed information on the electrical brain activities occurring in the cerebral cortex. They are used to study brain-related disorders such as mild cognitive impairment (MCI) and Alzheimer's disease (AD). Brain signals obtained using an EEG machine can be a neurophysiological biomarker for early diagnosis of dementia through quantitative EEG (qEEG) analysis. This paper proposes a machine learning methodology to detect MCI and AD from qEEG time-frequency (TF) images of subjects in an eyes-closed resting state (ECR). Participants and Methods: The dataset consisted of 16,910 TF images from 890 subjects: 269 healthy controls (HC), 356 MCI, and 265 AD. First, EEG signals were transformed into TF images using a fast Fourier transform (FFT) containing event-related changes of frequency sub-bands, preprocessed with the EEGLAB toolbox in MATLAB R2021a. The preprocessed TF images were fed into a convolutional neural network (CNN) with adjusted parameters. For classification, the computed image features were concatenated with age data and passed through a feed-forward neural network (FNN). Results: The performance of the trained models, HC vs MCI, HC vs AD, and HC vs CASE (MCI + AD), was evaluated on the test dataset. The accuracy, sensitivity, and specificity were, respectively: 83%, 93%, and 73% for HC vs MCI; 81%, 80%, and 83% for HC vs AD; and 88%, 80%, and 90% for HC vs CASE (MCI + AD). Conclusion: The proposed models, trained with TF images and age, can be used to assist clinicians as a biomarker in detecting cognitively impaired subjects at an early stage in clinical settings.
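The paper's TF images came from EEGLAB preprocessing; the core FFT step can be sketched as a short-time Fourier magnitude image. The window and hop sizes below are illustrative assumptions, not the study's settings.

```python
import numpy as np

def tf_image(signal, win=64, hop=32):
    """Short-time FFT magnitude as a minimal time-frequency image of
    shape (freq_bins, time_frames). For a signal sampled at fs Hz,
    frequency resolution is fs / win."""
    window = np.hanning(win)
    frames = [np.abs(np.fft.rfft(signal[s:s + win] * window))
              for s in range(0, len(signal) - win + 1, hop)]
    return np.array(frames).T
```

For a pure 16 Hz tone sampled at 128 Hz with a 64-sample window, the energy lands in bin 16 * 64 / 128 = 8.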

7.
Neuroimage ; 272: 120054, 2023 05 15.
Article in English | MEDLINE | ID: mdl-36997138

ABSTRACT

For automatic EEG diagnosis, this paper presents a new EEG data set with well-organized clinical annotations called Chung-Ang University Hospital EEG (CAUEEG), which includes event history, patient age, and corresponding diagnosis labels. We also designed two reliable evaluation tasks for low-cost, non-invasive diagnosis of brain disorders: i) CAUEEG-Dementia, with normal, MCI, and dementia diagnostic labels, and ii) CAUEEG-Abnormal, with normal and abnormal labels. Based on the CAUEEG dataset, this paper proposes a new fully end-to-end deep learning model, called the CAUEEG End-to-end Deep neural Network (CEEDNet). CEEDNet aims to bring all the functional elements of EEG analysis together in a seamless, learnable fashion while minimizing non-essential human intervention. Extensive experiments showed that CEEDNet significantly improves accuracy compared with existing methods, such as machine learning methods and Ieracitano-CNN (Ieracitano et al., 2019), by taking full advantage of end-to-end learning. The high ROC-AUC scores of 0.9 on CAUEEG-Dementia and 0.86 on CAUEEG-Abnormal recorded by our CEEDNet models demonstrate that our method can lead potential patients to early diagnosis through automatic screening.
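The reported ROC-AUC scores (0.9 and 0.86) can be reproduced from raw model scores with the rank formulation of AUC; here is a dependency-free sketch of that computation.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney rank statistic: the probability
    that a randomly chosen positive is scored above a randomly chosen
    negative (ties count half). O(P*N) pairwise form, fine for small
    test sets."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```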


Subjects
Cognitive Dysfunction , Deep Learning , Dementia , Humans , Electroencephalography/methods , Algorithms , Cognitive Dysfunction/diagnosis , Dementia/diagnosis
8.
Sci Rep ; 13(1): 244, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36604562

ABSTRACT

Recent advances in deep learning have realized accurate, robust detection of various types of objects, including pedestrians on the road, defect regions in manufacturing processes, human organs in medical images, and dangerous materials passing through airport checkpoints. In particular, small object detection implemented as an embedded system is gaining increasing attention for autonomous vehicles, drone reconnaissance, and microscopic imagery. In this paper, we present a lightweight small object detection model using two plug-in modules: (1) a high-resolution processing module (HRPM) and (2) a sigmoid fusion module (SFM). The HRPM efficiently learns multi-scale features of small objects at significantly reduced computational cost, and the SFM alleviates misclassification errors due to spatial noise by adjusting weights on lost small object information. The combination of HRPM and SFM significantly improves detection accuracy with a low amount of computation. Compared with the original YOLOX-s model, the proposed model takes a two-times higher-resolution input image and achieves higher mean average precision (mAP) using 57% of the model parameters and 71% of the computation in GFLOPs. The proposed model was tested on real drone reconnaissance images and provided significant improvement in detecting small vehicles.
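The SFM internals are not given in the abstract; the sketch below only shows the generic idea of a sigmoid-gated fusion of two feature maps. In practice the gate logits would come from a learned layer; here they are simply an input, which is an assumption of this sketch.

```python
import numpy as np

def sigmoid_fusion(feat_a, feat_b, logits):
    """Blend two same-shaped feature maps with a per-element sigmoid
    gate in (0, 1): gate * feat_a + (1 - gate) * feat_b."""
    gate = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return gate * np.asarray(feat_a, dtype=float) \
        + (1.0 - gate) * np.asarray(feat_b, dtype=float)
```

Zero logits give a gate of 0.5, i.e. a plain average of the two maps; large positive or negative logits select one map almost exclusively.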

9.
Opt Express ; 30(13): 23608-23621, 2022 Jun 20.
Article in English | MEDLINE | ID: mdl-36225037

ABSTRACT

In this paper, we present a novel low-light image enhancement method that combines optimization-based decomposition with an enhancement network to simultaneously enhance brightness and contrast. The proposed method works in two steps, Retinex decomposition and illumination enhancement, and can be trained in an end-to-end manner. The first step separates the low-light image into illumination and reflectance components based on the Retinex model. Specifically, it performs model-based optimization followed by learning for edge-preserved illumination smoothing and detail-preserved reflectance denoising. In the second step, the illumination output from the first step, together with its gamma-corrected and histogram-equalized versions, serves as input to the illumination enhancement network (IEN), which includes residual squeeze and excitation blocks (RSEBs). Extensive experiments show that our method outperforms state-of-the-art low-light enhancement methods in terms of both objective and subjective measures.
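The Retinex relation underlying the first step, and the auxiliary gamma-corrected input to the IEN, can be sketched in a few element-wise NumPy lines. The `eps` guard and the default `gamma` value are assumptions of this sketch, not the paper's settings.

```python
import numpy as np

def retinex_split(image, illumination, eps=1e-6):
    """Retinex model I = R * L: recover reflectance R from an image
    and an estimated illumination map (element-wise division)."""
    return image / (illumination + eps)

def gamma_correct(illum, gamma=2.2):
    """Gamma-corrected copy of an illumination map in [0, 1], one of
    the auxiliary inputs fed to the enhancement network."""
    return np.clip(illum, 0.0, 1.0) ** (1.0 / gamma)
```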

10.
Sensors (Basel) ; 21(18)2021 Sep 15.
Article in English | MEDLINE | ID: mdl-34577388

ABSTRACT

Physical model-based dehazing methods cannot, in general, avoid environmental variables and undesired artifacts such as uncollected illuminance, halo, and saturation, since it is difficult to accurately estimate the illuminance, light transmission, and airlight. Furthermore, the haze model estimation process requires very high computational complexity. To solve this problem by directly estimating the radiance of hazy images, we present a novel dehazing and verifying network (DVNet). In the dehazing procedure, we enhance clean images using a correction network (CNet), which uses the ground truth to guide the haze network. Hazy images are then restored through a haze network (HNet). Furthermore, a verifying method checks the errors of both CNet and HNet using self-supervised learning. Finally, the proposed complementary adversarial learning method can produce more natural results. Note that the proposed discriminator and generators (HNet and CNet) can be learned from an unpaired dataset. Overall, the proposed DVNet can generate better dehazed results than state-of-the-art approaches under various hazy conditions. Experimental results show that the DVNet outperforms state-of-the-art dehazing methods in most cases.
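For reference, the physical model that DVNet avoids estimating is the atmospheric scattering model I = J·t + A·(1 − t); inverting it for the scene radiance J looks like this. The `t_min` clamp is a common stabilization assumed here, not a detail from the paper.

```python
import numpy as np

def dehaze_scattering(hazy, transmission, airlight, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t)
    for the scene radiance J. Clamping the transmission avoids
    amplifying noise where t is near zero."""
    t = np.maximum(transmission, t_min)
    return (hazy - airlight) / t + airlight
```

The fragility the abstract points at is visible here: any error in the estimated `transmission` or `airlight` propagates directly into the recovered radiance.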

11.
Sensors (Basel) ; 20(21)2020 Nov 05.
Article in English | MEDLINE | ID: mdl-33167486

ABSTRACT

In a hazy environment, visibility is reduced and objects are difficult to identify. For this reason, many dehazing techniques have been proposed to remove haze. In particular, methods based on estimating the atmospheric scattering model suffer from distortion when the model is estimated inaccurately. We present a novel residual-based dehazing network model to overcome this performance limitation of atmospheric scattering model-based methods. More specifically, the proposed model adopts a gate fusion network that generates the dehazed results using a residual operator. To further reduce the divergence between clean and dehazed images, the proposed discriminator distinguishes dehazed results from clean images and then reduces the statistical difference via adversarial learning. To verify each element of the proposed model, we hierarchically performed the haze removal process in an ablation study. Experimental results show that the proposed method outperforms state-of-the-art approaches in terms of peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), International Commission on Illumination delta E 2000 (CIEDE2000), and mean squared error (MSE). It also gives subjectively high-quality images without color distortion or undesired artifacts for both synthetic and real-world hazy images.
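Of the four reported metrics, PSNR is the simplest to state exactly; a minimal NumPy version, assuming images scaled to a known peak value:

```python
import numpy as np

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped
    images: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((np.asarray(ref, dtype=float)
                   - np.asarray(test, dtype=float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```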

12.
Sensors (Basel) ; 20(17)2020 Aug 28.
Article in English | MEDLINE | ID: mdl-32872299

ABSTRACT

Recent advances in object tracking based on deep Siamese networks have shifted attention away from correlation filters. However, the Siamese network alone does not achieve the accuracy of state-of-the-art correlation filter-based trackers, whereas correlation filter-based trackers alone have a frame update problem. In this paper, we present a Siamese network with spatially semantic correlation features (SNS-CF) for accurate, robust object tracking. To deal with various types of features spread across many regions of the input frame, the proposed SNS-CF consists of (1) a Siamese feature extractor, (2) a spatially semantic feature extractor, and (3) an adaptive correlation filter. To the best of the authors' knowledge, the proposed SNS-CF is the first attempt to fuse the Siamese network and the correlation filter to provide high-frame-rate, real-time visual tracking with performance favorable to state-of-the-art methods on multiple benchmarks.
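The basic operation behind any correlation-filter tracker is a Fourier-domain cross-correlation between a template and a search region; SNS-CF builds learned Siamese features on top of this, which the generic sketch below omits.

```python
import numpy as np

def correlation_response(template, search):
    """Circular cross-correlation response map via the FFT. The peak
    location gives the estimated translation of the target; the
    circulant structure of this computation is also the source of the
    boundary effects correlation-filter trackers must fight."""
    f_t = np.fft.fft2(template)
    f_s = np.fft.fft2(search)
    return np.real(np.fft.ifft2(np.conj(f_t) * f_s))
```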

13.
Sensors (Basel) ; 20(14)2020 Jul 13.
Article in English | MEDLINE | ID: mdl-32668715

ABSTRACT

Various action recognition approaches have recently been proposed with the aid of three-dimensional (3D) convolution and multiple-stream structures. However, existing methods are sensitive to background and optical flow noise, which prevents the network from learning the main object in a video frame. Furthermore, they cannot reflect the accuracy of each stream when combining multiple streams. In this paper, we present a novel action recognition method that improves on existing methods using optical flow and a multi-stream structure. The proposed method consists of two parts: (i) an optical flow enhancement process using image segmentation and (ii) a score fusion process that applies an accuracy-weighted sum. The enhancement process helps the network efficiently analyze the flow information of the main object in the optical flow frame, thereby improving accuracy. The proposed score fusion method reflects the different accuracy of each stream in the fused score. We achieved an accuracy of 98.2% on UCF-101 and 82.4% on HMDB-51. The proposed method outperformed many state-of-the-art methods without changing the network structure, and it is expected to be easily applied to other networks.
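The accuracy-weighted score fusion of step (ii) can be sketched directly; the per-stream accuracies are assumed given (e.g. measured on a validation split), and the normalization choice is an assumption of this sketch.

```python
import numpy as np

def fuse_scores(stream_scores, stream_accuracies):
    """Weighted sum of per-stream class-score vectors, with weights
    proportional to each stream's accuracy, so better-performing
    streams contribute more to the fused prediction."""
    w = np.asarray(stream_accuracies, dtype=float)
    w = w / w.sum()
    return sum(wi * np.asarray(s, dtype=float)
               for wi, s in zip(w, stream_scores))
```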

14.
Sensors (Basel) ; 20(12)2020 Jun 26.
Article in English | MEDLINE | ID: mdl-32604850

ABSTRACT

Person re-identification (Re-ID) faces problems that make learning difficult, such as misalignment and occlusion. To solve these problems, it is important to focus on robust features under intra-class variation. Existing attention-based Re-ID methods focus only on common features without considering distinctive features. In this paper, we present a novel attentive learning-based Siamese network for person Re-ID. Unlike existing methods, we designed an attention module and an attention loss using the properties of the Siamese network to concentrate attention on both common and distinctive features. The attention module consists of channel attention to select important channels and encoder-decoder attention to observe the whole body shape. We modified the triplet loss into an attention loss, called the uniformity loss. The uniformity loss generates a unique attention map, which focuses on both common and discriminative features. Extensive experiments show that the proposed network compares favorably to state-of-the-art methods on three large-scale benchmarks: the Market-1501, CUHK03, and DukeMTMC-ReID datasets.


Subjects
Biometry/instrumentation , Deep Learning , Humans
15.
Sensors (Basel) ; 20(9)2020 May 09.
Article in English | MEDLINE | ID: mdl-32397536

ABSTRACT

To encourage people to save energy, subcompact cars receive several benefits, such as discounts on parking or toll road charges. However, manual classification of subcompact cars is highly labor intensive. To solve this problem, automatic vehicle classification systems are good candidates. Since a general pattern-based classification technique cannot successfully recognize the ambiguous features of a vehicle, we present a new multi-resolution convolutional neural network (CNN) and a stochastic orthogonal learning method to train the network. We first extract the region of the bonnet in the vehicle image. Next, both the extracted and input images are fed to the low- and high-resolution layers of the CNN model. The proposed network is then optimized based on stochastic orthogonality. We also built a novel subcompact vehicle dataset that will be open for public use. Experimental results show that the proposed model outperforms state-of-the-art approaches in terms of accuracy, which means that the proposed method can efficiently classify the ambiguous features between subcompact and non-subcompact vehicles.

16.
Sensors (Basel) ; 19(22)2019 Nov 09.
Article in English | MEDLINE | ID: mdl-31717609

ABSTRACT

Online training frameworks based on discriminative correlation filters for visual tracking have recently shown significant improvement in both accuracy and speed. However, correlation filter-based discriminative approaches have a common problem of tracking performance degradation when the local structure of a target is distorted by the boundary effect problem. The shape distortion of the target is mainly caused by the circulant structure in the Fourier domain processing, and it makes the correlation filter learn from distorted training samples. In this paper, we present a structure-attention network to preserve the target structure from the distortion caused by the boundary effect. More specifically, we adopt a variational auto-encoder as the structure-attention network to generate diverse and representative target structures. We also propose two denoising criteria using a novel reconstruction loss for the variational auto-encoding framework to capture more robust structures even under the boundary condition. Through the proposed structure-attention framework, discriminative correlation filters can learn robust structure information of targets during online training with enhanced discriminating performance and adaptability. Experimental results on major visual tracking benchmark datasets show that the proposed method produces better or comparable performance compared with state-of-the-art tracking methods, with a real-time processing speed of more than 80 frames per second.


Assuntos
Processamento de Imagem Assistida por Computador/métodos , Redes Neurais de Computação , Algoritmos
17.
J Opt Soc Am A Opt Image Sci Vis ; 36(10): 1768-1776, 2019 Oct 01.
Article in English | MEDLINE | ID: mdl-31674442

ABSTRACT

In stereo-matching techniques for three-dimensional (3D) vision, illumination change is a major problem that degrades matching accuracy. When large intensity differences are observed between a pair of stereo images, it is difficult to find the similarity in the matching process. In addition, inaccurately estimated disparities are obtained in textureless regions, since there are no distinguishable features in such regions. To solve these problems, this paper presents a robust stereo-matching method using an illuminant-invariant cost volume and confidence-based disparity refinement. In the step of matching a stereo pair, the proposed method combines two cost volumes using an invariant image and the Weber local descriptor (WLD), which was originally motivated by human visual characteristics. The invariant image used in the matching step is insensitive to sudden brightness changes caused by shadows or light sources, and the WLD reflects structural features of the invariant image with consideration of gradual illumination change. After aggregating the cost using a guided filter, we refine the initially estimated disparity map based on the confidence map computed from the combined cost volume. Experimental results verify that the matching computation of the proposed method improves the accuracy of the disparity map under a radiometrically dynamic environment. Since the proposed disparity refinement method can also reduce the error of the initial disparity map in textureless areas, it can be applied to various 3D vision systems such as industrial robots and autonomous vehicles.

18.
Opt Express ; 27(19): 26600-26614, 2019 Sep 16.
Article in English | MEDLINE | ID: mdl-31674538

ABSTRACT

Calibration of a vehicle camera is a key technology for advanced driver assistance systems (ADAS). This paper presents a novel estimation method to measure the orientation of a camera mounted on a driving vehicle. By considering the characteristics of vehicle cameras and the driving environment, we detect three orthogonal vanishing points as a basis of the imaging geometry. The proposed method consists of three steps: i) detection of lines projected onto the Gaussian sphere and extraction of the plane normal, ii) estimation of the vanishing point about the optical axis using a linear Hough transform, and iii) voting for the remaining two vanishing points using a circular histogram. The proposed method increases both accuracy and stability by considering the practical driving situation using the three sequentially estimated vanishing points. In addition, we can rapidly estimate the orientation by converting the voting space into a 2D plane at each stage. As a result, the proposed method can quickly and accurately estimate the orientation of the vehicle camera in a normal driving situation.

19.
Sensors (Basel) ; 19(21)2019 Oct 31.
Article in English | MEDLINE | ID: mdl-31683664

ABSTRACT

For sustainable operation and maintenance of urban railway infrastructure, intelligent visual inspection of the railway infrastructure is attracting increasing attention as an alternative to unreliable, manual observation by humans at night while trains are not operating. Although various automatic approaches have been proposed using image processing and computer vision techniques, most of them focus only on railway tracks. In this paper, we present a novel railway inspection system using facility detection based on a deep convolutional neural network and a computer vision-based image comparison approach. The proposed system aims to automatically detect wear and cracks by comparing pairs of corresponding image sets acquired at different times. We installed a line scan camera on the roof of the train. Unlike an area-based camera, the line scan camera quickly acquires images with a wide field of view. The proposed system consists of three main modules: (i) image reconstruction for registration of facility positions, (ii) facility detection using an improved single shot detector, and (iii) deformed region detection using image processing and computer vision techniques. In experiments, we demonstrate that the proposed system accurately finds facilities and detects their potential defects. For this reason, the proposed system can provide various advantages such as cost reduction for maintenance and accident prevention.

20.
Sensors (Basel) ; 18(10)2018 Sep 20.
Article in English | MEDLINE | ID: mdl-30241286

ABSTRACT

Single-lens-based optical range finding systems were developed as an efficient, compact alternative to conventional stereo camera systems. Among various single-lens-based approaches, a multiple color-filtered aperture (MCA) system can generate disparity information among color channels, as well as normal color information. In this paper, we consider a dual color-filtered aperture (DCA) system as the most minimal version of the MCA system and present a novel inter-color image registration algorithm for disparity estimation. The proposed registration algorithm consists of three steps: (i) color channel-independent feature extraction; (ii) feature-based adaptive weight disparity estimation; and (iii) color mapping matrix (CMM)-based cross-channel image registration. Experimental results show that the proposed method can not only generate an accurate disparity map, but also realize high-quality cross-channel registration with a disparity prior for DCA-based range finding and color image enhancement.
