1.
IEEE Trans Image Process ; 31: 6412-6423, 2022.
Article in English | MEDLINE | ID: mdl-36256692

ABSTRACT

Person re-identification is the problem of identifying individuals across non-overlapping cameras. Although remarkable progress has been made on the re-identification problem, it remains challenging due to appearance variations of the same person as well as other people of similar appearance. Some prior works addressed these issues by separating features of positive samples from features of negative ones. However, the performance of existing models depends considerably on the characteristics and statistics of the samples used for training. Thus, we propose a novel framework named the sampling independent robust feature representation network (SirNet) that learns disentangled feature embeddings from randomly chosen samples. A carefully designed sampling independent maximum discrepancy loss is introduced to model samples of the same person as a cluster. As a result, the proposed framework can generate additional hard negatives/positives using the learned features, which results in better discriminability from other identities. Extensive experimental results on large-scale benchmark datasets verify that the proposed model is more effective than prior state-of-the-art models.


Subject(s)
Biometric Identification , Humans , Biometric Identification/methods
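The "hardest pair" intuition behind a discrepancy-style loss can be sketched as follows. This is an illustrative hinge loss over the farthest same-identity pair and the nearest different-identity pair in a batch, not the paper's actual SirNet objective; the function name, margin, and Euclidean metric are assumptions for illustration.

```python
import numpy as np

def max_discrepancy_loss(features, labels, margin=1.0):
    """Hinge-style loss: the farthest same-identity pair (hardest positive)
    should be closer than the nearest different-identity pair (hardest
    negative) by at least `margin`. Illustrative only."""
    # Pairwise Euclidean distances between all feature vectors.
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    pos = labels[:, None] == labels[None, :]
    np.fill_diagonal(pos, False)            # a sample is not its own pair
    neg = labels[:, None] != labels[None, :]
    hardest_pos = d[pos].max()              # farthest same-identity pair
    hardest_neg = d[neg].min()              # nearest cross-identity pair
    return max(0.0, hardest_pos - hardest_neg + margin)
```

When identity clusters are well separated the loss is zero; when a cross-identity pair is closer than a same-identity pair, the loss grows with the violation.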
2.
IEEE Trans Image Process ; 31: 3309-3321, 2022.
Article in English | MEDLINE | ID: mdl-35482698

ABSTRACT

A color filter array (CFA) is a spatial multiplexing of pixel-sized color filters fabricated over the pixel sensors of most color image sensors. The state-of-the-art lossless coding techniques for raw sensor data captured by such sensors leverage spatial or cross-color correlation using lifting schemes. In this paper, we propose a lifting-based lossless white balance algorithm. When applied to the raw sensor data, the spatial bandwidth of the implied chrominance signals decreases. We propose to use this white balance as a pre-processing step for lossless compression of CFA-subsampled images and video, improving the overall coding efficiency of the raw sensor data.
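The lifting schemes mentioned above are invertible integer transforms built from predict/update steps. A minimal cross-color predict/update pair, shown below, is a generic sketch of the lifting idea (similar to the reversible color transform used in lossless codecs), not the paper's white-balance algorithm; the function names are assumptions.

```python
def lift_forward(g, r):
    """Predict/update lifting pair on an integer (green, red) sample pair.
    Produces a luma-like sum band and a chroma-like difference band,
    exactly invertible in integer arithmetic."""
    d = r - g            # predict: residual of red against green (chroma)
    s = g + (d >> 1)     # update: smooth band, approximately (g + r) / 2
    return s, d

def lift_inverse(s, d):
    """Exact integer inverse of lift_forward."""
    g = s - (d >> 1)
    r = g + d
    return g, r
```

Because each step only adds a value computed from the other channel, every step can be undone exactly, which is what makes the overall coding pipeline lossless.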

3.
Sensors (Basel) ; 21(11)2021 May 27.
Article in English | MEDLINE | ID: mdl-34071901

ABSTRACT

Motion in videos refers to the pattern of apparent movement of objects, surfaces, and edges over image sequences caused by the relative movement between a camera and a scene. Motion, like scene appearance, is an essential feature for estimating a driver's visual attention allocation in computer vision. However, the fact that motion can be a crucial factor in driver attention estimation has not been thoroughly studied in the literature, although driver attention prediction models focusing on scene appearance have been well studied. Therefore, in this work, we investigate the usefulness of motion information in estimating a driver's visual attention. To analyze the effectiveness of motion information, we develop a deep neural network framework that provides attention locations and attention levels using optical flow maps, which represent the movement of content in videos. We validate the performance of the proposed motion-based prediction model by comparing it to the performance of current state-of-the-art prediction models that use RGB frames. The experimental results on a real-world dataset confirm our hypothesis that motion contributes to prediction accuracy, and that motion features leave a margin for further accuracy improvement.


Subject(s)
Automobile Driving , Optic Flow , Motion , Neural Networks, Computer
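A crude stand-in for the optical-flow magnitude maps used above is a per-pixel temporal difference between consecutive frames, normalized into a saliency-like map. This is only a sketch of the "motion as an attention cue" idea; the function name and normalization are assumptions, and real optical flow (e.g., from a dense flow estimator) would replace the frame difference.

```python
import numpy as np

def motion_saliency(prev_frame, next_frame, eps=1e-8):
    """Per-pixel absolute temporal difference, normalized to [0, 1].
    A crude stand-in for an optical-flow magnitude map."""
    diff = np.abs(next_frame.astype(np.float64) - prev_frame.astype(np.float64))
    if diff.ndim == 3:                 # average over color channels
        diff = diff.mean(axis=-1)
    return diff / max(diff.max(), eps)  # all-zero diff maps to all zeros
```

Regions where content moved get values near 1, static regions stay near 0, so the map can be fed to an attention model alongside appearance features.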
4.
Sensors (Basel) ; 20(7)2020 Apr 04.
Article in English | MEDLINE | ID: mdl-32260397

ABSTRACT

Driving is a task that puts heavy demands on visual information, so the human visual system plays a critical role in making proper decisions for safe driving. Understanding a driver's visual attention and relevant behavior information is a challenging but essential task in advanced driver-assistance systems (ADAS) and efficient autonomous vehicles (AV). Specifically, robust prediction of a driver's attention from images is crucial for intelligent vehicle systems, where a self-driving car must move safely while interacting with its surrounding environment. Thus, in this paper, we investigate a human driver's visual behavior from a computer-vision perspective to estimate the driver's attention locations in images. First, we show that feature representations at high resolution improve visual attention prediction accuracy and localization performance when fused with features at low resolution. To demonstrate this, we employ a deep convolutional neural network framework that learns and extracts feature representations at multiple resolutions. In particular, the network maintains the feature representation with the highest resolution at the original image resolution. Second, attention prediction tends to be biased toward the centers of images when neural networks are trained using typical visual attention datasets. To avoid overfitting to this center-biased solution, the network is trained using diverse regions of images. Finally, the experimental results verify that our proposed framework improves the prediction accuracy of a driver's attention locations.


Subject(s)
Attention , Automobile Driving , Visual Perception , Decision Making , Humans , Neural Networks, Computer
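The multi-resolution fusion described above — combining a coarse feature map with one kept at the original resolution — can be sketched minimally as upsampling the coarse map to the fine grid and averaging. This is an illustrative stand-in, not the paper's network; the function name, nearest-neighbour upsampling, integer scale ratio, and equal-weight average are all assumptions.

```python
import numpy as np

def fuse_multires(high, low):
    """Nearest-neighbour upsample the coarse map `low` to the grid of the
    fine map `high` (assumes an integer scale ratio) and average the two."""
    fh, fw = high.shape
    lh, lw = low.shape
    # np.kron replicates each coarse cell into an (fh//lh, fw//lw) block.
    up = np.kron(low, np.ones((fh // lh, fw // lw)))
    return 0.5 * (high + up)
```

In a real network the fusion would be learned (e.g., concatenation followed by convolution), but the structural point is the same: the high-resolution branch preserves localization detail that the upsampled coarse branch alone would blur.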
5.
Abdom Radiol (NY) ; 44(6): 2030-2039, 2019 06.
Article in English | MEDLINE | ID: mdl-30460529

ABSTRACT

PURPOSE: The purpose of the study was to propose a deep transfer learning (DTL)-based model to distinguish indolent from clinically significant prostate cancer (PCa) lesions and to compare the DTL-based model with a deep learning (DL) model without transfer learning and with the PIRADS v2 score on 3 Tesla multi-parametric MRI (3T mp-MRI) with whole-mount histopathology (WMHP) validation. METHODS: With IRB approval, 140 patients with 3T mp-MRI and WMHP comprised the study cohort. The DTL-based model was trained on 169 lesions in 110 arbitrarily selected patients and tested on the remaining 47 lesions in 30 patients. We compared the DTL-based model with the same DL model architecture trained from scratch and with classification based on the PIRADS v2 score at a threshold of 4, using accuracy, sensitivity, specificity, and area under the curve (AUC). Bootstrapping with 2000 resamples was performed to estimate the 95% confidence interval (CI) for the AUC. RESULTS: After training on 169 lesions in 110 patients, the AUCs for discriminating indolent from clinically significant PCa lesions in the testing set were 0.726 (CI [0.575, 0.876]) for the DTL-based model, 0.687 (CI [0.532, 0.843]) for the DL model without transfer learning, and 0.711 (CI [0.575, 0.847]) for PIRADS v2 score ≥ 4. The DTL-based model achieved a higher AUC than the DL model without transfer learning and PIRADS v2 score ≥ 4 in discriminating clinically significant lesions in the testing set. CONCLUSION: The DeLong test indicated that the DTL-based model achieved a comparable AUC to classification based on the PIRADS v2 score (p = 0.89).


Subject(s)
Deep Learning , Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Prostatic Neoplasms/pathology , Adult , Aged , Aged, 80 and over , Biopsy , Diagnosis, Differential , Humans , Male , Middle Aged , Retrospective Studies , Sensitivity and Specificity , Software
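The bootstrap procedure described above (2000 resamples, 95% CI on the AUC) can be sketched generically. The Mann-Whitney formulation of the AUC and the resampling loop below are standard techniques, not the authors' code; function names and the random seed are assumptions.

```python
import numpy as np

def auc(labels, scores):
    """AUC via the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs ranked correctly, counting ties as half."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC: resample cases with
    replacement, recompute the AUC, and take the alpha/2 quantiles."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if labels[idx].min() == labels[idx].max():
            continue  # resample contains only one class; AUC undefined
        stats.append(auc(labels[idx], scores[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```

This is the same resampling scheme used to produce intervals like CI [0.575, 0.876] around a point AUC estimate.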
6.
IEEE Trans Image Process ; 27(6): 2806-2817, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29570083

ABSTRACT

We propose novel lossless and lossy compression schemes for color filter array (CFA) sampled images based on Camera-Aware Multi-Resolution Analysis, or CAMRA. Specifically, by CAMRA we refer to modifications that we make to the wavelet transform of CFA sampled images in order to achieve a very high degree of decorrelation in the finest-scale wavelet coefficients, and a series of color processing steps applied to the coarse-scale wavelet coefficients, aimed at limiting the propagation of lossy compression errors through the subsequent camera processing pipeline. We validated our theoretical analysis and the performance of the proposed compression schemes using images of natural scenes captured in raw format. The experimental results verify that our proposed methods improve coding efficiency relative to standard and state-of-the-art compression schemes for CFA sampled images.
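The reversible wavelet building block that schemes like this stack into a full multi-resolution analysis can be sketched with one level of the integer Haar (S-) transform. This is the generic building block only, not CAMRA's camera-aware modifications; function names and the single 1-D level are assumptions for illustration.

```python
import numpy as np

def haar1d_forward(x):
    """One level of the integer Haar (S-) transform on a 1-D signal of even
    length: a smooth (approximation) band and a detail (difference) band,
    exactly invertible in integer arithmetic."""
    a, b = x[0::2].astype(np.int64), x[1::2].astype(np.int64)
    d = b - a                  # detail band
    s = a + (d >> 1)           # smooth band, approximately (a + b) / 2
    return s, d

def haar1d_inverse(s, d):
    """Exact integer inverse: reconstructs the original samples bit-for-bit."""
    a = s - (d >> 1)
    b = a + d
    x = np.empty(2 * len(s), dtype=np.int64)
    x[0::2], x[1::2] = a, b
    return x
```

Repeating the forward step on the smooth band yields the multi-resolution pyramid; because every level is exactly invertible, the overall coding scheme can be lossless.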
