1.
Opt Express ; 31(26): 44113-44126, 2023 Dec 18.
Article in English | MEDLINE | ID: mdl-38178490

ABSTRACT

Passive non-line-of-sight (NLOS) imaging is a promising technique for enhancing visual perception of occluded objects hidden behind a wall. Here we present a data-driven NLOS imaging framework that uses polarization cues and long-wavelength infrared (LWIR) images. We design a dual-channel-input deep neural network that fuses intensity features from polarized LWIR images with contour features from polarization degree images for NLOS scene reconstruction. To train the model, we create a polarized LWIR NLOS dataset containing over ten thousand images. The paper demonstrates a passive NLOS imaging experiment in which the hidden people are approximately 6 meters away from the relay wall, a range greater than that reported in prior work. The quantitative evaluation metrics PSNR and SSIM show that our method is an advance over the state of the art in passive NLOS imaging.
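The abstract evaluates reconstructions with PSNR and SSIM. As a rough illustration of the first metric only, the sketch below computes PSNR from its standard definition, 10·log10(MAX² / MSE). The function name, the nested-list image format, and the toy pixel values are assumptions made for this example and are not taken from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-size grayscale images.

    Images are nested lists of pixel values (an assumption for this sketch).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 example: every pixel is off by 5 gray levels, so MSE = 25.
ground_truth = [[0, 255], [255, 0]]
reconstruction = [[5, 250], [250, 5]]
print(f"PSNR: {psnr(ground_truth, reconstruction):.2f} dB")
```

A higher PSNR means the reconstruction is closer to the ground truth; SSIM, which compares local structure, is not sketched here.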

2.
IEEE Trans Cybern ; 52(4): 2300-2313, 2022 Apr.
Article in English | MEDLINE | ID: mdl-32721905

ABSTRACT

State-of-the-art object detectors usually downsample the input image progressively until it is represented by small feature maps, which loses spatial information and compromises the representation of small objects. In this article, we propose a context-aware block net (CAB Net) to improve small object detection by building high-resolution feature maps with strong semantics. To internally enhance the representation capacity of feature maps with high spatial resolution, we carefully design the context-aware block (CAB). CAB exploits pyramidal dilated convolutions to incorporate multilevel contextual information without losing the original resolution of the feature maps. We then attach CAB to the end of a truncated backbone network (e.g., VGG16) with a relatively small downsampling factor (e.g., 8) and discard all following layers. CAB Net can capture both basic visual patterns and semantic information of small objects, thus improving the performance of small object detection. Experiments conducted on the benchmark Tsinghua-Tencent 100K and the Airport dataset show that CAB Net outperforms other top-performing detectors by a large margin while keeping real-time speed, which demonstrates the effectiveness of CAB Net for small object detection.


Subject(s)
Semantics
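The CAB abstract above relies on dilated convolutions that add context without reducing feature-map resolution. The sketch below checks that property with the standard convolution output-size formula. The dilation rates (1, 2, 4), the 3×3 kernel, and the 512-pixel input are assumptions chosen for illustration; the abstract does not state them.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length of a convolution along one spatial axis."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def pyramid_output_sizes(size, kernel=3, dilations=(1, 2, 4)):
    """Output size of each dilated branch when padding grows with dilation."""
    sizes = []
    for d in dilations:
        padding = d * (kernel - 1) // 2  # "same" padding for an odd kernel
        sizes.append(conv_output_size(size, kernel, padding=padding, dilation=d))
    return sizes


# A 512-pixel input side with a downsampling factor of 8 gives a 64-pixel map.
feature_size = 512 // 8
print(pyramid_output_sizes(feature_size))
# Each branch sees a wider context: effective kernel sizes 3, 5 and 9.
print([d * (3 - 1) + 1 for d in (1, 2, 4)])
```

Because every branch keeps the same spatial size, their outputs can be combined position by position, which is what allows context to be added at full feature-map resolution.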
3.
IEEE Trans Neural Netw Learn Syst ; 31(8): 2705-2715, 2020 08.
Article in English | MEDLINE | ID: mdl-31562106

ABSTRACT

People in crowd scenes are often distributed in an imbalanced way. On the one hand, the apparent size of people varies widely with camera perspective. People far from the camera look smaller and are likely to occlude each other, whereas people near the camera look larger and are relatively sparse. On the other hand, the number of people also varies greatly within the same scene or across different scenes. This article aims to develop a novel model that can accurately estimate the crowd count of a given scene with an imbalanced distribution of people. To this end, we propose an effective multi-level convolutional neural network (MLCNN) architecture that first adaptively learns multi-level density maps and then fuses them to predict the final output. The density map of each level focuses on people of certain sizes. As a result, the fusion of multi-level density maps is able to handle the large variation in people size. In addition, we introduce a new loss function named balanced loss (BL) to impose relatively balanced feedback during training, which helps further improve the performance of the proposed network. Furthermore, we introduce a new data set of 1111 images with a total of 49 061 head annotations. MLCNN is easy to train, requiring only one end-to-end training stage. Experimental results demonstrate that MLCNN achieves state-of-the-art performance. In particular, MLCNN reaches a mean absolute error (MAE) of 242.4 on the UCF_CC_50 data set, which is 37.2 lower than the second-best result.
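To make the density-map idea above concrete, the following sketch fuses per-level density maps, reads off a crowd count as the sum of the fused map, and scores predictions with MAE. The fixed weighted sum is an assumption standing in for the fusion that the network learns; all names and toy values are hypothetical.

```python
def fuse_density_maps(level_maps, weights):
    """Pixel-wise weighted sum of density maps (assumed stand-in for learned fusion)."""
    rows, cols = len(level_maps[0]), len(level_maps[0][0])
    return [
        [sum(w * m[r][c] for w, m in zip(weights, level_maps)) for c in range(cols)]
        for r in range(rows)
    ]


def crowd_count(density_map):
    """The estimated count is the integral (sum) of the density map."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(predicted_counts, true_counts):
    """MAE between predicted and ground-truth counts over a set of images."""
    errors = [abs(p - t) for p, t in zip(predicted_counts, true_counts)]
    return sum(errors) / len(errors)


# Two toy levels, e.g. one responding to small heads and one to large heads.
small_people = [[0.0, 1.0], [1.0, 0.0]]
large_people = [[2.0, 0.0], [0.0, 2.0]]
fused = fuse_density_maps([small_people, large_people], weights=[0.5, 0.5])
print(crowd_count(fused))
print(mean_absolute_error([10, 20], [12, 17]))
```

With this definition of MAE, the figures reported for UCF_CC_50 imply a second-best MAE of about 279.6 (242.4 + 37.2).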

4.
IEEE Trans Neural Netw Learn Syst ; 29(5): 1587-1597, 2018 05.
Article in English | MEDLINE | ID: mdl-28328517

ABSTRACT

Network in network (NiN) is an effective instance and an important extension of the deep convolutional neural network, which consists of alternating convolutional and pooling layers. Instead of using a linear filter for convolution, NiN uses a shallow multilayer perceptron (MLP), a nonlinear function, in its place. Because of the power of the MLP and of convolution in the spatial domain, NiN has a stronger feature representation ability and hence achieves better recognition performance. However, the MLP itself consists of fully connected layers, which give rise to a large number of parameters. In this paper, we propose to replace the dense shallow MLP with a sparse shallow MLP. One or more layers of the sparse shallow MLP are sparsely connected in the channel dimension or the channel-spatial domain. The proposed method is implemented by applying unshared convolution across the channel dimension and shared convolution across the spatial dimension in some computational layers. The proposed method is called convolution in convolution (CiC). Experimental results on the CIFAR10 data set, the augmented CIFAR10 data set, and the CIFAR100 data set demonstrate the effectiveness of the proposed CiC method.
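The parameter saving that motivates a sparse MLP can be sketched with a simple weight count. The code below uses one plausible reading of "unshared convolution across the channel dimension": each output unit connects to a window of `window` neighboring input channels with its own weights. The window size, stride, and channel counts are assumptions for illustration, not values from the paper, and biases are ignored.

```python
def dense_layer_params(in_channels, out_channels):
    """Weights in a fully connected (1x1 convolution) layer across channels."""
    return in_channels * out_channels


def sparse_layer_outputs(in_channels, window, stride=1):
    """Number of output units when a window slides along the channel axis."""
    return (in_channels - window) // stride + 1


def sparse_layer_params(in_channels, window, stride=1):
    """Weights when each output unit has its own (unshared) window of weights."""
    return sparse_layer_outputs(in_channels, window, stride) * window


in_channels, window = 192, 32  # hypothetical sizes
out_channels = sparse_layer_outputs(in_channels, window)
print(out_channels)
print(dense_layer_params(in_channels, out_channels))
print(sparse_layer_params(in_channels, window))
```

Under these assumptions the sparse layer needs a fraction `window / in_channels` of the dense layer's weights for the same number of output units.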

5.
IEEE Trans Neural Netw Learn Syst ; 29(7): 2684-2694, 2018 07.
Article in English | MEDLINE | ID: mdl-28504949

ABSTRACT

Conventional convolutional neural networks use either a linear or a nonlinear filter to extract features from an image patch (region) of spatial size $H \times W$ (typically, $H$ is small and equal to $W$; e.g., $H$ is 5 or 7). Generally, the size of the filter is equal to the size $H \times W$ of the input patch. We argue that the representational ability of this equal-size strategy is not strong enough. To overcome this drawback, we propose to use a subpatch filter whose spatial size $h \times w$ is smaller than $H \times W$. The proposed subpatch filter consists of two consecutive filters. The first is a linear filter of spatial size $h \times w$ aimed at extracting features from the spatial domain. The second is of spatial size $1 \times 1$ and is used to strengthen the connection between different input feature channels and to reduce the number of parameters. The subpatch filter convolves with the input patch, and the resulting network is called a subpatch network. Taking the output of one subpatch network as input, we repeatedly construct further subpatch networks until the output contains only one neuron in the spatial domain. These subpatch networks form a new network called the cascaded subpatch network (CSNet). The feature layer generated by CSNet is called the csconv layer. For the whole input image, we construct a deep neural network by stacking a sequence of csconv layers. Experimental results on five benchmark data sets demonstrate the effectiveness and compactness of the proposed CSNet. For example, our CSNet reaches a test error of 5.68% on the CIFAR10 data set without model averaging. To the best of our knowledge, this is the best result ever obtained on the CIFAR10 data set.
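The cascade described above, in which subpatch networks are repeated until a single spatial neuron remains, can be traced with simple size arithmetic. The sketch writes the input patch side as `patch_size` and the subpatch filter side as `subpatch_size`, and assumes square patches, stride 1, and no padding; these are assumptions of the example, not details confirmed by the abstract.

```python
def cascade_sizes(patch_size, subpatch_size):
    """Spatial side length after each subpatch stage, down to a single neuron.

    Assumes a valid (unpadded) stride-1 convolution at every stage.
    """
    if subpatch_size < 2 or subpatch_size > patch_size:
        raise ValueError("subpatch_size must satisfy 2 <= subpatch_size <= patch_size")
    sizes = [patch_size]
    while sizes[-1] > 1:
        next_size = sizes[-1] - subpatch_size + 1
        if next_size < 1:
            raise ValueError("patch cannot be reduced to exactly one neuron")
        sizes.append(next_size)
    return sizes


for patch in (5, 7):
    sizes = cascade_sizes(patch, 3)
    print(patch, sizes, "stages:", len(sizes) - 1)
```

For the patch sizes 5 and 7 mentioned in the abstract, a 3×3 subpatch filter (an assumed value) reaches a single neuron in two and three stages respectively.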
