1.
Article in English | MEDLINE | ID: mdl-38648139

ABSTRACT

Currently prevalent multi-modal 3D detection methods rely on dense detectors that usually use dense Bird's-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps is quadratic to the detection range, making it not scalable for long-range detection. Recently, LiDAR-only fully sparse architecture has been gaining attention for its high efficiency in long-range perception. In this paper, we study how to develop a multi-modal fully sparse detector. Specifically, our proposed detector integrates the well-studied 2D instance segmentation into the LiDAR side, which is parallel to the 3D instance segmentation part in the LiDAR-only baseline. The proposed instance-based fusion framework maintains full sparsity while overcoming the constraints associated with the LiDAR-only fully sparse detector. Our framework showcases state-of-the-art performance on the widely used nuScenes dataset, Waymo Open Dataset, and the long-range Argoverse 2 dataset. Notably, the inference speed of our proposed method under the long-range perception setting is 2.7× faster than that of other state-of-the-art multimodal 3D detection methods. Code is released at https://github.com/BraveGroup/FullySparseFusion.
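The quadratic-cost claim in this abstract can be checked with simple arithmetic. The sketch below is only an illustration, not code from the paper: it counts the cells of a square BEV grid centred on the sensor, and the 0.5 m cell size and the function name `dense_bev_cells` are assumptions made for the example.

```python
def dense_bev_cells(range_m: float, cell_m: float) -> int:
    """Count cells in a square BEV grid covering [-range_m, range_m] on both axes.

    Both arguments are illustrative assumptions; real detectors choose
    their own grid extents and resolutions.
    """
    cells_per_side = round(2 * range_m / cell_m)
    return cells_per_side * cells_per_side


if __name__ == "__main__":
    # Doubling the detection range quadruples the number of dense cells.
    for range_m in (50, 100, 200):
        print(range_m, dense_bev_cells(range_m, cell_m=0.5))
```

A fully sparse detector avoids this growth because its cost follows the number of occupied points or instances rather than the area of the grid.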

2.
IEEE Trans Pattern Anal Mach Intell ; 46(7): 4880-4895, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38319774

ABSTRACT

Data association is at the core of many computer vision tasks, e.g., multiple object tracking (MOT), image matching, and point cloud registration. However, current data association solutions have two defects: they mostly ignore intra-view context information, and they either train deep association models end-to-end while barely exploiting the advantages of optimization-based assignment methods, or use only an off-the-shelf neural network to extract features. In this paper, we propose a general learnable graph matching method to address these issues. Specifically, we model the intra-view relationships as an undirected graph, so that data association becomes a general graph matching problem between graphs. Furthermore, to make the optimization end-to-end differentiable, we relax the original graph matching problem into a continuous quadratic program and then incorporate training into a deep graph neural network using the KKT conditions and the implicit function theorem. On the MOT task, our method achieves state-of-the-art performance on several MOT datasets. For image matching, our method outperforms state-of-the-art methods on a popular indoor dataset, ScanNet. For point cloud registration, we also achieve competitive results.

3.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 12490-12505, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37318978

ABSTRACT

As the perception range of LiDAR expands, LiDAR-based 3D object detection contributes increasingly to long-range perception in autonomous driving. Mainstream 3D object detectors often build dense feature maps, whose cost is quadratic in the perception range, so they scale poorly to long-range settings. To enable efficient long-range detection, we first propose a fully sparse object detector termed FSD. FSD is built upon a general sparse voxel encoder and a novel sparse instance recognition (SIR) module. SIR groups the points into instances and applies highly efficient instance-wise feature extraction. The instance-wise grouping sidesteps the issue of missing center features, which hinders the design of fully sparse architectures. To further exploit the fully sparse design, we leverage temporal information to remove data redundancy and propose a super sparse detector named FSD++. FSD++ first generates residual points, which indicate the point changes between consecutive frames. The residual points, along with a few previous foreground points, form the super sparse input data, greatly reducing data redundancy and computational overhead. We comprehensively analyze our method on the large-scale Waymo Open Dataset and report state-of-the-art performance. To showcase the superiority of our method in long-range detection, we also conduct experiments on the Argoverse 2 dataset, where the perception range ([Formula: see text] m) is much larger than that of the Waymo Open Dataset ([Formula: see text] m).
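One way to picture the "residual points" idea is as a voxel-occupancy difference between two frames. The sketch below rests on that assumption and is not the FSD++ implementation: the abstract does not say how residual points are computed, and the voxel size, the helper names `voxel_of` and `residual_points`, and the omission of ego-motion compensation are all choices made for illustration.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxel_of(point: Point, size: float) -> Voxel:
    """Map a point to the integer index of the voxel that contains it."""
    return (
        math.floor(point[0] / size),
        math.floor(point[1] / size),
        math.floor(point[2] / size),
    )


def residual_points(prev: Iterable[Point], curr: Iterable[Point], size: float = 0.5) -> List[Point]:
    """Keep current-frame points whose voxel was empty in the previous frame.

    Assumes both frames are already expressed in the same coordinate system.
    """
    occupied = {voxel_of(p, size) for p in prev}
    return [p for p in curr if voxel_of(p, size) not in occupied]


if __name__ == "__main__":
    prev_frame = [(0.1, 0.1, 0.1), (1.1, 1.1, 1.1)]
    curr_frame = [(0.2, 0.2, 0.2), (3.1, 3.1, 3.1)]
    # Only the point in a previously empty voxel survives.
    print(residual_points(prev_frame, curr_frame))
```

Under this reading, static background that reappears in the same voxels is dropped, which is one plausible source of the reduced redundancy the abstract describes.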

4.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 9802-9813, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34919516

ABSTRACT

Single-view depth estimation using CNNs trained on unlabelled videos has shown significant promise. However, excellent results have mostly been obtained in street-scene driving scenarios, and such methods often fail in other settings, particularly indoor videos taken by handheld devices. In this work, we establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth. Our fundamental analysis suggests that rotation behaves as noise during training, as opposed to translation (baseline), which provides supervision signals. To address the challenge, we propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning. The significantly improved performance validates our motivation. Towards end-to-end learning without requiring pre-processing, we propose an Auto-Rectify Network with novel loss functions, which can automatically learn to rectify images during training. Consequently, our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset. We also demonstrate the generalization of our trained model on ScanNet and Make3D, and the universality of our proposed learning method on the 7-Scenes and KITTI datasets.


Subject(s)
Algorithms
5.
IEEE Trans Pattern Anal Mach Intell ; 43(9): 2891-2904, 2021 Sep.
Article in English | MEDLINE | ID: mdl-32866093

ABSTRACT

Recently, neural architecture search (NAS) has attracted great interest in both academia and industry. However, it remains challenging because of its huge and non-continuous search space. Instead of applying evolutionary algorithms or reinforcement learning as in previous works, this paper proposes a direct sparse optimization NAS (DSO-NAS) method. The motivation behind DSO-NAS is to address the task from the viewpoint of model pruning. To achieve this goal, we start from a completely connected block and introduce scaling factors to scale the information flow between operations. Next, sparse regularizations are imposed to prune useless connections in the architecture. Lastly, an efficient and theoretically sound optimization method is derived to solve the resulting problem. Our method is both differentiable and efficient, so it can be directly applied to large datasets like ImageNet and to tasks beyond classification. In particular, on the CIFAR-10 dataset, DSO-NAS achieves an average test error of 2.74 percent, while on the ImageNet dataset DSO-NAS achieves 25.4 percent test error under 600M FLOPs with 8 GPUs in 18 hours. For the semantic segmentation task, DSO-NAS also achieves competitive results compared with manually designed architectures on the PASCAL VOC dataset. Code is available at https://github.com/XinbangZhang/DSO-NAS.
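The pruning view described here, scaling factors on connections plus a sparsity penalty, can be illustrated with a generic L1 proximal step (soft-thresholding). This is a swapped-in textbook technique, not the optimization method derived in the paper; the edge names, the threshold value, and the functions `soft_threshold` and `prune_connections` are hypothetical.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]


def soft_threshold(value: float, threshold: float) -> float:
    """Proximal operator of the L1 penalty: shrink toward zero, clip small values to zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def prune_connections(scales: Dict[Edge, float], threshold: float) -> Dict[Edge, float]:
    """Shrink every scaling factor and drop the connections whose factor becomes zero."""
    shrunk = {edge: soft_threshold(s, threshold) for edge, s in scales.items()}
    return {edge: s for edge, s in shrunk.items() if s != 0.0}


if __name__ == "__main__":
    # Hypothetical scaling factors on the edges of a fully connected block.
    scales = {("op1", "op2"): 0.9, ("op1", "op3"): 0.05, ("op2", "op3"): -0.4}
    print(prune_connections(scales, threshold=0.1))
```

In a real search the shrinkage would alternate with gradient updates of the network weights; the example shows only how a sparsity-inducing step removes weak connections.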

6.
IEEE Trans Pattern Anal Mach Intell ; 36(2): 388-403, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24356357

ABSTRACT

We address the problem of approximate nearest neighbor (ANN) search for visual descriptor indexing. Most spatial partition trees, such as KD trees, VP trees, and so on, follow the hierarchical binary space partitioning framework. The key effort is to design different partition functions (hyperplane or hypersphere) to divide the points so that 1) the data points can be well grouped to support effective NN candidate location and 2) the partition functions can be quickly evaluated to support efficient NN candidate location. We design a trinary-projection direction-based partition function. The trinary-projection direction is defined as a combination of a few coordinate axes with the weights being 1 or -1. We pursue the projection direction using the widely adopted maximum variance criterion to guarantee good space partitioning and find fewer coordinate axes to guarantee efficient partition function evaluation. We present a coordinate-wise enumeration algorithm to find the principal trinary-projection direction. In addition, we provide an extension using multiple randomized trees for improved performance. We justify our approach on large-scale local patch indexing and similar image search.
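To make the trinary-projection idea concrete, the sketch below searches by brute force for a direction with weights in {-1, 0, 1} that maximizes projected variance, then splits the points at the median projection. It is an illustration only: the paper's coordinate-wise enumeration algorithm is not reproduced, and the normalization by the number of active axes, the median threshold, and the names `project`, `best_trinary_direction`, and `split` are assumptions.

```python
from itertools import combinations, product
from statistics import median, pvariance
from typing import List, Sequence, Tuple


def project(point: Sequence[float], weights: Sequence[int]) -> float:
    """Project a point onto a direction whose weights are -1, 0, or 1."""
    return sum(w * x for w, x in zip(weights, point))


def best_trinary_direction(points: List[Sequence[float]], max_axes: int = 2) -> Tuple[Tuple[int, ...], float]:
    """Exhaustively search directions with at most max_axes non-zero weights.

    The variance is divided by the number of active axes so that directions
    using more axes are compared as if they had unit length (an assumption).
    """
    dim = len(points[0])
    best_weights, best_score = None, -1.0
    for k in range(1, max_axes + 1):
        for axes in combinations(range(dim), k):
            for signs in product((1, -1), repeat=k):
                weights = [0] * dim
                for axis, sign in zip(axes, signs):
                    weights[axis] = sign
                score = pvariance([project(p, weights) for p in points]) / k
                if score > best_score:
                    best_weights, best_score = tuple(weights), score
    return best_weights, best_score


def split(points: List[Sequence[float]], weights: Sequence[int]):
    """Partition points at the median of their projections."""
    threshold = median(project(p, weights) for p in points)
    left = [p for p in points if project(p, weights) <= threshold]
    right = [p for p in points if project(p, weights) > threshold]
    return left, right


if __name__ == "__main__":
    pts = [(0, 0, 5), (1, 1, 5), (2, 2, 5), (3, 3, 5)]
    w, score = best_trinary_direction(pts)
    print(w, score, split(pts, w))
```

Evaluating such a partition function needs only additions and subtractions over a few coordinates, which matches the efficiency motivation given in the abstract.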


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Subtraction Technique , Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
7.
Phys Rev E Stat Nonlin Soft Matter Phys ; 79(3 Pt 2): 036406, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19392063

ABSTRACT

A two-phase model, in which the plasma expands isothermally while the laser irradiates the target and adiabatically after the laser pulse ends, is proposed to predict the maximum energy of the proton beams produced in ultraintense laser-foil interactions. Hot-electron recirculation in ultraintense laser-solid interactions is accounted for in the model through a continuous, time-dependent hot-electron density. The dilution of the electron density as electrons recirculate and spread laterally is also considered. With this model, scaling laws for the maximum ion energy are derived, and the dependence of the scaling coefficients on laser intensity, pulse duration, and target thickness is obtained. The model predicts several notable results: the adiabatic expansion is an important part of the ion acceleration and cannot be neglected; the total acceleration time is about 10-20 times the laser-pulse duration; and the higher the laser intensity, the more sensitive the maximum ion energy is to changes in the focus radius.

8.
Phys Rev E Stat Nonlin Soft Matter Phys ; 80(5 Pt 2): 056403, 2009 Nov.
Article in English | MEDLINE | ID: mdl-20365079

ABSTRACT

An analytical expression is proposed to describe the front shape of a non-quasi-neutral plasma expansion with anisotropic electron pressures. It is significant for the study of ultrashort plasma expansions generated by laser-foil interactions and of anisotropic astrophysical plasma expansions in space science. It is found that the plasma front shape depends on the relationship between the ratio of the longitudinal to the transverse temperature of the hot electrons, κ², and the electron-ion mass ratio μ. For κ² ∈ (μ, 1], the ion front is part of an ellipse whose major axis lies along the lower-temperature axis. For κ² ≤ μ, the ion front is composed of part of a hyperbola and a small pointed projection at the center. In the strongly anisotropic region, there is an ultrashort anomalous plasma emission lasting tens of femtoseconds at an angle of nearly 90 degrees. The ion-velocity distribution and the angular-energy distribution at the ion front are also given. In particular, anomalous positron emissions exist in the anisotropic expansion of an electron-positron plasma.

9.
Nanotechnology ; 19(40): 405503, 2008 Oct 08.
Article in English | MEDLINE | ID: mdl-21832619

ABSTRACT

Porous zinc ferrite (ZnFe₂O₄) nanorods with a diameter of around 50 nm and a length of several micrometers have been synthesized by a microemulsion-based method in combination with calcination at 500 °C. The morphology and structure of the ZnFe₂O₄ nanorods and their precursor (ZnFe₂(C₂O₄)₃ nanorods) were systematically characterized by x-ray powder diffraction, transmission electron microscopy, field emission scanning electron microscopy and high-resolution transmission electron microscopy. The formation mechanism for the porous ZnFe₂O₄ nanorods is also discussed. Moreover, the porous ZnFe₂O₄ nanorods were applied in a room-temperature ethanol sensor and exhibited much better sensing performance than ZnFe₂O₄ nanoparticles.

10.
Guang Pu Xue Yu Guang Pu Fen Xi ; 24(3): 317-9, 2004 Mar.
Article in Chinese | MEDLINE | ID: mdl-15759986

ABSTRACT

With the adoption of Fourier transform infrared spectrographs and the development of chemical metrology, Fourier transform attenuated total reflection infrared spectroscopy (ATR-FTIR) has become a useful tool for analyzing samples for which the traditional transmission method is not effective, and for analyzing surface-layer structure. ATR-FTIR has been applied in many fields, such as textiles, quality testing, and public security. At present, applied studies based on the specific properties of ATR-FTIR are under way. This paper therefore reviews the current state and development of ATR-FTIR, including theoretical developments in depth profiling, qualitative analysis of material surfaces, quantitative analysis of the components of mixtures and solutions, and the use of fibers as ATR accessories. Some special applications are also covered, such as inspecting the structure and orientation changes of hollow-core fibers and studying the mechanism of action of skin promoters.
