1.
Neural Netw ; 157: 216-225, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36347092

ABSTRACT

Mainstream unsupervised domain adaptation (UDA) methods align feature distributions across different domains via adversarial learning. However, most of them focus on global distribution alignment and ignore fine-grained domain discrepancy. In addition, they generally require auxiliary models, which adds computation cost. To tackle these issues, this study proposes a UDA method that differentiates individual samples without the help of extra models. To this end, we introduce a novel discrepancy metric, termed style discrepancy, to distinguish different target samples. We also propose a paradigm for adversarial style discrepancy minimization (ASDM). Specifically, we fix the parameters of the feature extractor and maximize style discrepancy to update the classifier, which helps detect more hard samples. Conversely, we fix the parameters of the classifier and minimize the style discrepancy to update the feature extractor, pushing those hard samples near the support of the source distribution. This adversarial process helps to progressively detect and adapt more hard samples, leading to fine-grained domain adaptation. Experiments on different UDA tasks validate the effectiveness of ASDM. Overall, without any extra models, ASDM reaches 46.9% mIoU on the GTA5-to-Cityscapes benchmark and 84.7% accuracy on the VisDA-2017 benchmark, outperforming many existing adversarial-learning-based methods.
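
For readers unfamiliar with the segmentation metric quoted above, mIoU (mean intersection over union) averages the per-class overlap between predicted and ground-truth labels. The sketch below is illustrative only: the function name `mean_iou` and the flat label lists are assumptions, and benchmark protocols typically accumulate counts over the whole evaluation set rather than per image.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes that appear in pred or target.

    pred, target: equal-length sequences of integer class labels (hypothetical
    flat layout, one label per pixel).
    """
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.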


Subject(s)
Benchmarking , Learning
2.
IEEE Trans Pattern Anal Mach Intell ; 44(4): 2108-2125, 2022 Apr.
Article in English | MEDLINE | ID: mdl-32976095

ABSTRACT

Stereo image pairs encode 3D scene cues into stereo correspondences between the left and right images. To exploit 3D cues within stereo images, recent CNN-based methods commonly use cost volume techniques to capture stereo correspondence over large disparities. However, since disparities can vary significantly for stereo cameras with different baselines, focal lengths, and resolutions, the fixed maximum disparity used in cost volume techniques prevents them from handling stereo image pairs with large disparity variations. In this paper, we propose a generic parallax-attention mechanism (PAM) to capture stereo correspondence regardless of disparity variations. Our PAM integrates epipolar constraints with an attention mechanism to calculate feature similarities along the epipolar line to capture stereo correspondence. Based on our PAM, we propose a parallax-attention stereo matching network (PASMnet) and a parallax-attention stereo image super-resolution network (PASSRnet) for stereo matching and stereo image super-resolution tasks. Moreover, we introduce a new, large-scale dataset named Flickr1024 for stereo image super-resolution. Experimental results show that our PAM is generic and can effectively learn stereo correspondence under large disparity variations in an unsupervised manner. Comparative results show that our PASMnet and PASSRnet achieve state-of-the-art performance.
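
The idea of computing feature similarities along the epipolar line can be illustrated with a minimal sketch. It assumes a rectified stereo pair, so that each epipolar line is an image row, and it uses plain dot-product similarity followed by a softmax. The function names and the list-of-vectors layout are hypothetical, and this is not the network described in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def row_attention(left_row, right_row):
    """Attention of every left-row position over all right-row positions.

    left_row, right_row: lists of feature vectors, one per column of the same
    image row. No maximum disparity is fixed: every column is a candidate.
    """
    attention = []
    for query in left_row:
        scores = [sum(a * b for a, b in zip(query, key)) for key in right_row]
        attention.append(softmax(scores))
    return attention


def soft_disparity(attention):
    """Left column index minus the attention-weighted matching right column."""
    return [
        i - sum(j * w for j, w in enumerate(weights))
        for i, weights in enumerate(attention)
    ]
```

Because the attention spans the whole row, the same code handles small and large disparities without a preset search limit, which is the property the abstract emphasizes.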

3.
Sensors (Basel) ; 21(4), 2021 Feb 18.
Article in English | MEDLINE | ID: mdl-33670686

ABSTRACT

Stereo matching is an important research field in computer vision. Because of the dimensionality of cost aggregation, current neural-network-based stereo methods struggle to balance speed and accuracy. To this end, we integrate fast 2D stereo methods with accurate 3D networks to improve performance and reduce running time. We leverage a 2D encoder-decoder network to generate a rough disparity map and construct a disparity range to guide the 3D aggregation network, which significantly improves accuracy and reduces computational cost. We use a stacked hourglass structure to refine the disparity from coarse to fine. We evaluated our method on three public datasets. According to the results on the official KITTI website, our network can generate an accurate result in 80 ms on a modern GPU. Compared to other 2D stereo networks (AANet, DeepPruner, FADNet, etc.), our network is considerably more accurate. Meanwhile, it is significantly faster than other 3D stereo networks (5× faster than PSMNet, 7.5× faster than CSN, and 22.5× faster than GANet, etc.), demonstrating the effectiveness of our method.
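
The saving from a guided disparity range can be sketched on a single scanline. In this hedged example a rough disparity estimate restricts the candidates examined for each pixel to a small window, and a simple absolute-difference cost with winner-take-all selection stands in for the 3D aggregation network. The function name, the `radius` parameter, and the cost function are assumptions, not the authors' design.

```python
def refine_disparity(left, right, coarse, radius, max_disp):
    """Refine a rough per-pixel disparity by searching only near the estimate.

    left, right: intensities along one rectified scanline.
    coarse: rough integer disparity per left pixel.
    radius: half-width of the guided search window (hypothetical parameter).
    """
    refined = []
    for x, d0 in enumerate(coarse):
        low = max(0, d0 - radius)
        high = min(max_disp, d0 + radius)
        best_d, best_cost = d0, float("inf")
        for d in range(low, high + 1):
            if x - d < 0:  # candidate falls outside the right image
                continue
            cost = abs(left[x] - right[x - d])
            if cost < best_cost:
                best_d, best_cost = d, cost
        refined.append(best_d)
    return refined
```

Each pixel is compared against at most `2 * radius + 1` candidates instead of `max_disp + 1`, which is where the reduction in computation comes from when the rough estimate is close to the true disparity.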

4.
IEEE Trans Pattern Anal Mach Intell ; 43(1): 300-315, 2021 Jan.
Article in English | MEDLINE | ID: mdl-31329107

ABSTRACT

For CNN-based stereo matching methods, cost volumes play an important role in achieving good matching accuracy. In this paper, we present an end-to-end trainable convolutional neural network that makes full use of cost volumes for stereo matching. Our network consists of three sub-modules: shared feature extraction, initial disparity estimation, and disparity refinement. Cost volumes are calculated at multiple levels using the shared features and are used in both the initial disparity estimation and disparity refinement sub-modules. To improve the efficiency of disparity refinement, multi-scale feature constancy is introduced to measure the correctness of the initial disparity in feature space. These sub-modules are tightly coupled, making the network compact and easy to train. Moreover, we investigate the problem of developing a robust model that performs well across multiple datasets with different characteristics. We achieve this by introducing a two-stage finetuning scheme to gently transfer the model to target datasets. Specifically, in the first stage, the model is finetuned using both a large synthetic dataset and the target datasets with a relatively large learning rate, while in the second stage the model is trained using only the target datasets with a small learning rate. The proposed method is tested on several benchmarks, including the Middlebury 2014, KITTI 2015, ETH3D 2017, and SceneFlow datasets. Experimental results show that our method achieves state-of-the-art performance on all the datasets. The proposed method also won first prize in the Stereo task of the Robust Vision Challenge 2018.
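
The two-stage finetuning scheme can be written down as a small schedule. This is only a sketch of the structure described in the abstract: the `Stage` type, the function name, and the default learning rates are placeholders, since the abstract does not give the actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    datasets: tuple
    learning_rate: float


def two_stage_schedule(synthetic, targets, lr_large=1e-4, lr_small=1e-5):
    """Build the two finetuning stages.

    lr_large and lr_small are placeholder values (assumptions); only their
    ordering, large first and small second, comes from the abstract.
    """
    if lr_small >= lr_large:
        raise ValueError("stage 2 must use a smaller learning rate than stage 1")
    return [
        # Stage 1: synthetic data mixed with the target datasets, larger rate.
        Stage("stage1", (synthetic, *targets), lr_large),
        # Stage 2: target datasets only, smaller rate.
        Stage("stage2", tuple(targets), lr_small),
    ]
```

A call such as `two_stage_schedule("SceneFlow", ["KITTI 2015", "ETH3D 2017"])` would yield a first stage over all three datasets and a second stage over the two target datasets only. Which dataset plays the synthetic role in the authors' experiments is not stated in the abstract, so that pairing is illustrative.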

5.
Comput Intell Neurosci ; 2017: 4629534, 2017.
Article in English | MEDLINE | ID: mdl-28293256

ABSTRACT

Learning-to-rank algorithms have become important in recent years due to their successful application in information retrieval, recommender systems, computational biology, and so forth. The ranking support vector machine (RankSVM) is one of the state-of-the-art ranking models and has been widely used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problems. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of the kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. A primal truncated Newton method is used to optimize the pairwise L2-loss (squared hinge loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method trains much faster than kernel RankSVM and achieves comparable or better performance than state-of-the-art ranking algorithms.
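
As a rough illustration of why kernel approximation avoids the kernel matrix, the sketch below maps inputs to random Fourier features whose inner product approximates an RBF kernel, and evaluates a pairwise squared hinge loss on a linear model in that feature space. The RBF kernel choice, the function names, and the parameters are assumptions made for the example. The truncated Newton optimizer and the Nyström variant are not shown.

```python
import math
import random


def make_rff(dim, num_features, gamma, seed=0):
    """Return a feature map z with z(x)·z(y) ≈ exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)  # std of the Gaussian spectral samples
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
               for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    norm = math.sqrt(2.0 / num_features)

    def transform(x):
        return [
            norm * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return transform


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel, used here only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def pairwise_squared_hinge(w, features, pairs):
    """Sum over pairs (i, j), i preferred to j, of max(0, 1 - w·(z_i - z_j))^2."""
    loss = 0.0
    for i, j in pairs:
        margin = sum(w_k * (a - b)
                     for w_k, a, b in zip(w, features[i], features[j]))
        loss += max(0.0, 1.0 - margin) ** 2
    return loss
```

With the features fixed, the ranking model is linear in `w`, so training cost depends on the number of random features rather than on the number of entries in a full kernel matrix.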


Subject(s)
Artificial Intelligence , Models, Theoretical , Support Vector Machine , Computational Biology/methods