Results 1 - 5 of 5

1.
IEEE Trans Image Process ; 32: 2160-2173, 2023.
Article in English | MEDLINE | ID: mdl-37027289

ABSTRACT

RGB-D saliency detection aims to fuse multi-modal cues to accurately localize salient regions. Existing works often adopt attention modules for feature modeling, but few methods explicitly leverage fine-grained details and merge them with semantic cues. Thus, despite the auxiliary depth information, it is still challenging for existing models to distinguish objects with similar appearances but at distinct camera distances. In this paper, we propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection. Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies. To realize multi-modal and multi-level fusion, we first use a granularity-based attention scheme to strengthen the discriminatory power of RGB and depth features separately. Then we introduce a unified cross dual-attention module for multi-modal and multi-level fusion in a coarse-to-fine manner. The encoded multi-modal features are gradually aggregated into a shared decoder. Further, we exploit a multi-scale loss to take full advantage of the hierarchical information. Extensive experiments on challenging benchmark datasets demonstrate that our HiDAnet performs favorably over the state-of-the-art methods by large margins. The source code is available at https://github.com/Zongwei97/HIDANet/.

2.
J Imaging ; 9(2)2023 Jan 28.
Article in English | MEDLINE | ID: mdl-36826948

ABSTRACT

Omnidirectional images have drawn great research attention recently thanks to their potential and performance in various computer vision tasks. However, processing such images requires an adaptation that takes spherical distortions into account. It is therefore not trivial to directly extend conventional convolutional neural networks (CNNs) to omnidirectional images, because CNNs were initially developed for perspective images. In this paper, we present a general method to adapt perspective convolutional networks to equirectangular images, forming a novel distortion-aware convolution. Our proposed solution can be regarded as a replacement for the existing convolutional network without incurring any additional training cost. To verify the generalization of our method, we conduct an analysis on three basic vision tasks, i.e., semantic segmentation, optical flow, and monocular depth. The experiments on both virtual and real outdoor scenarios show that our adapted spherical models consistently outperform their counterparts.
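
As an illustration of what a distortion-aware kernel can look like, the following is a minimal sketch, not the paper's implementation. It assumes the commonly used construction in which the kernel lies on the plane tangent to the sphere at the centre pixel and is mapped back to the equirectangular image by the inverse gnomonic projection. The function name `kernel_sampling_grid`, the pixel-centre convention, and the choice of tangent-plane step (one equatorial pixel) are assumptions made for this example.

```python
import numpy as np

def kernel_sampling_grid(row, col, height, width, ksize=3):
    """Sampling positions for a ksize x ksize kernel centred at (row, col).

    Hypothetical helper: the kernel is laid out on the plane tangent to the
    unit sphere at the centre pixel and projected back to equirectangular
    pixel coordinates with the inverse gnomonic projection.
    Returns two float arrays (rows, cols) of shape (ksize, ksize).
    """
    # Centre pixel -> longitude in [-pi, pi), latitude in (-pi/2, pi/2).
    lon0 = (col + 0.5) / width * 2.0 * np.pi - np.pi
    lat0 = np.pi / 2.0 - (row + 0.5) / height * np.pi

    # Assumed step on the tangent plane: the angular width of one pixel
    # at the equator, so the kernel is (almost) regular there.
    step = np.tan(2.0 * np.pi / width)
    r = ksize // 2
    offs = np.arange(-r, r + 1) * step
    # x grows to the right, y grows upwards (first kernel row is "above").
    x, y = np.meshgrid(offs, -offs)

    # Inverse gnomonic projection (division-free simplified form).
    rho2 = x ** 2 + y ** 2
    lat = np.arcsin((np.sin(lat0) + y * np.cos(lat0)) / np.sqrt(1.0 + rho2))
    lon = lon0 + np.arctan2(x, np.cos(lat0) - y * np.sin(lat0))

    # Spherical angles -> fractional pixel coordinates (longitude wraps).
    cols = ((lon + np.pi) / (2.0 * np.pi) * width - 0.5) % width
    rows = (np.pi / 2.0 - lat) / np.pi * height - 0.5
    return rows, cols

if __name__ == "__main__":
    # Near the equator the grid is close to a regular 3x3 neighbourhood;
    # near the pole it spreads over many more columns.
    _, cols_equator = kernel_sampling_grid(32, 64, 64, 128)
    _, cols_pole = kernel_sampling_grid(2, 64, 64, 128)
    print(cols_equator[1], cols_pole[1])
```

Because the returned positions are fractional, a network using them would typically read feature values by bilinear interpolation. The learned weights themselves are left unchanged, which is consistent with the abstract's remark that no additional training cost is required.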

3.
J Imaging ; 6(8)2020 Aug 03.
Article in English | MEDLINE | ID: mdl-34460692

ABSTRACT

Human visual perception uses structural information to recognize stereo correspondences in natural scenes. Therefore, structural information is important for building an efficient stereo matching algorithm. In this paper, we demonstrate that incorporating the structural information similarity between two patches, extracted either directly from image intensity (SSIM) or from image gradients (GSSIM), can accurately describe the patch structures and thus provides more reliable initial cost values. We also address one of the major phenomena encountered in stereo matching for real-world scenes: radiometric changes. The performance of the proposed cost functions was evaluated in two stages: the first considers these costs without an aggregation process, while the second uses the fast adaptive aggregation technique. The experiments were conducted on the KITTI 2012 and KITTI 2015 benchmarks of real road traffic scenes. The obtained results demonstrate the potential merits of the proposed stereo similarity measurements under radiometric changes.
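
The idea of using SSIM as an initial matching cost can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' cost function: it uses the standard single-window SSIM formula with the usual constants, maps it to a cost with the hypothetical rule `(1 - SSIM) / 2`, and picks the disparity by winner-take-all without any aggregation. The names `ssim_patch`, `ssim_cost` and `best_disparity` are invented for this example.

```python
import numpy as np

def ssim_patch(p, q, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized grayscale patches."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_p, mu_q = p.mean(), q.mean()
    var_p, var_q = p.var(), q.var()
    cov = ((p - mu_p) * (q - mu_q)).mean()
    num = (2.0 * mu_p * mu_q + c1) * (2.0 * cov + c2)
    den = (mu_p ** 2 + mu_q ** 2 + c1) * (var_p + var_q + c2)
    return num / den

def ssim_cost(p, q):
    """Assumed mapping from SSIM in [-1, 1] to a cost in [0, 1]."""
    return (1.0 - ssim_patch(p, q)) / 2.0

def best_disparity(left, right, row, col, half, max_disp):
    """Winner-take-all disparity for one left-image pixel (no aggregation)."""
    ref = left[row - half:row + half + 1, col - half:col + half + 1]
    costs = []
    for d in range(max_disp + 1):
        c = col - d  # a left pixel at column col appears at col - d on the right
        if c - half < 0:
            break
        cand = right[row - half:row + half + 1, c - half:c + half + 1]
        costs.append(ssim_cost(ref, cand))
    return int(np.argmin(costs))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.uniform(0.0, 255.0, size=(20, 40))
    # Right view: shifted by 3 pixels, with a gain and offset change
    # standing in for a radiometric difference between the cameras.
    right = 0.8 * np.roll(left, -3, axis=1) + 10.0
    print(best_disparity(left, right, row=10, col=20, half=2, max_disp=8))
```

The gain and offset applied to the right image lower the SSIM of the true match somewhat, but the covariance term keeps it well above that of unrelated patches, which is the kind of behaviour the abstract refers to when it mentions radiometric changes.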

4.
IEEE Trans Pattern Anal Mach Intell ; 39(2): 327-341, 2017 02.
Article in English | MEDLINE | ID: mdl-27019476

ABSTRACT

In this paper, we explore the different minimal solutions for homography-based egomotion estimation of a camera between calibrated images when the gravity vector is known. These solutions depend on the prior knowledge about the reference plane used by the homography. We then demonstrate that the number of matched points can vary from two to three and that a direct closed-form solution or a Gröbner-basis-based solution can be derived according to this plane. Extensive experiments on synthetic and real sequences in indoor and outdoor environments show the efficiency and the robustness of our approach compared to standard methods.
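
The range of two to three matched points can be motivated by a counting argument on the standard plane-induced homography. The block below is a sketch under assumed conventions (the sign of the translation term, the choice of the y-axis as the gravity direction, and the symbols $\theta$, $\mathbf{t}$, $\mathbf{n}$, $d$ are not taken from the abstract) and is not the paper's derivation.

```latex
% Plane-induced homography between two calibrated views
% (plane normal n, distance d; the sign depends on the convention used).
\begin{align}
  \mathbf{x}_2 &\simeq \mathbf{H}\,\mathbf{x}_1, &
  \mathbf{H} &= \mathbf{R} - \frac{1}{d}\,\mathbf{t}\,\mathbf{n}^{\top}.
\end{align}
% With both cameras pre-rotated so that their y-axes point along gravity,
% the remaining rotation is a single angle about that axis.
\begin{equation}
  \mathbf{R} = \mathbf{R}_y(\theta) =
  \begin{pmatrix}
    \cos\theta & 0 & \sin\theta \\
    0 & 1 & 0 \\
    -\sin\theta & 0 & \cos\theta
  \end{pmatrix}.
\end{equation}
% Known normal, e.g. a ground plane with n = (0, 1, 0)^T:
%   unknowns are theta and t/d, i.e. 1 + 3 = 4.
% Each point correspondence yields 2 independent constraints,
%   so 2 correspondences suffice.
% Unknown plane orientation: n adds 2 unknowns, giving 6 in total,
%   so 3 correspondences are needed.
```

Under these assumptions, the more that is known about the reference plane, the fewer correspondences are needed, which matches the abstract's statement that the solutions depend on prior knowledge about the plane.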

5.
IEEE Trans Pattern Anal Mach Intell ; 35(7): 1565-76, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23681987

ABSTRACT

Data correspondence/grouping under an unknown parametric model is a fundamental topic in computer vision. Finding feature correspondences between two images is probably the most popular application of this research field, and is the main motivation of our work. It is a key ingredient for a wide range of vision tasks, including three-dimensional reconstruction and object recognition. Existing feature correspondence methods are based on either local appearance similarity or global geometric consistency, or on a heuristic combination of both. None of these methods is fully satisfactory, especially in the presence of repetitive image textures or mismatches. In this paper, we present a new algorithm that combines the benefits of both appearance-based and geometry-based methods and mathematically guarantees global optimality. Our algorithm accepts the two sets of features extracted from two images as input, and outputs the feature correspondences with the largest number of inliers, which satisfy both the appearance similarity and geometric constraints. Specifically, we formulate the problem as a mixed integer program and solve it efficiently by a series of linear programs via a branch-and-bound procedure. We subsequently generalize our framework to data correspondence/grouping under an unknown parametric model and show that it can be applied to certain classes of computer vision problems. Our algorithm has been validated successfully on synthesized data and challenging real images.
