Results 1 - 7 of 7
1.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 13035-13053, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37186524

ABSTRACT

Manhattan and Atlanta worlds hold for structured scenes with only vertical and horizontal dominant directions (DDs). To describe scenes with additional sloping DDs, a mixture of independent Manhattan worlds seems plausible, but may lead to unaligned and unrelated DDs. By contrast, we propose a novel structural model called Hong Kong world. It is more general than Manhattan and Atlanta worlds since it can represent environments with slopes, e.g., a city with hilly terrain, a house with a sloping roof, and a loft apartment with a staircase. Moreover, it is more compact and accurate than a mixture of independent Manhattan worlds by enforcing the orthogonality constraints between not only vertical and horizontal DDs, but also horizontal and sloping DDs. We further leverage the structural regularity of Hong Kong world for line-based SLAM. Our SLAM method is reliable thanks to three technical novelties. First, we estimate DDs/vanishing points in Hong Kong world in a semi-searching way. We use a new consensus voting strategy for search, instead of traditional branch and bound. This method is the first one that can simultaneously determine the number of DDs, and achieve quasi-global optimality in terms of the number of inliers. Second, we compute the camera pose by exploiting the spatial relations between DDs in Hong Kong world. This method generates concise polynomials, and thus is more accurate and efficient than existing approaches designed for unstructured scenes. Third, we refine the estimated DDs in Hong Kong world by a novel filter-based method. Then we use these refined DDs to optimize the camera poses and 3D lines, leading to higher accuracy and robustness than existing optimization algorithms. In addition, we establish the first dataset of sequential images in Hong Kong world. Experiments showed that our approach outperforms state-of-the-art methods in terms of accuracy and/or efficiency.
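
The orthogonality constraints named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the function names, the angular tolerance, and the reading that each sloping DD must be orthogonal to at least one horizontal DD are all assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def is_orthogonal(a, b, tol_deg=1.0):
    """True if the angle between a and b is within tol_deg of 90 degrees."""
    # |cos(theta)| <= sin(tol) is equivalent to |theta - 90 deg| <= tol.
    return abs(dot(normalize(a), normalize(b))) <= math.sin(math.radians(tol_deg))

def satisfies_hong_kong_constraints(vertical, horizontals, slopings, tol_deg=1.0):
    """Check the two constraint families mentioned in the abstract.

    Assumption (hypothetical reading): every horizontal DD is orthogonal to
    the vertical DD, and every sloping DD is orthogonal to at least one
    horizontal DD.
    """
    if not all(is_orthogonal(vertical, h, tol_deg) for h in horizontals):
        return False
    return all(
        any(is_orthogonal(s, h, tol_deg) for h in horizontals)
        for s in slopings
    )

# A roof sloping at 30 degrees whose ridge runs along the x-axis.
vertical = (0.0, 0.0, 1.0)
horizontals = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
roof = (0.0, math.cos(math.radians(30)), math.sin(math.radians(30)))
print(satisfies_hong_kong_constraints(vertical, horizontals, [roof]))
```

A sloping direction such as (1, 1, 1), which is orthogonal to neither horizontal DD, fails the check under this reading.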

2.
IEEE Trans Pattern Anal Mach Intell ; 44(11): 8403-8419, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34428135

ABSTRACT

We propose a new linear RGB-D simultaneous localization and mapping (SLAM) formulation by utilizing planar features of the structured environments. The key idea is to understand a given structured scene and exploit its structural regularities such as the Manhattan world. This understanding allows us to decouple the camera rotation by tracking structural regularities, which makes SLAM problems free from being highly nonlinear. Additionally, it provides a simple yet effective cue for representing planar features, which leads to a linear SLAM formulation. Given an accurate camera rotation, we jointly estimate the camera translation and planar landmarks in the global planar map using a linear Kalman filter. Our linear SLAM method, called L-SLAM, can understand not only the Manhattan world but the more general scenario of the Atlanta world, which consists of a vertical direction and a set of horizontal directions orthogonal to the vertical direction. To this end, we introduce a novel tracking-by-detection scheme that infers the underlying scene structure by Atlanta representation. With efficient Atlanta representation, we formulate a unified linear SLAM framework for structured environments. We evaluate L-SLAM on a synthetic dataset and RGB-D benchmarks, demonstrating comparable performance to other state-of-the-art SLAM methods without using expensive nonlinear optimization. We assess the accuracy of L-SLAM on a practical application of augmented reality.
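
To see why a known rotation makes the remaining estimation linear, consider a toy one-axis case sketched below in Python. This is an illustrative assumption, not the L-SLAM formulation: the state holds a camera position `t` and a plane offset `d` along a single structural axis, and the hypothetical measurement is the camera-to-plane distance `z = d - t`, which is linear in the state, so a standard Kalman update applies without linearization.

```python
def kalman_update(x, P, z, r):
    """One linear Kalman measurement update for the state x = [t, d].

    Assumed measurement model (illustrative): z = d - t + noise,
    i.e. H = [-1, 1], with measurement noise variance r.
    """
    H = (-1.0, 1.0)
    # Innovation: measured distance minus predicted distance.
    y = z - (H[0] * x[0] + H[1] * x[1])
    # P H^T (2x1) and innovation variance S (scalar).
    PHt = (P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1])
    S = H[0] * PHt[0] + H[1] * PHt[1] + r
    # Kalman gain K = P H^T / S.
    K = (PHt[0] / S, PHt[1] / S)
    x_new = [x[0] + K[0] * y, x[1] + K[1] * y]
    # Covariance update P_new = (I - K H) P.
    HP = (H[0] * P[0][0] + H[1] * P[1][0],
          H[0] * P[0][1] + H[1] * P[1][1])
    P_new = [[P[i][j] - K[i] * HP[j] for j in range(2)] for i in range(2)]
    return x_new, P_new

# Plane offset well known (d = 5), camera position uncertain (t = 0 +/- 1).
x0 = [0.0, 5.0]
P0 = [[1.0, 0.0], [0.0, 1e-6]]
x1, P1 = kalman_update(x0, P0, z=3.0, r=0.01)
print(x1, P1[0][0])
```

With a measured distance of 3 to a plane at offset 5, the update moves the camera estimate close to 2 and shrinks its variance, while the well-known plane offset barely changes.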

3.
IEEE Trans Pattern Anal Mach Intell ; 44(9): 5460-5471, 2022 Sep.
Article in English | MEDLINE | ID: mdl-34057889

ABSTRACT

Taking selfies has become one of the major photographic trends of our time. In this study, we focus on the selfie stick, on which a camera is mounted to take selfies. We observe that a camera on a selfie stick typically travels through a particular type of trajectory around a sphere. Based on this finding, we propose a robust, efficient, and optimal estimation method for relative camera pose between two images captured by a camera mounted on a selfie stick. We exploit the special geometric structure of camera motion constrained by a selfie stick and define this motion as spherical joint motion. Utilizing a novel parametrization and calibration scheme, we demonstrate that the pose estimation problem can be reduced to a 3-degree-of-freedom (DoF) search problem, instead of a generic 6-DoF problem. This facilitates the derivation of an efficient branch-and-bound optimization method that guarantees a globally optimal solution, even in the presence of outliers. Furthermore, as a simplified case of spherical joint motion, we introduce selfie motion, which has fewer DoF than spherical joint motion. We validate the performance and guaranteed optimality of our method on both synthetic and real-world data. Additionally, we demonstrate the applicability of the proposed method for two applications: refocusing and stylization.

4.
IEEE Trans Pattern Anal Mach Intell ; 42(10): 2656-2669, 2020 Oct.
Article in English | MEDLINE | ID: mdl-30969915

ABSTRACT

In this work, we describe man-made structures via an appropriate structure assumption, called the Atlanta world assumption, which contains a vertical direction (typically the gravity direction) and a set of horizontal directions orthogonal to the vertical direction. Contrary to the commonly used Manhattan world assumption, the horizontal directions in Atlanta world are not necessarily orthogonal to each other. While Atlanta world can encompass a wider range of scenes, this makes the search space much larger and the problem more challenging. Our input data is a set of surface normals, for example, acquired from RGB-D cameras or 3D laser scanners, as well as lines from calibrated images. Given this input data, we propose the first globally optimal method of inlier set maximization for Atlanta direction estimation. We define a novel search space for Atlanta world, as well as its parametrization, and solve this challenging problem using a branch-and-bound (BnB) framework. To alleviate the computational bottleneck in BnB, i.e., the bound computation, we present two bound computation strategies: rectangular bound and slice bound in an efficient measurement domain, i.e., the extended Gaussian image (EGI). In addition, we propose an efficient two-stage method which automatically estimates the number of horizontal directions of a scene. Experimental results with synthetic and real-world datasets have successfully confirmed the validity of our approach.
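
The objective behind "inlier set maximization" can be sketched in Python as a plain inlier count for one candidate Atlanta frame. The branch-and-bound search itself is omitted, and several things here are assumptions for illustration: the vertical direction is fixed to +z, horizontal directions are parametrized by azimuth angles, and the 5-degree tolerance and function names are hypothetical.

```python
import math

def atlanta_directions(azimuths_deg):
    """Build a candidate Atlanta frame.

    Assumption: the vertical direction is +z, and each horizontal direction
    lies in the xy-plane at the given azimuth. Horizontal directions need
    not be orthogonal to one another.
    """
    vertical = (0.0, 0.0, 1.0)
    horizontals = [
        (math.cos(math.radians(a)), math.sin(math.radians(a)), 0.0)
        for a in azimuths_deg
    ]
    return [vertical] + horizontals

def count_inliers(normals, directions, tol_deg=5.0):
    """Count normals within tol_deg of any direction, ignoring sign."""
    cos_tol = math.cos(math.radians(tol_deg))
    count = 0
    for n in normals:
        norm = math.sqrt(sum(c * c for c in n))
        if norm == 0.0:
            continue  # skip degenerate normals
        if any(abs(sum(a * b for a, b in zip(n, d))) / norm >= cos_tol
               for d in directions):
            count += 1
    return count

normals = [
    (0.0, 0.0, 1.0),                                                  # floor
    (1.0, 0.0, 0.0),                                                  # wall at 0 degrees
    (math.cos(math.radians(40)), math.sin(math.radians(40)), 0.0),    # wall at 40 degrees
    (1.0, 1.0, 1.0),                                                  # clutter
]
print(count_inliers(normals, atlanta_directions([0, 40])))  # Atlanta candidate
print(count_inliers(normals, atlanta_directions([0, 90])))  # Manhattan-like candidate
```

The wall at 40 degrees is an inlier only for the candidate whose horizontal directions are not mutually orthogonal, which is the case the Atlanta assumption is meant to cover.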

5.
IEEE Trans Pattern Anal Mach Intell ; 41(3): 682-696, 2019 Mar.
Article in English | MEDLINE | ID: mdl-29993475

ABSTRACT

Most man-made environments, such as urban and indoor scenes, consist of a set of parallel and orthogonal planar structures. These structures are approximated by the Manhattan world assumption, a notion that can be represented as a Manhattan frame (MF). Given a set of inputs such as surface normals or vanishing points, we pose an MF estimation problem as a consensus set maximization that maximizes the number of inliers over the rotation search space. Conventionally, this problem can be solved by a branch-and-bound framework, which mathematically guarantees global optimality. However, the computational time of the conventional branch-and-bound algorithms is rather far from real-time. In this paper, we propose a novel bound computation method on an efficient measurement domain for MF estimation, i.e., the extended Gaussian image (EGI). By relaxing the original problem, we can compute the bound with a constant complexity, while preserving global optimality. Furthermore, we quantitatively and qualitatively demonstrate the performance of the proposed method on various synthetic and real-world data. We also show the versatility of our approach through three different applications: extension to multiple MF estimation, 3D rotation based video stabilization, and vanishing point estimation (line clustering).
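
An extended Gaussian image is, in essence, a histogram of surface normals over the unit sphere. The Python sketch below builds one on a polar-angle by azimuth grid. The grid resolution, the binning scheme, and the function name are assumptions for illustration; the abstract does not specify the discretization the paper uses, and the bound computation is not shown.

```python
import math

def build_egi(normals, n_polar=18, n_azim=36):
    """Accumulate normals into a polar-angle x azimuth histogram.

    Assumption: a uniform grid in spherical angles (polar angle measured
    from +z, azimuth measured from +x).
    """
    egi = [[0] * n_azim for _ in range(n_polar)]
    for nx, ny, nz in normals:
        norm = math.sqrt(nx * nx + ny * ny + nz * nz)
        if norm == 0.0:
            continue  # skip degenerate normals
        nx, ny, nz = nx / norm, ny / norm, nz / norm
        polar = math.acos(max(-1.0, min(1.0, nz)))      # in [0, pi]
        azim = math.atan2(ny, nx) % (2.0 * math.pi)     # in [0, 2*pi)
        # Clamp so the upper boundary falls into the last bin.
        i = min(int(polar / math.pi * n_polar), n_polar - 1)
        j = min(int(azim / (2.0 * math.pi) * n_azim), n_azim - 1)
        egi[i][j] += 1
    return egi

normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]
egi = build_egi(normals)
print(egi[0][0], egi[9][0])
```

Once the normals are binned, the number of measurements near a candidate direction can be read from the histogram cells rather than by iterating over every normal, which is the kind of saving a histogram domain offers.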

6.
IEEE Trans Pattern Anal Mach Intell ; 41(4): 775-787, 2019 Apr.
Article in English | MEDLINE | ID: mdl-29993773

ABSTRACT

Structure from small motion has become an important topic in 3D computer vision as a method for estimating depth, since capturing the input is so user-friendly. However, major limitations exist with respect to the form of depth uncertainty, due to the narrow baseline and the rolling shutter effect. In this paper, we present a dense 3D reconstruction method from small motion clips using commercial hand-held cameras, which typically cause the undesired rolling shutter artifact. To address these problems, we introduce a novel small motion bundle adjustment that effectively compensates for the rolling shutter effect. Moreover, we propose a pipeline for a fine-scale dense 3D reconstruction that models the rolling shutter effect by utilizing both sparse 3D points and the camera trajectory from narrow-baseline images. In this reconstruction, the sparse 3D points are propagated to obtain an initial depth hypothesis using a geometry guidance term. Then, the depth information on each pixel is obtained by sweeping the plane around each depth search space near the hypothesis. The proposed framework shows accurate dense reconstruction results suitable for various sought-after applications. Both qualitative and quantitative evaluations show that our method consistently generates better depth maps compared to state-of-the-art methods.

7.
IEEE Trans Vis Comput Graph ; 22(11): 2395-2404, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27479969

ABSTRACT

One of the most hazardous driving scenarios is overtaking a slower vehicle: the front vehicle (being overtaken) can occlude an important part of the field of view of the rear vehicle's driver. This lack of visibility is the most probable cause of accidents in this context. Recent research tends to show that augmented reality applied to assisted driving can significantly reduce the risk of accidents. In this paper, we present a real-time marker-less system to see through cars. For this purpose, two cars are equipped with cameras and an appropriate wireless communication system. The stereo vision system mounted on the front car makes it possible to create a sparse 3D map of the environment in which the rear car can be localized. Using this inter-car pose estimation, a synthetic image is generated to overcome the occlusion and to create a seamless see-through effect which preserves the structure of the scene.
