Results 1 - 5 of 5
1.
IEEE Trans Pattern Anal Mach Intell ; 41(3): 768-774, 2019 Mar.
Article in English | MEDLINE | ID: mdl-29994452

ABSTRACT

This paper proposes a Hybrid Approximate Representation (HAR) based on unifying several efficient approximations of the generalized reprojection error (known as the gold standard for multiview geometry). The HAR is an over-parameterization scheme in which the approximation is applied simultaneously in multiple parameter spaces. A joint minimization scheme, "HAR-Descent", can then solve the PnP problem efficiently, while remaining robust to approximation errors and local minima. The technique is evaluated extensively, including numerous synthetic benchmark protocols and the real-world data evaluations used in previous works. The proposed technique was found to have runtime complexity comparable to the fastest O(n) techniques, and to be up to 10 times faster than current state-of-the-art minimization approaches. In addition, its accuracy exceeds that of all 9 previous techniques tested, providing definitive state-of-the-art performance on the benchmarks, across all 90 of the experiments in the paper and supplementary material, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPAMI.2018.2806446.
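The generalized reprojection error that HAR approximates is the standard cost minimized by PnP solvers: project each 3D point through the candidate camera pose and sum the squared pixel residuals. A minimal sketch of that cost (not the paper's HAR approximations; the function and parameter names are illustrative):

```python
def project(point3d, R, t, fx, fy, cx, cy):
    """Project a 3D point into the image given rotation R (3x3 list of
    lists), translation t, and pinhole intrinsics (fx, fy, cx, cy)."""
    X = [sum(R[i][j] * point3d[j] for j in range(3)) + t[i] for i in range(3)]
    return (fx * X[0] / X[2] + cx, fy * X[1] / X[2] + cy)

def reprojection_error(points3d, points2d, R, t, fx, fy, cx, cy):
    """Sum of squared pixel residuals -- the cost a PnP solver minimizes
    over the pose (R, t) given 3D-2D correspondences."""
    err = 0.0
    for P, (u, v) in zip(points3d, points2d):
        pu, pv = project(P, R, t, fx, fy, cx, cy)
        err += (pu - u) ** 2 + (pv - v) ** 2
    return err
```

With exact correspondences the error is zero at the true pose; PnP methods search for the (R, t) minimizing this cost over noisy observations.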

2.
IEEE Trans Image Process ; 26(9): 4378-4388, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28362608

ABSTRACT

Significant effort has been devoted within the visual tracking community to rapid learning of object properties on the fly. However, state-of-the-art approaches still often fail in cases such as rapid out-of-plane rotation, when the appearance changes suddenly. One of the major contributions of this paper is a radical rethinking of the traditional wisdom of modeling 3D motion as appearance changes during tracking. Instead, 3D motion is modeled as 3D motion. This intuitive but previously unexplored approach provides new possibilities in visual tracking research. First, 3D tracking is more general, as large out-of-plane motion is often fatal for 2D trackers, but helps 3D trackers to build better models. Second, the tracker's internal model of the object can be used in many different applications, and it could even become the main motivation, with tracking supporting reconstruction rather than vice versa. This effectively bridges the gap between visual tracking and structure from motion. A new benchmark dataset of sequences with extreme out-of-plane rotation is presented, and an online leaderboard is offered to stimulate new research in the relatively underdeveloped area of 3D tracking. The proposed method, provided as a baseline, is capable of successfully tracking these sequences, all of which pose a considerable challenge to 2D trackers (error reduced by 46%).

3.
Int J Comput Vis ; 121(1): 95-110, 2017.
Article in English | MEDLINE | ID: mdl-32355409

ABSTRACT

Action recognition "in the wild" is extremely challenging, particularly when complex 3D actions are projected down to the image plane, losing a great deal of information. The recent growth of 3D data in broadcast content and commercial depth sensors makes it possible to overcome this. However, there is little work examining the best way to exploit this new modality. In this paper, we introduce the Hollywood 3D benchmark, which is the first dataset containing "in the wild" action footage including 3D data. This dataset consists of 650 stereo video clips across 14 action classes, taken from Hollywood movies. We provide stereo calibrations and depth reconstructions for each clip. We also provide an action recognition pipeline, and propose a number of specialised depth-aware techniques, including five interest point detectors and three feature descriptors. Extensive tests allow evaluation of different appearance and depth encoding schemes. Our novel techniques exploiting this depth allow us to reach performance levels more than triple those of the best baseline algorithm using only appearance information. The benchmark data, code and calibrations are all made available to the community.
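The core idea behind a depth-aware interest point detector is that saliency can be measured in the depth channel as well as in appearance. A toy stand-in for that idea (the paper's actual five detectors are not reproduced here; this simply keeps pixels with strong local depth gradients, such as object boundaries):

```python
def depth_salient_points(depth, thresh):
    """Return (x, y) pixels whose central-difference depth gradient
    magnitude exceeds `thresh` -- a toy depth-saliency detector.

    `depth` is a 2D grid (list of rows) of depth values. Real
    depth-aware detectors would combine this cue with appearance.
    """
    h, w = len(depth), len(depth[0])
    points = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = depth[y][x + 1] - depth[y][x - 1]
            gy = depth[y + 1][x] - depth[y - 1][x]
            if (gx * gx + gy * gy) ** 0.5 > thresh:
                points.append((x, y))
    return points
```

Applied to a depth map with a step edge (foreground against background), only pixels near the depth discontinuity fire, which is where depth adds information that appearance alone may miss.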

4.
IEEE Trans Image Process ; 25(1): 359-71, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26552087

ABSTRACT

Long-term tracking of an object, given only a single instance in an initial frame, remains an open problem. We propose a visual tracking algorithm, robust to many of the difficulties that often occur in real-world scenes. Correspondences of edge-based features are used, to overcome the reliance on the texture of the tracked object and improve invariance to lighting. Furthermore, we address long-term stability, enabling the tracker to recover from drift and to provide redetection following object disappearance or occlusion. The two-module principle is similar to the successful state-of-the-art long-term TLD tracker; however, our approach offers better performance in benchmarks and extends to cases of low-textured objects. This becomes obvious in cases of plain objects with no texture at all, where the edge-based approach proves the most beneficial. We perform several different experiments to validate the proposed method. First, results on short-term sequences show the performance of tracking challenging (low-textured and/or transparent) objects that represent failure cases for competing state-of-the-art approaches. Second, long sequences are tracked, including one of almost 30,000 frames, which, to the best of our knowledge, is the longest tracking sequence reported to date. This tests the redetection and drift resistance properties of the tracker. Finally, we report the results of the proposed tracker on the VOT Challenge 2013 and 2014 data sets as well as on the VTB1.0 benchmark, and we show the tracker's performance relative to its competitors. All the results are comparable with the state of the art on sequences with textured objects and superior on non-textured objects. The new annotated sequences are made publicly available.
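The benefit of edge-based correspondences is that they depend on object geometry rather than texture. A minimal sketch of the correspondence step, assuming edge points have already been extracted in both frames (greedy nearest-neighbour matching; the paper's actual matcher and feature descriptors are not reproduced here):

```python
def match_edge_points(prev_pts, curr_pts, max_dist):
    """Greedily match each edge point in the previous frame to its
    nearest unused edge point in the current frame, rejecting matches
    farther than `max_dist` pixels. Returns (prev_index, curr_index)
    pairs. A toy stand-in for edge-based feature correspondence, which
    stays usable on untextured or transparent objects where patch
    matching fails.
    """
    matches, used = [], set()
    for i, p in enumerate(prev_pts):
        best_j, best_d = None, max_dist
        for j, q in enumerate(curr_pts):
            if j in used:
                continue
            d = ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches.append((i, best_j))
            used.add(best_j)
    return matches
```

The resulting correspondences can feed a robust motion estimate; the distance gate discards spurious matches, which matters for the drift resistance the abstract describes.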

5.
IEEE Trans Pattern Anal Mach Intell ; 36(3): 564-76, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24457511

ABSTRACT

In this paper, an algorithm is presented for estimating scene flow, which is a richer, 3D analog of optical flow. The approach operates orders of magnitude faster than alternative techniques and is well suited to further performance gains through parallelized implementation. The algorithm employs multiple hypotheses to deal with motion ambiguities, rather than the traditional smoothness constraints, removing oversmoothing errors and providing significant performance improvements on benchmark data, over the previous state of the art. The approach is flexible and capable of operating with any combination of appearance and/or depth sensors, in any setup, simultaneously estimating the structure and motion if necessary. Additionally, the algorithm propagates information over time to resolve ambiguities, rather than performing an isolated estimation at each frame, as in contemporary approaches. Approaches to smoothing the motion field without sacrificing the benefits of multiple hypotheses are explored, and a probabilistic approach to occlusion estimation is demonstrated, leading to 10 and 15 percent improved performance, respectively. Finally, a data-driven tracking approach is described, and used to estimate the 3D trajectories of hands during sign language, without the need to model complex appearance variations at each viewpoint.
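The multiple-hypothesis idea can be sketched as follows: each point keeps several candidate 3D motions, and the one that best explains the sensor data wins, with no smoothness prior coupling neighbouring points. A toy illustration (the cost function here is supplied by the caller; the paper's actual cost combines appearance and depth observations):

```python
def scene_flow_multi_hypothesis(points, candidate_motions, cost):
    """Assign each 3D point the candidate motion with the lowest data
    cost. `cost(p, m)` scores motion hypothesis m at point p against
    the observations. Because no smoothness term couples the points,
    motion boundaries are not oversmoothed -- the ambiguity-handling
    burden falls on the hypothesis set instead.
    """
    return [min(candidate_motions, key=lambda m: cost(p, m)) for p in points]
```

With two points moving differently, each point independently picks its own best hypothesis, which is exactly what a smoothness constraint would blur at the boundary between them.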
