1.
Article in English | MEDLINE | ID: mdl-29994635

ABSTRACT

In recent years, correlation filters have shown dominant and spectacular results for visual object tracking. The types of features employed in this family of trackers significantly affect tracking performance. The ultimate goal is to use robust features that are invariant to any kind of appearance change of the object, while predicting the object location as accurately as when no appearance change occurs. With the emergence of deep learning based methods, the study of learning features for specific tasks has accelerated. For instance, discriminative visual tracking methods based on deep architectures have been studied with promising performance. Nevertheless, correlation filter based (CFB) trackers confine themselves to pre-trained networks that were trained for the object classification problem. To this end, this manuscript formulates the problem of learning deep fully convolutional features for CFB visual tracking. To learn the proposed model, a novel and efficient backpropagation algorithm is presented based on the loss function of the network. The proposed learning framework allows the network model to be custom designed. Moreover, it alleviates the dependency on networks trained for classification. Extensive performance analysis shows the efficacy of the proposed custom design in the CFB tracking framework. By fine-tuning the convolutional parts of a state-of-the-art network and integrating this model into a CFB tracker, which is the top performing one of VOT2016, an 18% increase is achieved in terms of expected average overlap, and tracking failures are decreased by 25%, while maintaining superiority over the state-of-the-art methods on the OTB-2013 and OTB-2015 tracking datasets.
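
For readers unfamiliar with correlation filter tracking, the following is a minimal sketch of the classical single-channel, closed-form filter (often called MOSSE-style) on a 1-D signal. It is not the deep feature learning scheme described in the abstract above; it only illustrates the basic idea that a filter trained on a template produces a response peak at the target's displacement. The function names, the Gaussian target response, and the regularization value `lam` are assumptions chosen for illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stdlib only, for clarity)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def train_filter(template, lam=1e-3):
    """Return the conjugate filter H* = G * conj(F) / (F * conj(F) + lam)."""
    n = len(template)
    # Desired response: a circular Gaussian peaked at index 0 (assumption).
    g = [math.exp(-0.5 * min(t, n - t) ** 2) for t in range(n)]
    f_hat = dft(template)
    g_hat = dft(g)
    return [g_hat[k] * f_hat[k].conjugate()
            / (f_hat[k] * f_hat[k].conjugate() + lam)
            for k in range(n)]


def estimate_shift(template, search):
    """Correlate the trained filter with a new signal; return the peak index."""
    h_conj = train_filter(template)
    z_hat = dft(search)
    response = dft([z_hat[k] * h_conj[k] for k in range(len(search))],
                   inverse=True)
    real = [v.real for v in response]
    return real.index(max(real))


template = [0, 0, 1, 3, 1, 0, 0, 0]
search = [0, 0, 0, 0, 1, 3, 1, 0]  # the template, circularly shifted by 2
print(estimate_shift(template, search))
```

In a real tracker the same computation runs on 2-D multi-channel feature maps with a fast Fourier transform; the abstract's contribution concerns how those features are learned, which this sketch does not attempt to show.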

2.
IEEE Trans Image Process ; 26(11): 5270-5283, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28767369

ABSTRACT

Correlation filters have been successfully used in visual tracking due to their modeling power and computational efficiency. However, the state-of-the-art correlation filter-based (CFB) tracking algorithms tend to quickly discard the previous poses of the target, since they consider only a single filter in their models. In contrast, our approach is to register multiple CFB trackers for previous poses and exploit the registered knowledge when an appearance change occurs. To this end, we propose a novel tracking algorithm [of complexity O(D)] based on a large ensemble of CFB trackers. The ensemble [of size O(2^D)] is organized over a binary tree (depth D), and learns the target appearance subspaces such that each constituent tracker becomes an expert in a certain appearance. During tracking, the proposed algorithm combines only the appearance-aware relevant experts to produce boosted tracking decisions. Additionally, we propose a versatile spatial windowing technique to enhance the individual expert trackers. For this purpose, spatial windows are learned for target objects as well as the correlation filters, and the windowed regions are then processed for more robust correlations. In our extensive experiments on benchmark datasets, we achieve a substantial performance increase by using the proposed tracking algorithm together with the spatial windowing.
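
The gap between the ensemble size and the per-frame cost quoted in the abstract above can be illustrated with a small sketch. It assumes a complete binary tree stored with heap-style indices and a hypothetical routing rule `choose_right`; the abstract does not specify how experts are selected, so the rule here is a placeholder, and the sketch only shows why visiting one root-to-leaf path touches a number of nodes linear in the depth while the tree holds exponentially many.

```python
def tree_size(depth):
    """Number of nodes in a complete binary tree with levels 0..depth."""
    return 2 ** (depth + 1) - 1


def active_path(depth, choose_right):
    """Return heap-style indices of the nodes on one root-to-leaf path.

    choose_right(level, index) is a hypothetical routing rule that decides
    whether to descend to the right child; it stands in for whatever
    appearance-based test a real tracker would use.
    """
    index = 0
    path = [index]
    for level in range(depth):
        index = 2 * index + (2 if choose_right(level, index) else 1)
        path.append(index)
    return path


depth = 10
path = active_path(depth, lambda level, index: level % 2 == 0)
print(tree_size(depth), len(path))
```

With depth 10 the tree holds 2047 nodes, yet a single descent evaluates only 11 of them.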

3.
IEEE Trans Image Process ; 23(12): 5222-32, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25265630

ABSTRACT

The depth modality of the multiview video plus depth (MVD) format is an active research area, whose main objective is to develop efficient compression methods that are friendly to depth image based rendering. As a part of this research, a novel 3D planar-based depth representation is proposed. The planar approximation of multiple depth images is formulated as an energy-based co-segmentation problem by a Markov random field model. The energy terms of this problem are designed to mimic the rate-distortion tradeoff for a depth compression application. A novel algorithm is developed for practical utilization of the proposed planar approximations in stereo depth compression. The co-segmented regions are also represented as layered planar structures forming a novel single-reference MVD format. The ability of the proposed layered planar MVD representation to decouple the texture and geometric distortions makes it a promising approach. The proposed 3D planar depth compression approaches are compared against the state-of-the-art image/video coding standards by objective and visual evaluation and yield competitive performance.
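
As a rough illustration of what a planar approximation of depth means, the sketch below fits a single plane z = a*x + b*y + c to the depth samples of one region by ordinary least squares. This is not the Markov random field co-segmentation described in the abstract above; it assumes the region is already given and only shows how many depth samples can collapse into three plane parameters. The function names and the synthetic 4x4 block are assumptions for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples."""
    sxx = sxy = sx = syy = sy = sxz = syz = sz = 0.0
    for x, y, z in points:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        sxz += x * z
        syz += y * z
        sz += z
    # Normal equations M [a, b, c]^T = v, solved with Cramer's rule.
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    v = [sxz, syz, sz]
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("samples are degenerate (for example, collinear)")
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)


# Synthetic 4x4 depth block lying exactly on a plane (illustrative data).
block = [(x, y, 0.5 * x - 0.25 * y + 10.0) for x in range(4) for y in range(4)]
a, b, c = fit_plane(block)
print(a, b, c)
```

Here 16 depth samples are summarized by three parameters; a rate-distortion formulation such as the one in the abstract would additionally weigh the fitting error against the cost of signalling each region.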

4.
IEEE Trans Image Process ; 19(7): 1785-97, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20215072

ABSTRACT

With the advances in image based rendering (IBR) in recent years, generation of a realistic arbitrary view of a scene from a number of original views has become cheaper and faster. One of the main applications of this progress is free-view TV (FTV), where TV viewers freely select the viewing position and angle via IBR on the transmitted multiview video. Since a viewer might record a personal video of this arbitrarily selected view and misuse the content, copyright and copy protection problems also exist and should be solved for FTV. In this paper, we focus on this newly emerged problem by proposing a watermarking method for free-view video. The watermark is embedded into every frame of multiple views by exploiting the spatial masking properties of the human visual system. Assuming that the position and rotation of the virtual camera are known, the proposed method extracts the watermark successfully from an arbitrarily generated virtual image. In order to extend the method to the case of an unknown virtual camera position and rotation, the transformations on the watermark pattern due to image based rendering operations are analyzed. Based upon this analysis, camera position and homography estimation methods are proposed for the virtual camera. The encouraging simulation results promise not only a novel method, but also a new direction for watermarking research.

5.
IEEE Trans Image Process ; 18(3): 483-94, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19171517

ABSTRACT

In any practical application of 2-D-to-3-D conversion that involves storage and transmission, representation efficiency has an indisputable importance that is not reflected in the attention the topic has received. In order to address this problem, a novel algorithm, which yields efficient 3-D representations in the rate-distortion sense, is proposed. The algorithm utilizes two views of a scene to build a mesh-based representation incrementally, by adding new vertices, while minimizing a distortion measure. The experimental results indicate that, in scenes that can be approximated by planes, the proposed algorithm is superior to the dense depth map and, in some practical situations, to the block motion vector-based representations in the rate-distortion sense.


Subjects
Algorithms; Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; Signal Processing, Computer-Assisted; Artifacts; Reproducibility of Results; Sensitivity and Specificity