Results 1 - 3 of 3
1.
Article in English | MEDLINE | ID: mdl-35857731

ABSTRACT

Convolutional neural networks (CNNs) have dominated vision-based deep neural network architectures for both image and video models over the past decade. However, convolution-free vision Transformers (ViTs) have recently outperformed CNN-based models in image recognition. Despite this progress, the design of video Transformers has not yet received the same research attention as image-based Transformers. While there have been attempts to build video Transformers by adapting image-based Transformers for video understanding, these Transformers still lack efficiency due to the large gap between CNN-based models and Transformers in the number of parameters and the training settings. In this work, we propose three techniques to improve video understanding with video Transformers. First, to derive a better spatiotemporal feature representation, we propose a new spatiotemporal attention scheme, termed synchronized spatiotemporal and spatial attention (SSTSA), which derives the spatiotemporal features with temporal and spatial multiheaded self-attention (MSA) modules. It also preserves the best spatial attention through another spatial self-attention module in parallel, resulting in an effective Transformer encoder. Second, a motion spotlighting module is proposed to embed the short-term motion of consecutive input frames into the regular RGB input, which is then processed with a single-stream video Transformer. Third, a simple intraclass frame interlacing method for the input clips is proposed that serves as an effective video augmentation method. Finally, the proposed techniques are evaluated and validated in an extensive set of experiments. Our video Transformer outperforms its previous counterparts on two well-known datasets, Kinetics400 and Something-Something-v2.
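The abstract does not spell out how the intraclass frame interlacing works, so the following is only a minimal sketch of one plausible reading: frames from two clips that share the same class label are alternated to form a new training clip. The function name `interlace_clips` and the even/odd alternation rule are assumptions made for illustration, not the authors' published method.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def interlace_clips(clip_a: Sequence[Frame], clip_b: Sequence[Frame]) -> List[Frame]:
    """Alternate frames from two equal-length clips of the SAME class.

    Assumption (not from the abstract): even time steps are taken from
    clip_a and odd time steps from clip_b, so the clip length is unchanged.
    """
    if len(clip_a) != len(clip_b):
        raise ValueError("clips must have the same number of frames")
    return [clip_a[t] if t % 2 == 0 else clip_b[t] for t in range(len(clip_a))]


# Frames are stand-in strings here; in practice they would be image arrays.
mixed = interlace_clips(["a0", "a1", "a2", "a3"], ["b0", "b1", "b2", "b3"])
print(mixed)
```

Because both source clips carry the same label, the interlaced clip can reuse that label unchanged, which is what would make such a scheme usable as an augmentation.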

2.
IEEE Trans Image Process ; 17(11): 2053-62, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18854246

ABSTRACT

Recently, √5-refinement hierarchical sampling has been studied, and √5-refinement has been used for surface subdivision. Compared with other refinements, such as the dyadic or quincunx refinement, √5-refinement has the special property that the nodes in a refined lattice form groups of five nodes, with these five nodes having different x and y coordinates. This property has been shown to be very useful for adaptively representing and rendering complex and procedural geometry. When √5-refinement is used for multiresolution data processing, √5-refinement filter banks and wavelets are required. While the construction of 2-D nonseparable (bi)orthogonal wavelets with the dyadic or quincunx refinement has been studied by many researchers, the construction of (bi)orthogonal wavelets with √5-refinement has not been investigated. The main goal of this paper is to construct compactly supported orthogonal and biorthogonal wavelets with √5-refinement. We obtain block structures of orthogonal and biorthogonal √5-refinement FIR filter banks with 4-fold rotational symmetry, and we construct compactly supported orthogonal and biorthogonal wavelets based on these block structures.
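To make the refinement factor concrete, here is a small sketch that assumes the dilation matrix M = [[2, -1], [1, 2]], which is one common choice for √5-refinement (the abstract does not name the matrix). Its determinant is 5 and it acts as a rotation scaled by √5, so the sublattice M·Z² keeps one node in five and the integer lattice splits into five cosets. The helper names are illustrative only.

```python
from collections import Counter


def dilate(i: int, j: int) -> tuple:
    """Apply the assumed dilation matrix M = [[2, -1], [1, 2]] to (i, j)."""
    return (2 * i - j, i + 2 * j)


def coset_index(x: int, y: int) -> int:
    """Coset of (x, y) modulo the sublattice M Z^2.

    (x, y) lies in M Z^2 exactly when x - 2y is divisible by 5,
    so (x - 2y) mod 5 labels the five cosets 0..4.
    """
    return (x - 2 * y) % 5


# Every dilated node falls in coset 0, i.e. on the coarse sublattice.
assert all(coset_index(*dilate(i, j)) == 0
           for i in range(-3, 4) for j in range(-3, 4))

# A 5x5 window of the integer lattice is shared equally among the cosets.
window = [(x, y) for x in range(5) for y in range(5)]
counts = Counter(coset_index(x, y) for x, y in window)
print(sorted(counts.items()))

# The origin and its four axis neighbours represent all five cosets.
representatives = [(0, 0), (1, 0), (0, -1), (0, 1), (-1, 0)]
print([coset_index(x, y) for x, y in representatives])
```

A filter bank built on this refinement therefore has five channels (one lowpass and four highpass), in the same way that dyadic refinement in 2-D gives four and quincunx refinement gives two.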


Subjects
Algorithms; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Reproducibility of Results; Sensitivity and Specificity
3.
IEEE Trans Image Process ; 17(9): 1512-21, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18701391

ABSTRACT

Images are conventionally sampled on a rectangular lattice, so traditional image processing is carried out on the rectangular lattice. The hexagonal lattice was proposed more than four decades ago as an alternative for sampling. Compared with the rectangular lattice, the hexagonal lattice has several advantages: it needs fewer sampling points, it has more consistent connectivity and higher symmetry, and the hexagonal structure is also pertinent to the vision process. In this paper, we investigate the construction of symmetric FIR hexagonal filter banks for multiresolution hexagonal image processing. We obtain block structures of FIR hexagonal filter banks with 3-fold rotational symmetry and 3-fold axial symmetry. These block structures yield families of orthogonal and biorthogonal FIR hexagonal filter banks with 3-fold rotational symmetry and 3-fold axial symmetry. We also discuss the construction of orthogonal and biorthogonal FIR filter banks with scaling functions and wavelets having optimal smoothness. In addition, we present a few such orthogonal and biorthogonal FIR filter banks.
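The connectivity claim can be illustrated with a short sketch. It assumes unit-spacing lattices generated by the basis vectors below (these bases and the helper names are illustrative choices, not taken from the paper) and counts how many neighbours of a node sit at the minimum distance.

```python
import math

# Assumed unit-spacing bases: square lattice and hexagonal lattice.
SQUARE = ((1.0, 0.0), (0.0, 1.0))
HEXAGONAL = ((1.0, 0.0), (0.5, math.sqrt(3) / 2))


def neighbour_distances(basis, reach=1):
    """Sorted distances from the origin to nearby lattice nodes.

    Nodes are i*a + j*b for integer i, j in [-reach, reach], origin excluded.
    Distances are rounded so that equal distances compare equal.
    """
    (ax, ay), (bx, by) = basis
    dists = []
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            if (i, j) != (0, 0):
                dists.append(round(math.hypot(i * ax + j * bx, i * ay + j * by), 6))
    return sorted(dists)


def nearest_neighbour_count(basis):
    """Number of lattice nodes at the minimum distance from the origin."""
    dists = neighbour_distances(basis)
    return dists.count(dists[0])


print(nearest_neighbour_count(SQUARE), nearest_neighbour_count(HEXAGONAL))
```

On the square lattice a node has four neighbours at distance 1 and four diagonal ones at distance √2, so 4- and 8-connectivity disagree; on the hexagonal lattice all six nearest neighbours are equidistant, which is the sense in which its connectivity is more consistent.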


Subjects
Algorithms; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Signal Processing, Computer-Assisted; Fourier Analysis; Numerical Analysis, Computer-Assisted; Reproducibility of Results; Sensitivity and Specificity