Results 1 - 5 of 5
1.
IEEE Trans Neural Netw Learn Syst ; 34(11): 8802-8814, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35254996

ABSTRACT

Magnetic resonance (MR) imaging plays an important role in clinical practice and brain exploration. However, limited by factors such as imaging hardware, scanning time, and cost, it is challenging to acquire high-resolution MR images clinically. In this article, fine perceptive generative adversarial networks (FP-GANs) are proposed to produce super-resolution (SR) MR images from their low-resolution counterparts. Adopting a divide-and-conquer scheme, FP-GANs handle the low-frequency (LF) and high-frequency (HF) components of MR images separately and in parallel. Specifically, FP-GANs first decompose an MR image into an LF global-approximation subband and HF anatomical-texture subbands in the wavelet domain. Each subband generative adversarial network (GAN) then concentrates on super-resolving its corresponding subband image. In the generator, multiple residual-in-residual dense blocks are introduced for better feature extraction, and a texture-enhancing module is designed to balance global topology against detailed textures. Finally, the whole image is reconstructed by integrating the inverse discrete wavelet transform into FP-GANs. Comprehensive experiments on the MultiRes_7T and ADNI datasets demonstrate that the proposed model achieves finer structure recovery and outperforms competing methods both quantitatively and qualitatively. Moreover, FP-GANs show further value when the SR results are applied to classification tasks.
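
For orientation, the data flow described in the abstract (wavelet decomposition, one generator per subband, inverse DWT to reassemble the image) can be sketched roughly as follows. This is a minimal, untrained illustration using PyWavelets and PyTorch; SubbandGenerator is a hypothetical stand-in for the paper's residual-in-residual dense blocks, and the adversarial training, texture-enhancing module, and actual upsampling are omitted.

import numpy as np
import pywt
import torch
import torch.nn as nn

class SubbandGenerator(nn.Module):
    # Toy stand-in for one per-subband generator; the paper uses
    # residual-in-residual dense blocks, a plain residual conv stack is used here.
    def __init__(self, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)   # residual refinement of the subband

def subband_sr(lr_image, generators):
    # Decompose -> refine each wavelet subband independently -> inverse DWT.
    ll, (lh, hl, hh) = pywt.dwt2(lr_image, "haar")
    out = {}
    for name, band in {"ll": ll, "lh": lh, "hl": hl, "hh": hh}.items():
        t = torch.from_numpy(band).float()[None, None]          # (1, 1, H, W)
        out[name] = generators[name](t)[0, 0].detach().numpy()
    return pywt.idwt2((out["ll"], (out["lh"], out["hl"], out["hh"])), "haar")

gens = {k: SubbandGenerator() for k in ("ll", "lh", "hl", "hh")}
lr = np.random.rand(64, 64).astype(np.float32)    # fake low-resolution slice
sr = subband_sr(lr, gens)
print(sr.shape)                                    # (64, 64)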

2.
Article in English | MEDLINE | ID: mdl-35857731

ABSTRACT

Convolutional neural networks (CNNs) have dominated vision-oriented deep network architectures for both image and video models over the past decade. Recently, however, convolution-free vision Transformers (ViTs) have outperformed CNN-based models in image recognition. Despite this progress, the design of video Transformers has not yet received the same research attention as image-based Transformers. Attempts have been made to adapt image-based Transformers for video understanding, but the resulting models still lack efficiency because of the large gap between CNN-based models and Transformers in the number of parameters and the training settings. In this work, we propose three techniques to improve video understanding with video Transformers. First, to derive better spatiotemporal feature representations, we propose a new attention scheme, termed synchronized spatiotemporal and spatial attention (SSTSA), which derives spatiotemporal features with temporal and spatial multiheaded self-attention (MSA) modules while preserving pure spatial attention through an additional spatial self-attention module running in parallel, yielding an effective Transformer encoder. Second, a motion spotlighting module is proposed to embed the short-term motion of consecutive input frames into the regular RGB input, which is then processed by a single-stream video Transformer. Third, a simple intraclass frame-interlacing method for the input clips is proposed that serves as an effective video augmentation. Finally, the proposed techniques are evaluated and validated with extensive experiments; our video Transformer outperforms its previous counterparts on two well-known datasets, Kinetics400 and Something-Something-v2.
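
As a rough illustration of the parallel spatial/temporal attention idea (not the authors' exact SSTSA module; the class name, fusion layer, and dimensions below are invented for the sketch), one block over a (batch, time, patches, dim) token grid might look like this in PyTorch:

import torch
import torch.nn as nn

class ParallelSpatioTemporalAttention(nn.Module):
    # Illustrative block: spatial MSA and temporal MSA computed in parallel
    # over a (batch, time, patches, dim) token grid, then fused.
    def __init__(self, dim=192, heads=3):
        super().__init__()
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, x):                                # x: (B, T, N, D)
        b, t, n, d = x.shape
        xs = x.reshape(b * t, n, d)                      # attend over space, per frame
        s, _ = self.spatial_attn(xs, xs, xs)
        xt = x.permute(0, 2, 1, 3).reshape(b * n, t, d)  # attend over time, per patch
        m, _ = self.temporal_attn(xt, xt, xt)
        s = s.reshape(b, t, n, d)
        m = m.reshape(b, n, t, d).permute(0, 2, 1, 3)
        return x + self.fuse(torch.cat([s, m], dim=-1))  # residual fusion

tokens = torch.randn(2, 8, 49, 192)   # 8 frames, 7x7 patches, embedding dim 192
out = ParallelSpatioTemporalAttention()(tokens)
print(out.shape)                      # torch.Size([2, 8, 49, 192])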

3.
IEEE Trans Neural Netw Learn Syst ; 33(1): 229-243, 2022 Jan.
Article in English | MEDLINE | ID: mdl-33064653

ABSTRACT

The great success of deep learning poses urgent challenges for understanding its working mechanism and rationale. Depth, structure, and the massive size of the data are recognized as three key ingredients of deep learning. Most recent theoretical studies of deep learning focus on the necessity and advantages of the depth and structure of neural networks. In this article, we aim at a rigorous verification of the importance of massive data in accounting for the superior performance of deep learning. In particular, we prove that massive data are necessary for realizing spatial sparseness and that deep nets are crucial tools for making full use of massive data in such applications. These findings explain why deep learning has achieved great success in the era of big data even though deep nets and numerous network structures were proposed at least 20 years ago.

4.
Article in English | MEDLINE | ID: mdl-33566766

ABSTRACT

In the mathematical and engineering literature on signal processing and time-series analysis, there are two opposite points of view concerning the extraction of time-varying frequencies (commonly called instantaneous frequencies, IFs). One is to consider the given signal as a composite signal consisting of a finite number of oscillating subsignals, with the goal of decomposing the signal into the sum of the (unknown) subsignals and then extracting the IF of each subsignal; the other is first to extract from the given signal the IFs of the (unknown) subsignals and then recover from them the subsignals that constitute the given signal. We call the first the "signal decomposition approach" and the second the "signal resolution approach."

For the signal decomposition approach, rigorous mathematical theories on function decomposition are well developed in the mathematical literature. The most relevant one, "atomic decomposition," was initiated by R. Coifman, with various extensions by others, notably D. Donoho; its goal is to extract the signal building blocks, but without regard to which building blocks constitute which subsignal, so the subsignals and their IFs cannot be recovered. The most popular method within the decomposition approach, on the other hand, is empirical mode decomposition (EMD), proposed by N. Huang et al., with many variations by others. In contrast to atomic decomposition, all variations of EMD are ad hoc algorithms without a rigorous mathematical theory. Unfortunately, all existing versions of EMD fail to solve the inverse problem of recovering the subsignals that constitute the given composite signal, and consequently the extracted IFs are unsatisfactory; for example, EMD fails to extract even two IFs that are close to each other.

In contrast, the signal resolution approach has a very long history, dating back to the Prony method, introduced by G. de Prony in 1795 for solving the inverse problem of time-invariant linear systems. For nonstationary signals, the synchrosqueezed wavelet transform (SST), proposed by I. Daubechies over a decade ago, with various extensions and variations by others, was introduced to resolve the inverse problem by first extracting the IFs from some reference frequency and then recovering the subsignals. Unfortunately, the IFs approximated by SST cannot be separated when the target IFs are close to one another at certain time instants, and even when they can be separated, the approximation is usually not sufficiently accurate. For these reasons, some signal components cannot be recovered, and those that can be recovered are usually inexact. More recently, we introduced and developed a more direct method, called the signal separation operation (SSO), published in 2016, to accurately compute the IFs and accurately recover all signal components even when some of the target IFs are close to each other.

The main contributions of this article are twofold. First, the SSO method is extended from uniformly sampled data to arbitrarily sampled data. The method is localized, as illustrated by a number of numerical examples, including components with different arrival and departure times, and it also yields short-term prediction of the signal components along with their IFs. Second, we present a novel theory-inspired implementation of our method as a deep neural network (DNN).

We have proved that a major advantage of DNNs over shallow networks is that a DNN can exploit any inherent compositional structure in the target function, whereas shallow networks are necessarily blind to such structure. A DNN can therefore avoid the so-called curse of dimensionality through what we have called the blessing of compositionality. However, the compositional structure of the target function is not uniquely defined, and the constituent functions are typically unknown, so such networks still need to be trained end to end. In contrast, the DNN introduced in this article implements a mathematical procedure, so no training is required at all, and the compositional structure is evident from the procedure. We present the extension of the SSO method in Sections II and III and explain the construction of the deep network in Section IV.
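
To make the setting concrete, the sketch below builds a two-component signal x(t) = cos(2*pi*phi1(t)) + cos(2*pi*phi2(t)) whose IFs stay fairly close and estimates the IFs at one time instant with naive STFT peak picking. This is only a conceptual illustration of the multicomponent model and of why close IFs are hard to resolve; it is not an implementation of SSO or SST, and all signal parameters are made up.

import numpy as np
from scipy.signal import stft, find_peaks

# Two-component model whose instantaneous frequencies remain within ~10 Hz
# of each other over part of the record.
fs = 1000.0
t = np.arange(0, 4.0, 1.0 / fs)
if1 = 60 + 10 * t                                # IF of component 1: 60 -> 100 Hz
if2 = 90 + 5 * np.sin(2 * np.pi * 0.5 * t)       # IF of component 2: around 90 Hz
phi1 = np.cumsum(if1) / fs                       # phase (in cycles) = integral of IF
phi2 = np.cumsum(if2) / fs
x = np.cos(2 * np.pi * phi1) + np.cos(2 * np.pi * phi2)

# Naive IF estimate at one time instant: the two strongest spectral peaks of a
# single STFT frame. Exactly this kind of estimate degrades when the IFs approach
# each other, which is the regime the resolution methods above are designed for.
f, tau, Z = stft(x, fs=fs, nperseg=256)
frame = np.abs(Z[:, Z.shape[1] // 2])            # magnitude spectrum near t = 2 s
peaks, _ = find_peaks(frame, height=0.1 * frame.max())
top_two = peaks[np.argsort(frame[peaks])][-2:]
print(np.sort(f[top_two]))                       # roughly 80 Hz and 90 Hz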

5.
IEEE Trans Vis Comput Graph ; 15(1): 62-76, 2009.
Article in English | MEDLINE | ID: mdl-19008556

ABSTRACT

We present a non-photorealistic rendering technique that automatically delivers a stylized abstraction of a photograph. Our approach is based on shape/color filtering guided by a vector field that describes the flow of salient features in the image. This flow-based filtering significantly improves the abstraction performance in terms of feature enhancement and stylization. Our method is simple, fast, and easy to implement. Experimental results demonstrate the effectiveness of our method in producing stylistic and feature-enhancing illustrations from photographs.
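
A generic version of the abstraction pipeline this paper builds on (iterated edge-preserving smoothing plus difference-of-Gaussians line extraction) can be sketched with OpenCV as below. The flow field (edge tangent flow) and the flow-aligned filter kernels that are the paper's actual contribution are omitted: the filters here are the standard isotropic variants, and the file names are placeholders.

import cv2
import numpy as np

img = cv2.imread("photo.jpg")                     # placeholder input photograph
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0

# 1) Color abstraction: iterated bilateral filtering flattens regions while
#    keeping salient edges (isotropic here, not flow-aligned as in the paper).
abstracted = img
for _ in range(4):
    abstracted = cv2.bilateralFilter(abstracted, d=9, sigmaColor=30, sigmaSpace=7)

# 2) Line extraction: difference-of-Gaussians on luminance, thresholded.
dog = cv2.GaussianBlur(gray, (0, 0), 1.0) - cv2.GaussianBlur(gray, (0, 0), 1.6)
edges = np.abs(dog) > 0.01

# 3) Composite: draw the extracted lines on top of the abstracted colors.
result = abstracted.copy()
result[edges] = 0
cv2.imwrite("abstracted.jpg", result)             # placeholder output path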


Subjects
Algorithms , Computer Graphics , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Rheology/methods