Hitchhiker's Guide to Super-Resolution: Introduction and Recent Advances.

Moser, Brian B; Raue, Federico; Frolov, Stanislav; Palacio, Sebastian; Hees, Jörn; Dengel, Andreas.

IEEE Trans Pattern Anal Mach Intell ; 45(8): 9862-9882, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37022895

ABSTRACT

With the advent of Deep Learning (DL), Super-Resolution (SR) has also become a thriving research area. However, despite promising results, the field still faces challenges that require further research, e.g., allowing flexible upsampling, more effective loss functions, and better evaluation metrics. We review the domain of SR in light of recent advances and examine state-of-the-art models such as diffusion (DDPM) and transformer-based SR models. We critically discuss contemporary strategies used in SR and identify promising yet unexplored research directions. We complement previous surveys by incorporating the latest developments in the field, such as uncertainty-driven losses, wavelet networks, neural architecture search, novel normalization methods, and the latest evaluation techniques. We also include several visualizations for the models and methods throughout each chapter to facilitate a global understanding of the trends in the field. This review ultimately aims at helping researchers to push the boundaries of DL applied to SR.

Adversarial text-to-image synthesis: A review.

Frolov, Stanislav; Hinz, Tobias; Raue, Federico; Hees, Jörn; Dengel, Andreas.

Neural Netw ; 144: 187-209, 2021 Dec.

Article in English | MEDLINE | ID: mdl-34500257

ABSTRACT

With the advent of generative adversarial networks, synthesizing images from text descriptions has recently become an active research area. It is a flexible and intuitive way for conditional image generation with significant progress in the last years regarding visual realism, diversity, and semantic alignment. However, the field still faces several challenges that require further research efforts such as enabling the generation of high-resolution images with multiple objects, and developing suitable and reliable evaluation metrics that correlate with human judgement. In this review, we contextualize the state of the art of adversarial text-to-image synthesis models, their development since their inception five years ago, and propose a taxonomy based on the level of supervision. We critically examine current strategies to evaluate text-to-image synthesis models, highlight shortcomings, and identify new areas of research, ranging from the development of better datasets and evaluation metrics to possible improvements in architectural design and model training. This review complements previous surveys on generative adversarial networks with a focus on text-to-image synthesis which we believe will help researchers to further advance the field.

Subjects

Image Processing, Computer-Assisted; Neural Networks, Computer; Humans; Semantics
