Pesquisa | Portal Regional da BVS

Dual-branch fusion model for lensless imaging.

Zhang, Yinger; Wu, Zhouyi; Xu, Yunhui; Huangfu, Jiangtao.

Opt Express ; 31(12): 19463-19477, 2023 Jun 05.

Artigo em Inglês | MEDLINE | ID: mdl-37381361

RESUMO

A lensless camera is an imaging system that replaces the lens with a mask to reduce thickness, weight, and cost compared to a lensed camera. The improvement of image reconstruction is an important topic in lensless imaging. Model-based approach and pure data-driven deep neural network (DNN) are regarded as two mainstream reconstruction schemes. In this paper, the advantages and disadvantages of these two methods are investigated to propose a parallel dual-branch fusion model. The model-based method and the data-driven method serve as two independent input branches, and the fusion model is used to extract features from the two branches and merge them for better reconstruction. Two types of fusion model named Merger-Fusion-Model and Separate-Fusion-Model are designed for different scenarios, where Separate-Fusion-Model is able to adaptively allocate the weights of the two branches by the attention module. Additionally, we introduce a novel network architecture named UNet-FC into the data-driven branch, which enhances reconstruction by making full use of the multiplexing property of lensless optics. The superiority of the dual-branch fusion model is verified by drawing comparison with other state-of-the-art methods on public dataset (+2.95dB peak signal-to-noise (PSNR), +0.036 structural similarity index (SSIM), -0.0172 Learned Perceptual Image Patch Similarity (LPIPS)). Finally, a lensless camera prototype is constructed to further validate the effectiveness of our method in a real lensless imaging system.

Text detection and recognition based on a lensless imaging system.

Zhang, Yinger; Wu, Zhouyi; Lin, Peiying; Wu, Yuting; Wei, Lusong; Huang, Zhengjie; Huangfu, Jiangtao.

Appl Opt ; 61(14): 4177-4186, 2022 May 10.

Artigo em Inglês | MEDLINE | ID: mdl-36256095

RESUMO

Lensless cameras are characterized by several advantages (e.g., miniaturization, ease of manufacture, and low cost) as compared with conventional cameras. However, they have not been extensively employed due to their poor image clarity and low image resolution, especially for tasks that have high requirements on image quality and details such as text detection and text recognition. To address the problem, a framework of deep-learning-based pipeline structure was built to recognize text with three steps from raw data captured by employing lensless cameras. This pipeline structure consisted of the lensless imaging model U-Net, the text detection model connectionist text proposal network (CTPN), and the text recognition model convolutional recurrent neural network (CRNN). Compared with the method focusing only on image reconstruction, U-Net in the pipeline was able to supplement the imaging details by enhancing factors related to character categories in the reconstruction process, so the textual information can be more effectively detected and recognized by CTPN and CRNN with fewer artifacts and high-clarity reconstructed lensless images. By performing experiments on datasets of different complexities, the applicability to text detection and recognition on lensless cameras was verified. This study reasonably demonstrates text detection and recognition tasks in the lensless camera system, and develops a basic method for novel applications.

Assuntos

Artefatos , Redes Neurais de Computação

Hand gestures recognition in videos taken with a lensless camera.

Zhang, Yinger; Wu, Zhouyi; Lin, Peiying; Pan, Yang; Wu, Yuting; Zhang, Liufang; Huangfu, Jiangtao.

Opt Express ; 30(22): 39520-39533, 2022 Oct 24.

Artigo em Inglês | MEDLINE | ID: mdl-36298902

RESUMO

A lensless camera is an imaging system that uses a mask in place of a lens, making it thinner, lighter, and less expensive than a lensed camera. However, additional complex computation and time are required for image reconstruction. This work proposes a deep learning model named Raw3dNet that recognizes hand gestures directly on raw videos captured by a lensless camera without the need for image restoration. In addition to conserving computational resources, the reconstruction-free method provides privacy protection. Raw3dNet is a novel end-to-end deep neural network model for the recognition of hand gestures in lensless imaging systems. It is created specifically for raw video captured by a lensless camera and has the ability to properly extract and combine temporal and spatial features. The network is composed of two stages: 1. spatial feature extractor (SFE), which enhances the spatial features of each frame prior to temporal convolution; 2. 3D-ResNet, which implements spatial and temporal convolution of video streams. The proposed model achieves 98.59% accuracy on the Cambridge Hand Gesture dataset in the lensless optical experiment, which is comparable to the lensed-camera result. Additionally, the feasibility of physical object recognition is assessed. Further, we show that the recognition can be achieved with respectable accuracy using only a tiny portion of the original raw data, indicating the potential for reducing data traffic in cloud computing scenarios.

Assuntos

Gestos , Reconhecimento Automatizado de Padrão , Reconhecimento Automatizado de Padrão/métodos , Algoritmos , Redes Neurais de Computação

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA