Search | VHL Regional Portal (test)

DreamStone: Image as a Stepping Stone for Text-Guided 3D Shape Generation.

Liu, Zhengzhe; Dai, Peng; Li, Ruihui; Qi, Xiaojuan; Fu, Chi-Wing.

IEEE Trans Pattern Anal Mach Intell ; 45(12): 14385-14403, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37782580

ABSTRACT

This paper presents DreamStone, a new text-guided 3D shape generation approach that uses images as a stepping stone to bridge the gap between the text and shape modalities, generating 3D shapes without requiring paired text and 3D data. The core of our approach is a two-stage feature-space alignment strategy that leverages a pre-trained single-view reconstruction (SVR) model to map CLIP features to shapes: first, map the CLIP image feature to the detail-rich 3D shape space of the SVR model; then, map the CLIP text feature to the 3D shape space by encouraging CLIP consistency between the rendered images and the input text. In addition, to extend beyond the generative capability of the SVR model, we design a text-guided 3D shape stylization module that can enhance the output shapes with novel structures and textures. Further, we exploit pre-trained text-to-image diffusion models to enhance the generative diversity, fidelity, and stylization capability. Our approach is generic, flexible, and scalable. It can be easily integrated with various SVR models to expand the generative space and improve the generative fidelity. Extensive experimental results demonstrate that our approach outperforms the state-of-the-art methods in terms of generative quality and consistency with the input text.
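The CLIP-consistency idea in the abstract can be illustrated with a toy sketch. The abstract does not give the loss formula, so the version below is an assumption: it scores agreement as the mean cosine similarity between the feature of each rendered view and the text feature, and turns that into a loss. The function names and the 3-dimensional toy vectors are hypothetical; real CLIP features are much higher-dimensional and come from a pre-trained encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


def clip_consistency_loss(rendered_view_features, text_feature):
    """Hypothetical loss: 1 minus the mean cosine similarity over views.

    This is an illustrative assumption, not the paper's stated objective.
    """
    sims = [cosine_similarity(f, text_feature) for f in rendered_view_features]
    return 1.0 - sum(sims) / len(sims)


# Toy stand-ins for CLIP features (assumed values, not real embeddings).
text = [1.0, 0.0, 0.0]
views = [[1.0, 0.0, 0.0],   # a view that matches the text perfectly
         [0.0, 1.0, 0.0]]   # a view that is orthogonal to the text
loss = clip_consistency_loss(views, text)
```

Minimizing a quantity of this kind pushes the rendered images of the generated shape toward the text in the shared feature space, which is the role the abstract assigns to the second alignment stage.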

Sparse-to-dense coarse-to-fine depth estimation for colonoscopy.

Liu, Ruyu; Liu, Zhengzhe; Lu, Jiaming; Zhang, Guodao; Zuo, Zhigui; Sun, Bo; Zhang, Jianhua; Sheng, Weiguo; Guo, Ran; Zhang, Lejun; Hua, Xiaozhen.

Comput Biol Med ; 160: 106983, 2023 06.

Article in English | MEDLINE | ID: mdl-37187133

ABSTRACT

Colonoscopy, as the gold standard for screening for colon cancer and other colon diseases, offers considerable benefits to patients. However, it also poses challenges for diagnosis and potential surgery because of its narrow field of view and limited depth perception. Dense depth estimation can overcome these limitations and offer doctors straightforward 3D visual feedback. To this end, we propose a novel sparse-to-dense, coarse-to-fine depth estimation solution for colonoscopic scenes based on the direct SLAM algorithm. The highlight of our solution is that we utilize the scattered 3D points obtained from SLAM to generate accurate and dense depth at full resolution. This is done by a deep learning (DL)-based depth completion network and a reconstruction system. The depth completion network effectively extracts texture, geometry, and structure features from sparse depth along with RGB data to recover the dense depth map. The reconstruction system further updates the dense depth map using a photometric error-based optimization and a mesh modeling approach to reconstruct a more accurate 3D model of colons with detailed surface texture. We show the effectiveness and accuracy of our depth estimation method on challenging, near photo-realistic colon datasets. Experiments demonstrate that the sparse-to-dense, coarse-to-fine strategy can significantly improve the performance of depth estimation and smoothly fuse direct SLAM and DL-based depth estimation into a complete dense reconstruction system.
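To make the sparse-to-dense input and output concrete, here is a deliberately naive baseline. It is not the depth completion network described above: it simply fills every missing pixel with the depth of the nearest known pixel, which is the kind of coarse result a learned network and the later photometric refinement would be expected to improve on. The function name and the toy 3x3 grid are hypothetical.

```python
def densify_nearest(sparse_depth):
    """Fill missing depths (None) with the depth of the nearest known pixel.

    A naive stand-in for depth completion, used only for illustration.
    """
    known = [(r, c, d)
             for r, row in enumerate(sparse_depth)
             for c, d in enumerate(row)
             if d is not None]
    if not known:
        raise ValueError("need at least one known depth value")

    dense = []
    for r, row in enumerate(sparse_depth):
        dense_row = []
        for c, d in enumerate(row):
            if d is None:
                # Squared pixel distance; ties go to the first point in scan order.
                nearest = min(known,
                              key=lambda k: (k[0] - r) ** 2 + (k[1] - c) ** 2)
                d = nearest[2]
            dense_row.append(d)
        dense.append(dense_row)
    return dense


# Two scattered depth samples, as SLAM might provide (assumed toy values).
sparse = [[1.0, None, None],
          [None, None, None],
          [None, None, 3.0]]
dense = densify_nearest(sparse)
```

The output has a depth at every pixel but with hard, blocky transitions, which is why the abstract pairs completion with RGB features and a subsequent optimization step.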

Subjects

Colon , Colonoscopy , Humans , Colon/diagnostic imaging , Algorithms , Feedback, Sensory

MEN: Mutual Enhancement Networks for Sign Language Recognition and Education.

Liu, Zhengzhe; Pang, Lei; Qi, Xiaojuan.

IEEE Trans Neural Netw Learn Syst ; PP, 2022 May 25.

Article in English | MEDLINE | ID: mdl-35613069

ABSTRACT

The performance of existing sign language recognition approaches is typically limited by the scale of training data. To address this issue, we propose a mutual enhancement network (MEN) for joint sign language recognition and education. First, a sign language recognition system built upon a spatial-temporal network is proposed to recognize the semantic category of a given sign language video. In addition, a sign language education system is developed to detect the failure modes of learners and guide them to sign correctly. Our theoretical contribution lies in formulating the two systems as an expectation-maximization (EM) framework in which they progressively boost each other. The recognition system becomes more robust and accurate with more training data collected by the education system, while the education system guides learners to sign more precisely by drawing on the hand shape analysis module of the recognition system. Experimental results on three large-scale sign language recognition datasets validate the superiority of the proposed framework.

GeoNet++: Iterative Geometric Neural Network with Edge-Aware Refinement for Joint Depth and Surface Normal Estimation.

Qi, Xiaojuan; Liu, Zhengzhe; Liao, Renjie; Torr, Philip H S; Urtasun, Raquel; Jia, Jiaya.

IEEE Trans Pattern Anal Mach Intell ; 44(2): 969-984, 2022 02.

Article in English | MEDLINE | ID: mdl-32870785

ABSTRACT

In this paper, we propose a geometric neural network with edge-aware refinement (GeoNet++) to jointly predict both depth and surface normal maps from a single image. Building on top of two-stream CNNs, GeoNet++ captures the geometric relationships between depth and surface normals with the proposed depth-to-normal and normal-to-depth modules. In particular, the "depth-to-normal" module exploits the least-squares solution for estimating surface normals from depth to improve their quality, while the "normal-to-depth" module refines the depth map based on the constraints on surface normals through kernel regression. Boundary information is exploited via an edge-aware refinement module. GeoNet++ effectively predicts depth and surface normals with high 3D consistency and sharp boundaries, resulting in better reconstructed 3D scenes. Note that GeoNet++ is generic and can be used in other depth/normal prediction frameworks to improve 3D reconstruction quality and pixel-wise accuracy of depth and surface normals. Furthermore, we propose a new 3D geometric metric (3DGM) for evaluating depth prediction in 3D. In contrast to current metrics that focus on evaluating pixel-wise error/accuracy, 3DGM measures whether the predicted depth can reconstruct high-quality 3D surface normals. This is a more natural metric for many 3D application domains. Our experiments on the NYUD-V2 [1] and KITTI [2] datasets verify that GeoNet++ produces fine boundary details and that the predicted depth can be used to reconstruct high-quality 3D surfaces.
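The least-squares step behind a depth-to-normal module can be sketched as follows. The abstract does not spell out the formulation, so this is one common version, stated as an assumption: back-projected 3D points p_i in a local neighbourhood are taken to lie on a plane n . p_i = 1, the overdetermined system A n = 1 is solved through the normal equations, and the result is normalized to unit length. The helper names are hypothetical, and the 3x3 solve uses Cramer's rule only to keep the example free of dependencies.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_normal(points):
    """Unit normal from the least-squares solution of A n = 1.

    `points` is a list of (x, y, z) tuples in camera coordinates. The plane
    must not pass through the origin, otherwise the system is degenerate.
    """
    # Normal equations: (A^T A) n = A^T 1
    ata = [[sum(p[i] * p[j] for p in points) for j in range(3)]
           for i in range(3)]
    atb = [sum(p[i] for p in points) for i in range(3)]

    d = det3(ata)
    if abs(d) < 1e-12:
        raise ValueError("degenerate neighbourhood: cannot fit a plane")

    # Cramer's rule: replace column k of A^T A with A^T 1.
    n = []
    for k in range(3):
        mk = [[atb[r] if c == k else ata[r][c] for c in range(3)]
              for r in range(3)]
        n.append(det3(mk) / d)

    length = math.sqrt(sum(v * v for v in n))
    return tuple(v / length for v in n)


# Four points on the fronto-parallel plane z = 2 (assumed toy data).
patch = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0), (1.0, 1.0, 2.0)]
normal = fit_normal(patch)
```

For this patch the fitted normal points along the z axis. The sign convention (toward or away from the camera) is a design choice that this sketch does not fix.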

Subjects

Algorithms , Neural Networks, Computer , Least-Squares Analysis

Volatile organics off-gassed among tobacco-exposed clothing fabrics.

Chien, Yeh-Chung; Chang, Cheng-Ping; Liu, Zheng-Zhe.

J Hazard Mater ; 193: 139-48, 2011 Oct 15.

Article in English | MEDLINE | ID: mdl-21852036

ABSTRACT

This work evaluates the characteristics of short-term release of volatile and semi-volatile organic chemicals from clothing fabrics that are exposed to environmental tobacco smoke (ETS). Various fabrics were concurrently exposed to ETS in a controlled facility, and the chemicals off-gassed were sampled using solid phase micro-extraction coupled with GC/MS analysis. Toluene-reference concentration (TRC) was calculated for nine selected chemicals and compared. The number of chemicals identified from ETS-exposed fabrics ranged from 13 (polyester and acetate) to 32 (linen). All fabrics off-gassed formaldehyde, tetradecanoic acid (TDA) and n-hexadecanoic acid (n-HDA), while seven out of eight fabrics emitted furfural, benzonitrile, naphthalene and decanal. Natural fibers of plant origin (cotton and linen) off-gassed higher concentrations (TRC>100 µg/l) of chemicals that have low molecular weight (~100 or less) than did natural fibers of animal origin (wool and silk) and synthetic fibers. Conversely, wool and silk off-gassed more chemicals of high molecular weight (>200), such as TDA (TRC>100 µg/l) and n-HDA (TRC>500 µg/l), than did other fabrics. Fabric structure (for a particular material) significantly affects chemical off-gassing. Knitted cotton of the kind typically used for polo shirts off-gassed significantly (p<0.05) higher TRC for chemicals with molecular weight of ~100 (such as furfural) than did woven cottons. The dyeing of fabric (white vs. black) had a limited effect on emission, while increasing contact time with ETS increased the intensity of chemical emissions. The mean TRC for cotton exposed for 12 min was nearly double that for cotton exposed for 8 min, but no difference existed for polyester.

Subjects

Clothing , Nicotiana , Volatile Organic Compounds
