1.
Article in English | MEDLINE | ID: mdl-38857129

ABSTRACT

Over the past few years, monocular depth estimation and completion have received increasing attention from the computer vision community because of their widespread applications. In this paper, we introduce novel physics (geometry)-driven deep learning frameworks for these two tasks by assuming that 3D scenes are composed of piecewise planes. Instead of directly estimating the depth map or completing the sparse depth map, we propose to estimate the surface normal and plane-to-origin distance maps, or to complete the sparse surface normal and distance maps, as intermediate outputs. To this end, we develop a normal-distance head that outputs pixel-level surface normals and distances. The surface normal and distance maps are then regularized by a plane-aware consistency constraint and transformed into depth maps. Furthermore, we integrate an additional depth head to strengthen the robustness of the proposed frameworks. Extensive experiments on the NYU-Depth-v2, KITTI and SUN RGB-D datasets demonstrate that our method outperforms prior state-of-the-art monocular depth estimation and completion competitors.
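
The geometric core of the normal-distance parameterization is the plane equation: for a pixel with unit surface normal n, plane-to-origin distance d and camera intrinsics K, the depth along the back-projected ray K^-1 (u, v, 1)^T is z = d / (n^T K^-1 (u, v, 1)^T). Below is a minimal sketch of that normal-and-distance-to-depth conversion; the function and tensor names are illustrative assumptions, not the paper's code.

    import torch

    def depth_from_normal_distance(normal, distance, K):
        # normal:   (B, 3, H, W) unit surface normals in camera coordinates
        # distance: (B, 1, H, W) plane-to-origin distances
        # K:        (B, 3, 3) camera intrinsics
        # Illustrative sketch under the piecewise-planar assumption.
        B, _, H, W = normal.shape
        device, dtype = normal.device, normal.dtype
        v, u = torch.meshgrid(torch.arange(H, device=device, dtype=dtype),
                              torch.arange(W, device=device, dtype=dtype),
                              indexing="ij")
        pix = torch.stack([u, v, torch.ones_like(u)], dim=0).reshape(1, 3, -1)
        rays = torch.linalg.inv(K) @ pix                      # back-projected rays, (B, 3, H*W)
        n = normal.reshape(B, 3, -1)
        # Plane n^T X = d with X = z * ray  =>  z = d / (n^T ray)
        denom = (n * rays).sum(dim=1, keepdim=True).clamp(min=1e-6)
        depth = distance.reshape(B, 1, -1) / denom
        return depth.reshape(B, 1, H, W)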

2.
Med Image Anal ; 77: 102338, 2022 04.
Article in English | MEDLINE | ID: mdl-35016079

ABSTRACT

Recently, self-supervised learning has been applied to estimate depth and ego-motion from monocular videos, achieving remarkable performance in autonomous driving scenarios. One widely adopted assumption of self-supervised depth and ego-motion learning is that image brightness remains constant across nearby frames. Unfortunately, endoscopic scenes do not meet this assumption: severe brightness fluctuations are induced by illumination variations, non-Lambertian reflections and interreflections during data collection, and these fluctuations inevitably deteriorate depth and ego-motion estimation accuracy. In this work, we introduce a novel concept referred to as appearance flow to address the brightness inconsistency problem. The appearance flow accounts for any variations in the brightness pattern and enables us to develop a generalized dynamic image constraint. Furthermore, we build a unified self-supervised framework to estimate monocular depth and ego-motion simultaneously in endoscopic scenes, which comprises a structure module, a motion module, an appearance module and a correspondence module, to accurately reconstruct the appearance and calibrate the image brightness. Extensive experiments are conducted on the SCARED and EndoSLAM datasets, and the proposed unified framework outperforms other self-supervised approaches by a large margin. To validate generalization across different patients and cameras, we train our model on SCARED and test it on the SERV-CT and Hamlyn datasets without any fine-tuning; the superior results confirm its strong generalization ability. Code is available at: https://github.com/ShuweiShao/AF-SfMLearner.
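
One way to read the generalized dynamic image constraint: before the usual photometric comparison between the target frame and the source frame warped by the predicted depth and ego-motion, the warped frame is first corrected by the predicted appearance flow, so brightness changes are no longer penalized as if they were geometric error. The sketch below assumes an additive per-pixel brightness correction; the authors' exact formulation is in the linked repository, so treat the names and the additive form here as illustrative assumptions.

    import torch.nn.functional as F

    def brightness_calibrated_photometric_loss(target, source_warped, appearance_flow):
        # target:          (B, 3, H, W) target frame
        # source_warped:   (B, 3, H, W) source frame warped into the target view
        #                  using the predicted depth and ego-motion
        # appearance_flow: (B, 3, H, W) predicted per-pixel brightness adjustment
        # Assumed additive correction; not necessarily the authors' exact form.
        calibrated = source_warped + appearance_flow
        return F.l1_loss(calibrated, target)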


Subject(s)
Ego, Endoscopy, Gastrointestinal, Humans, Motion
3.
Int J Comput Assist Radiol Surg ; 17(1): 157-166, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34677745

ABSTRACT

PURPOSE: Image registration is a fundamental task in image processing and is critical to many clinical applications, e.g., computer-assisted surgery. In this work, we attempt to design an effective framework that achieves higher accuracy at minimal cost to the invertibility of the registration field. METHODS: A hierarchically aggregated transformation (HAT) module is proposed. Within each HAT module, we connect multiple convolutions in a hierarchical manner to capture multi-scale context, enabling small and large displacements between a pair of images to be taken into account simultaneously during registration. In addition, an adaptive feature scaling (AFS) mechanism is presented to refine the multi-scale feature maps derived from the HAT module by rescaling channel-wise features using the global receptive field. Based on the HAT module and AFS mechanism, we establish an effective and efficient unsupervised deformable registration framework. RESULTS: The devised framework is validated on the SCARED dataset and the MICCAI Instrument Segmentation and Tracking Challenge 2015 dataset, and the experimental results demonstrate that our method achieves better registration accuracy with fewer folding pixels than three widely used baseline approaches: SyN, NiftyReg and VoxelMorph. CONCLUSION: We develop a novel method for unsupervised deformable image registration by incorporating the HAT module and AFS mechanism into the framework, which provides a new way to obtain a desirable registration field between a pair of images.
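
Reading the description, the HAT module chains several convolutions so that each branch builds on the previous one and therefore sees a progressively larger receptive field, the branch outputs are aggregated, and the AFS mechanism rescales channels using globally pooled features (a squeeze-and-excitation-style gate). The sketch below is one interpretation with assumed layer sizes and branch counts; it is not the authors' implementation.

    import torch
    import torch.nn as nn

    class AFS(nn.Module):
        # Adaptive feature scaling: channel-wise rescaling driven by globally
        # pooled features (assumed squeeze-and-excitation-style form).
        def __init__(self, channels, reduction=4):
            super().__init__()
            self.gate = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),                       # global receptive field
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),
            )

        def forward(self, x):
            return x * self.gate(x)

    class HAT(nn.Module):
        # Hierarchically aggregated transformation: chained convolutions whose
        # outputs are concatenated to capture multi-scale context (illustrative sketch).
        def __init__(self, channels, branches=4):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv2d(channels, channels, 3, padding=1) for _ in range(branches)
            )
            self.fuse = nn.Conv2d(channels * branches, channels, 1)
            self.afs = AFS(channels)

        def forward(self, x):
            outs, feat = [], x
            for conv in self.branches:
                feat = torch.relu(conv(feat))   # each branch builds on the previous one
                outs.append(feat)
            return self.afs(self.fuse(torch.cat(outs, dim=1)))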


Subject(s)
Image Processing, Computer-Assisted, Unsupervised Machine Learning, Algorithms, Humans