Results 1 - 15 of 15
1.
Article in English | MEDLINE | ID: mdl-38349822

ABSTRACT

Blind image restoration (IR) is a common yet challenging problem in computer vision. Classical model-based methods and recent deep learning (DL)-based methods represent two different methodologies for this problem, each with its own merits and drawbacks. In this paper, we propose a novel blind image restoration method that aims to integrate the advantages of both. Specifically, we construct a general Bayesian generative model for blind IR, which explicitly depicts the degradation process. In this proposed model, a pixel-wise non-i.i.d. Gaussian distribution is employed to fit the image noise. It is more flexible than the simple i.i.d. Gaussian or Laplacian distributions adopted in most conventional methods, and can therefore handle more complicated noise types contained in the image degradation. To solve the model, we design a variational inference algorithm where all the expected posterior distributions are parameterized as deep neural networks to increase their model capability. Notably, such an inference algorithm induces a unified framework to jointly deal with the tasks of degradation estimation and image restoration. Further, the degradation information estimated in the former task is utilized to guide the latter IR process. Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-art methods. The source code is available at https://github.com/zsyOAOA/VIRNet.
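
To make the noise assumption in this abstract concrete, the sketch below compares a Gaussian negative log-likelihood with one shared variance (the i.i.d. case) against one with a separate variance per pixel (the non-i.i.d. case). It is only an illustration of the likelihood term, not the authors' VIRNet model; the function name `gaussian_nll` and all pixel values are assumptions invented for this example.

```python
import math

def gaussian_nll(observed, predicted, variances):
    """Negative log-likelihood of independent Gaussians, one variance per pixel."""
    nll = 0.0
    for y, z, var in zip(observed, predicted, variances):
        nll += 0.5 * (math.log(2 * math.pi * var) + (y - z) ** 2 / var)
    return nll

# Toy "image" flattened to four pixels (hypothetical values).
observed = [0.10, 0.12, 0.90, 0.20]
predicted = [0.10, 0.10, 0.50, 0.50]

# i.i.d. assumption: every pixel shares one variance (mean squared residual).
shared = sum((y - z) ** 2 for y, z in zip(observed, predicted)) / len(observed)
nll_iid = gaussian_nll(observed, predicted, [shared] * len(observed))

# Non-i.i.d. assumption: clean pixels get a small variance, noisy pixels a large one.
nll_pixelwise = gaussian_nll(observed, predicted, [0.01, 0.01, 0.16, 0.16])
```

On this toy data the pixel-wise variance map yields the lower negative log-likelihood, which is the sense in which a spatially varying noise model can fit the data better than a single global variance.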

2.
EClinicalMedicine ; 61: 102050, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37425371

ABSTRACT

Background: Adolescent idiopathic scoliosis (AIS) is the most common type of spinal disorder affecting children. Clinical screening and diagnosis require physical and radiographic examinations, which are either subjective or increase radiation exposure. We therefore developed and validated a radiation-free portable system and device utilising light-based depth sensing and deep learning technologies to analyse AIS by landmark detection and image synthesis. Methods: Consecutive patients with AIS attending two local scoliosis clinics in Hong Kong between October 9, 2019, and May 21, 2022, were recruited. Patients were excluded if they had psychological and/or systematic neural disorders that could influence the compliance of the study and/or the mobility of the patients. For each participant, a Red Green Blue-Depth (RGBD) image of the nude back was collected using our in-house radiation-free device. Manually labelled landmarks and alignment parameters by our spine surgeons were considered as the ground truth (GT). Images from training and internal validation cohorts (n = 1936) were used to develop the deep learning models. The model was then prospectively validated on another cohort (n = 302) which was collected in Hong Kong and had the same demographic properties as the training cohort. We evaluated the prediction accuracy of the model on nude back landmark detection as well as the performance on radiograph-comparable image (RCI) synthesis. The obtained RCIs contain sufficient anatomical information that can quantify disease severities and curve types. Findings: Our model had a consistently high accuracy in predicting the nude back anatomical landmarks with a less than 4-pixel error regarding the mean Euclidean and Manhattan distance. 
The synthesized RCI for AIS severity classification achieved a sensitivity and negative predictive value of over 0.909 and 0.933, and the performance for curve type classification was 0.974 and 0.908, with spine specialists' manual assessment results on real radiographs as GT. The estimated Cobb angle from synthesized RCIs had a strong correlation with the GT angles (R² = 0.984, p < 0.001). Interpretation: The radiation-free medical device powered by depth sensing and deep learning techniques can provide instantaneous and harmless spine alignment analysis which has the potential for integration into routine screening for adolescents. Funding: Innovation and Technology Fund (MRP/038/20X), Health Services Research Fund (HMRF) 08192266.
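
For readers unfamiliar with the two screening metrics reported in this abstract, the following sketch shows how sensitivity and negative predictive value are computed from binary labels. It assumes a simplified two-class setting; the helper name `sensitivity_and_npv` and the toy labels are hypothetical and are not study data.

```python
def sensitivity_and_npv(y_true, y_pred):
    """Return (sensitivity, negative predictive value) for 0/1 labels.

    1 marks the positive class (for example, a case needing follow-up).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return sensitivity, npv

# Toy labels, invented for illustration only.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, npv = sensitivity_and_npv(truth, pred)
```

Sensitivity answers "how many true cases were caught", while negative predictive value answers "how trustworthy is a negative result", which is why both matter for a screening tool.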

3.
Article in English | MEDLINE | ID: mdl-37022429

ABSTRACT

This paper addresses the problem of face video inpainting. Existing video inpainting methods primarily target natural scenes with repetitive patterns. They do not make use of any prior knowledge of the face to help retrieve correspondences for the corrupted face. They therefore only achieve sub-optimal results, particularly for faces under large pose and expression variations where face components appear very different across frames. In this paper, we propose a two-stage deep learning method for face video inpainting. We employ 3DMM as our 3D face prior to transform a face between the image space and the UV (texture) space. In Stage I, we perform face inpainting in the UV space. This helps to largely remove the influence of face poses and expressions and makes the learning task much easier with well aligned face features. We introduce a frame-wise attention module to fully exploit correspondences in neighboring frames to assist the inpainting task. In Stage II, we transform the inpainted face regions back to the image space and perform face video refinement that inpaints any background regions not covered in Stage I and also refines the inpainted face regions. Extensive experiments have been carried out which show our method can significantly outperform methods based merely on 2D information, especially for faces under large pose and expression variations. Project page: https://ywq.github.io/FVIP.

4.
Comput Med Imaging Graph ; 99: 102091, 2022 07.
Article in English | MEDLINE | ID: mdl-35803034

ABSTRACT

Most learning-based magnetic resonance image (MRI) segmentation methods rely on the manual annotation to provide supervision, which is extremely tedious, especially when multiple anatomical structures are required. In this work, we aim to develop a hybrid framework named Spine-GFlow that combines the image features learned by a CNN model and anatomical priors for multi-tissue segmentation in a sagittal lumbar MRI. Our framework does not require any manual annotation and is robust against image feature variation caused by different image settings and/or underlying pathology. Our contributions include: 1) a rule-based method that automatically generates the weak annotation (initial seed area), 2) a novel proposal generation method that integrates the multi-scale image features and anatomical prior, 3) a comprehensive loss for CNN training that optimizes the pixel classification and feature distribution simultaneously. Our Spine-GFlow has been validated on 2 independent datasets: HKDDC (containing images obtained from 3 different machines) and IVDM3Seg. The segmentation results of vertebral bodies (VB), intervertebral discs (IVD), and spinal canal (SC) are evaluated quantitatively using intersection over union (IoU) and the Dice coefficient. Results show that our method, without requiring manual annotation, has achieved a segmentation performance comparable to a model trained with full supervision (mean Dice 0.914 vs 0.916).
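
The two overlap metrics named in this abstract can be sketched as follows for a pair of binary masks. This is a minimal illustration, assuming masks stored as nested lists of 0/1; the function name `iou_and_dice` and the 3×3 masks are hypothetical.

```python
def iou_and_dice(pred_mask, gt_mask):
    """Return (IoU, Dice) for two binary masks given as nested lists of 0/1."""
    pred = [v for row in pred_mask for v in row]
    gt = [v for row in gt_mask for v in row]
    inter = sum(1 for p, g in zip(pred, gt) if p and g)
    pred_area = sum(1 for p in pred if p)
    gt_area = sum(1 for g in gt if g)
    union = pred_area + gt_area - inter
    if union == 0:
        # Both masks are empty; treat this as perfect agreement.
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_area + gt_area)

# Toy masks, invented for illustration only.
pred_mask = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
gt_mask = [[0, 1, 1],
           [0, 1, 0],
           [0, 1, 0]]
iou, dice = iou_and_dice(pred_mask, gt_mask)
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank segmentations identically, but Dice is always at least as large as IoU.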


Subject(s)
Intervertebral Disc , Magnetic Resonance Imaging , Image Processing, Computer-Assisted/methods , Intervertebral Disc/diagnostic imaging , Intervertebral Disc/pathology , Lumbosacral Region , Magnetic Resonance Imaging/methods
5.
EClinicalMedicine ; 43: 101252, 2022 Jan.
Article in English | MEDLINE | ID: mdl-35028544

ABSTRACT

BACKGROUND: Assessment of spine alignment is crucial in the management of scoliosis, but current auto-analysis of spine alignment suffers from low accuracy. We aim to develop and validate a hybrid model named SpineHRNet+, which integrates artificial intelligence (AI) and rule-based methods to improve auto-alignment reliability and interpretability. METHODS: From December 2019 to November 2020, 1,542 consecutive patients with scoliosis attending two local scoliosis clinics (The Duchess of Kent Children's Hospital at Sandy Bay in Hong Kong; Queen Mary Hospital in Pok Fu Lam on Hong Kong Island) were recruited. The biplanar radiographs of each patient were collected with our medical machine EOS™. The collected radiographs were recaptured using smartphones or screenshots, with deidentified images securely stored. Manually labelled landmarks and alignment parameters by a spine surgeon were considered as ground truth (GT). The data were split 8:2 to train and internally test SpineHRNet+, respectively. This was followed by a prospective validation on another 337 patients. Quantitative analyses of landmark predictions were conducted, and reliabilities of auto-alignment were assessed using linear regression and Bland-Altman plots. Deformity severity and sagittal abnormality classifications were evaluated by confusion matrices. FINDINGS: SpineHRNet+ achieved accurate landmark detection with mean Euclidean distance errors of 2·78 and 5·52 pixels on posteroanterior and lateral radiographs, respectively. The mean angle errors between predictions and GT were 3·18° and 6·32° coronally and sagittally. All predicted alignments were strongly correlated with GT (p < 0·001, R² > 0·97), with minimal overall difference visualised via Bland-Altman plots. For curve detections, 95·7% sensitivity and 88·1% specificity were achieved, and for severity classification, 88·6-90·8% sensitivity was obtained. 
For sagittal abnormalities, greater than 85·2-88·9% specificity and sensitivity were achieved. INTERPRETATION: The auto-analysis provided by SpineHRNet+ was reliable and continuous and it might offer the potential to assist clinical work and facilitate large-scale clinical studies. FUNDING: RGC Research Impact Fund (R5017-18F), Innovation and Technology Fund (ITS/404/18), and the AOSpine East Asia Fund (AOSEA(R)2019-06).
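
The Bland-Altman analysis mentioned in this abstract summarises agreement between two measurements by the mean difference (bias) and the 95% limits of agreement. A minimal sketch follows, assuming paired angle measurements in degrees; the function name `bland_altman` and the angle values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def bland_altman(pred, gt):
    """Return (bias, lower limit, upper limit) of agreement for paired values."""
    diffs = [p - g for p, g in zip(pred, gt)]
    bias = mean(diffs)   # systematic offset between the two methods
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Toy predicted and ground-truth angles, invented for illustration only.
pred_angles = [12.0, 25.0, 33.0, 41.0, 18.0]
gt_angles = [10.0, 26.0, 30.0, 40.0, 19.0]
bias, lower, upper = bland_altman(pred_angles, gt_angles)
```

In a Bland-Altman plot these three values are drawn as horizontal lines over a scatter of difference against the mean of each pair; a bias near zero with narrow limits indicates close agreement.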

6.
IEEE Trans Pattern Anal Mach Intell ; 44(1): 129-142, 2022 Jan.
Article in English | MEDLINE | ID: mdl-32750798

ABSTRACT

This paper addresses the problem of photometric stereo, in both calibrated and uncalibrated scenarios, for non-Lambertian surfaces based on deep learning. We first introduce a fully convolutional deep network for calibrated photometric stereo, which we call PS-FCN. Unlike traditional approaches that adopt simplified reflectance models to make the problem tractable, our method directly learns the mapping from reflectance observations to surface normal, and is able to handle surfaces with general and unknown isotropic reflectance. At test time, PS-FCN takes an arbitrary number of images and their associated light directions as input and predicts a surface normal map of the scene in a fast feed-forward pass. To deal with the uncalibrated scenario where light directions are unknown, we introduce a new convolutional network, named LCNet, to estimate light directions from input images. The estimated light directions and the input images are then fed to PS-FCN to determine the surface normals. Our method does not require a pre-defined set of light directions and can handle multiple images in an order-agnostic manner. Thorough evaluation of our approach on both synthetic and real datasets shows that it outperforms state-of-the-art methods in both calibrated and uncalibrated scenarios.
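
For context, the "simplified reflectance models" that this abstract contrasts with can be illustrated by the classical Lambertian formulation, in which each intensity is the dot product of a light direction with the albedo-scaled normal. The sketch below solves that linear system for one pixel with exactly three lights. It is a textbook baseline under a Lambertian, shadow-free assumption, not PS-FCN or LCNet; all names and values are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def lambertian_normal(lights, intensities):
    """Solve L g = I for g = albedo * normal with Cramer's rule (three lights)."""
    d = det3(lights)
    if abs(d) < 1e-12:
        raise ValueError("light directions must be linearly independent")
    g = []
    for col in range(3):
        m = [row[:] for row in lights]
        for r in range(3):
            m[r][col] = intensities[r]  # swap in the intensity column
        g.append(det3(m) / d)
    albedo = math.sqrt(sum(v * v for v in g))
    if albedo == 0:
        raise ValueError("pixel is completely dark; normal is undefined")
    return [v / albedo for v in g], albedo

# Three known unit light directions (rows) and a synthetic pixel rendered
# from normal (0.6, 0, 0.8) with albedo 1. Values invented for illustration.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
intensities = [0.8, 1.0, 0.64]
normal, albedo = lambertian_normal(lights, intensities)
```

This baseline breaks down on glossy or otherwise non-Lambertian materials, which is the setting the learned approach in the abstract is designed for.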

7.
IEEE Trans Image Process ; 30: 2141-2154, 2021.
Article in English | MEDLINE | ID: mdl-33439840

ABSTRACT

This paper addresses the problem of mirror surface reconstruction, and proposes a solution based on observing the reflections of a moving reference plane on the mirror surface. Unlike previous approaches which require tedious calibration, our method can recover the camera intrinsics, the poses of the reference plane, as well as the mirror surface from the observed reflections of the reference plane under at least three unknown distinct poses. We first show that the 3D poses of the reference plane can be estimated from the reflection correspondences established between the images and the reference plane. We then form a bunch of 3D lines from the reflection correspondences, and derive an analytical solution to recover the line projection matrix. We transform the line projection matrix to its equivalent camera projection matrix, and propose a cross-ratio based formulation to optimize the camera projection matrix by minimizing reprojection errors. The mirror surface is then reconstructed based on the optimized cross-ratio constraint. Experimental results on both synthetic and real data are presented, which demonstrate the feasibility and accuracy of our method.
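
The cross-ratio used in this abstract is the basic projective invariant of four collinear points. The sketch below computes it for 1D coordinates and checks numerically that it is unchanged by a 1D projective map. It illustrates the invariant only, not the authors' optimisation; the function names and the 2×2 matrix are hypothetical.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four collinear points given by 1D coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

def project_1d(h, x):
    """Apply a 2x2 homography h to the 1D point x, written homogeneously as [x, 1]."""
    num = h[0][0] * x + h[0][1]
    den = h[1][0] * x + h[1][1]
    return num / den

# Four points on a line and an arbitrary invertible projective map (toy values).
points = [0.0, 1.0, 2.0, 3.0]
h = [[2.0, 1.0], [1.0, 3.0]]
mapped = [project_1d(h, x) for x in points]

before = cross_ratio(*points)
after = cross_ratio(*mapped)
```

Because the value survives any projective transformation, a cross-ratio measured on a known reference plane constrains the corresponding points along the same line in space, which is what makes it useful as an optimisation constraint.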

8.
IEEE Trans Image Process ; 30: 1219-1231, 2021.
Article in English | MEDLINE | ID: mdl-33315560

ABSTRACT

General image super-resolution techniques have difficulties in recovering detailed face structures when applied to low resolution face images. Recent deep learning based methods tailored for face images have achieved improved performance by being jointly trained with additional tasks such as face parsing and landmark prediction. However, multi-task learning requires extra manually labeled data. Besides, most of the existing works can only generate relatively low resolution face images (e.g., 128×128), and their applications are therefore limited. In this paper, we introduce a novel SPatial Attention Residual Network (SPARNet) built on our newly proposed Face Attention Units (FAUs) for face super-resolution. Specifically, we introduce a spatial attention mechanism to the vanilla residual blocks. This enables the convolutional layers to adaptively bootstrap features related to the key face structures and pay less attention to those less feature-rich regions. This makes the training more effective and efficient as the key face structures only account for a very small portion of the face image. Visualization of the attention maps shows that our spatial attention network can capture the key face structures well even for very low resolution faces (e.g., 16×16). Quantitative comparisons on various kinds of metrics (including PSNR, SSIM, identity similarity, and landmark detection) demonstrate the superiority of our method over current state-of-the-art methods. We further extend SPARNet with multi-scale discriminators, named SPARNetHD, to produce high resolution results (i.e., 512×512). We show that SPARNetHD trained with synthetic data can not only produce high quality and high resolution outputs for synthetically degraded face images, but also generalize well to real world low quality face images. Codes are available at https://github.com/chaofengc/Face-SPARNet.

9.
IEEE Trans Image Process ; 26(12): 5994-6005, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28910764

ABSTRACT

Fully connected Markov random fields and conditional random fields have recently been shown to be advantageous in many early vision tasks formulated as multi-labeling problems, such as stereo matching and image segmentation. The maximum posterior marginal (MPM) inference method for solving fully connected models uses a hybrid framework of the mean-field (MF) method and a filtering-like approach, and yields excellent results. In this paper, we extend this framework in several aspects. First, we provide an alternative inference method that employs a fractional belief propagation based method instead of MF. Second, we reformulate the MPM problem into a maximum a posteriori (MAP) problem and provide efficient algorithms for solving it. Third, we extend the fully connected model into a multi-resolution approach. Finally, we propose an integral image based approach which makes it possible to efficiently integrate the local linear regression technique into this framework. Comparisons are carried out among different algorithms and different formulations to find the best combination. We demonstrate that the use of our multi-resolution approach with MAP formulation substantially outperforms the ordinary MF-based inference scheme.
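
The integral image mentioned in this abstract is a standard data structure that makes any rectangular sum a constant-time lookup, which is what allows local window statistics to be computed efficiently. A minimal sketch follows, assuming an image stored as nested lists; the function names are hypothetical, and the sketch shows only the data structure, not the paper's regression scheme.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above and to the left, inclusive.
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def box_sum(sat, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 using four lookups."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]

# Toy 3x3 image, invented for illustration only.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
sat = integral_image(img)
total = box_sum(sat, 0, 0, 3, 3)
inner = box_sum(sat, 1, 1, 3, 3)
```

Building the table costs one pass over the image; after that, window sums of any size cost four lookups, independent of the window area.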

10.
IEEE Comput Graph Appl ; 36(6): 46-56, 2016.
Article in English | MEDLINE | ID: mdl-27244729

ABSTRACT

Using a two-stage algorithm, the proposed technique can tackle the challenges of reconstructing high-quality 3D models of humans wearing regular clothes from sparse uncalibrated cameras. The proposed algorithm based on nonrigid dense correspondences (NRDC) requires fewer images than previous methods because it does not require an initial sparse matching. The authors validated the proposed algorithm using images from an existing dataset and images captured by a cell phone camera.


Subject(s)
Algorithms , Imaging, Three-Dimensional , Humans
11.
IEEE Trans Pattern Anal Mach Intell ; 31(1): 5-14, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19029542

ABSTRACT

This paper addresses the problem of recovering both the intrinsic and extrinsic parameters of a camera from the silhouettes of an object in a turntable sequence. Previous silhouette-based approaches have exploited correspondences induced by epipolar tangents to estimate the image invariants under turntable motion and achieved a weak calibration of the cameras. It is known that the fundamental matrix relating any two views in a turntable sequence can be expressed explicitly in terms of the image invariants, the rotation angle, and a fixed scalar. It will be shown that the imaged circular points for the turntable plane can also be formulated in terms of the same image invariants and fixed scalar. This allows the imaged circular points to be recovered directly from the estimated image invariants, and provide constraints for the estimation of the imaged absolute conic. The camera calibration matrix can thus be recovered. A robust method for estimating the fixed scalar from image triplets is introduced, and a method for recovering the rotation angles using the estimated imaged circular points and epipoles is presented. Using the estimated camera intrinsics and extrinsics, a Euclidean reconstruction can be obtained. Experimental results on real data sequences are presented, which demonstrate the high precision achieved by the proposed method.


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Models, Theoretical , Pattern Recognition, Automated/methods , Video Recording/methods , Calibration , Computer Simulation , Image Enhancement/methods , Video Recording/standards
12.
IEEE Trans Pattern Anal Mach Intell ; 30(12): 2243-8, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18988956

ABSTRACT

This paper proposes a novel method for robustly recovering the camera geometry of an uncalibrated image sequence taken under circular motion. Under circular motion, all the camera centers lie on a circle and the mapping from the plane containing this circle to the horizon line observed in the image can be modelled as a 1D projection. A 2 x 2 homography is introduced in this paper to relate the projections of the camera centers in two 1D views. It is shown that the two imaged circular points of the motion plane and the rotation angle between the two views can be derived directly from such a homography. This way of recovering the imaged circular points and rotation angles is intrinsically a multiple view approach, as all the sequence geometry embedded in the epipoles is exploited in the estimation of the homography for each view pair. This results in a more robust method compared to those computing the rotation angles using adjacent views only. The proposed method has been applied to self-calibrate turntable sequences using either point features or silhouettes, and highly accurate results have been achieved.


Subject(s)
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Calibration , Image Enhancement/standards , Image Interpretation, Computer-Assisted/standards , Motion , Reproducibility of Results , Sensitivity and Specificity
13.
Article in English | MEDLINE | ID: mdl-18003387

ABSTRACT

Multi-slice Computed Tomography (MSCT) is an important medical imaging tool that provides dynamic three-dimensional (3D) volume data of the heart for diagnosis of various cardiac diseases. Due to the huge amount of data in MSCT, manual identification, segmentation and tracking of various parts of the heart are very labor intensive and inefficient. In this paper, we introduce a semi-automatic method for robustly segmenting the endocardium surface from cardiac MSCT images. A level set approach is adopted to define a flexible and powerful interface for capturing the complex anatomical structure of the heart. A novel speed function based on clustering the image intensities of the region of interest and the background is proposed for use with the level set method. The method introduced in this paper has the advantages of simple initialization and being capable of segmenting the blood pool with non-homogeneous intensities. Experiments on real data using the proposed speed function have been carried out with 2D, 3D and 4D implementations of the level sets respectively, and comparisons in terms of computational speed and segmentation results are presented.


Subject(s)
Algorithms , Artificial Intelligence , Cluster Analysis , Endocardium/diagnostic imaging , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Tomography, X-Ray Computed/methods , Humans , Imaging, Three-Dimensional/instrumentation , Radiographic Image Enhancement/instrumentation , Radiographic Image Enhancement/methods , Radiographic Image Interpretation, Computer-Assisted/instrumentation , Radiographic Image Interpretation, Computer-Assisted/methods , Sensitivity and Specificity , Tomography, X-Ray Computed/instrumentation
14.
IEEE Trans Pattern Anal Mach Intell ; 29(12): 2205-16, 2007 Dec.
Article in English | MEDLINE | ID: mdl-17934229

ABSTRACT

In this paper, we address the problem of reconstructing an object surface from silhouettes. Previous works by other authors have shown that, based on the principle of duality, surface points can be recovered, theoretically, as the dual to the tangent plane space of the object. In practice, however, the identification of tangent basis in the tangent plane space is not trivial given a set of discretely sampled data. This problem is further complicated by the existence of bi-tangents to the object surface. The key contribution of this paper is the introduction of epipolar parameterization in identifying a well-defined local tangent basis. This extends the applicability of existing dual space reconstruction methods to fairly complicated shapes, without making any explicit assumption on the object topology. We verify our approach with both synthetic and real-world data, and compare it both qualitatively and quantitatively with other popular reconstruction algorithms. Experimental results demonstrate that our proposed approach produces more accurate estimation, whilst maintaining reasonable robustness towards shapes with complex topologies.


Subject(s)
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Reproducibility of Results , Sensitivity and Specificity
15.
IEEE Trans Pattern Anal Mach Intell ; 29(3): 499-503, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17224619

ABSTRACT

This paper introduces a novel approach for solving the problem of camera calibration from spheres. By exploiting the relationship between the dual images of spheres and the dual image of the absolute conic (IAC), it is shown that the common pole and polar with regard to the conic images of two spheres are also the pole and polar with regard to the IAC. This provides two constraints for estimating the IAC and, hence, allows a camera to be calibrated from an image of at least three spheres. Experimental results show the feasibility of the proposed approach.


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/instrumentation , Pattern Recognition, Automated/methods , Photography/instrumentation , Photography/methods , Calibration , Image Enhancement/methods , Imaging, Three-Dimensional/methods , Reproducibility of Results , Sensitivity and Specificity