Results 1 - 9 of 9
1.
IEEE Trans Med Imaging ; 43(7): 2657-2669, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38437149

ABSTRACT

The automatic generation of accurate radiology reports is of great clinical importance and has drawn growing research interest. However, it is still a challenging task due to the imbalance between normal and abnormal descriptions and the multi-sentence and multi-topic nature of radiology reports. These features result in significant challenges to generating accurate descriptions for medical images, especially the important abnormal findings. Previous methods to tackle these problems rely heavily on extra manual annotations, which are expensive to acquire. We propose a multi-grained report generation framework incorporating sentence-level image-sentence contrastive learning, which does not require any extra labeling but effectively learns knowledge from the image-report pairs. We first introduce contrastive learning as an auxiliary task for image feature learning. Different from previous contrastive methods, we exploit the multi-topic nature of imaging reports and perform fine-grained contrastive learning by extracting sentence topics and contents and contrasting between sentence contents and refined image contents guided by sentence topics. This forces the model to learn distinct abnormal image features for each specific topic. During generation, we use two decoders to first generate coarse sentence topics and then the fine-grained text of each sentence. We directly supervise the intermediate topics using sentence topics learned by our contrastive objective. This strengthens the generation constraint and enables independent fine-tuning of the decoders using reinforcement learning, which further boosts model performance. Experiments on two large-scale datasets MIMIC-CXR and IU-Xray demonstrate that our approach outperforms existing state-of-the-art methods, evaluated by both language generation metrics and clinical accuracy.


Subject(s)
Natural Language Processing , Humans , Algorithms , Machine Learning , Radiology Information Systems , Databases, Factual , Radiology/methods
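
The image-sentence contrastive idea described in this abstract can be illustrated with a generic InfoNCE-style loss. This is only a minimal sketch under stated assumptions: the function name `info_nce`, the `temperature` value, and the one-directional (image to text) form are illustrative choices, and it does not reproduce the paper's topic-guided, sentence-level objective.

```python
import math


def info_nce(image_vecs, text_vecs, temperature=0.1):
    """Generic image-to-text InfoNCE loss over a batch of paired embeddings.

    image_vecs[i] and text_vecs[i] are assumed to be a matching pair; every
    other text in the batch is treated as a negative for image i.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    images = [normalize(v) for v in image_vecs]
    texts = [normalize(v) for v in text_vecs]

    total = 0.0
    for i, img in enumerate(images):
        # Cosine similarity to every text, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching text as the target class.
        total += log_denominator - logits[i]
    return total / len(images)
```

The loss is small when each image embedding is closest to its own sentence embedding and grows when the pairs are mismatched, which is the pressure that pushes image features toward sentence-specific content.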
2.
Cell Rep Med ; 4(9): 101164, 2023 09 19.
Article in English | MEDLINE | ID: mdl-37652014

ABSTRACT

Deep learning has yielded promising results for medical image diagnosis but relies heavily on manual image annotations, which are expensive to acquire. We present Cross-DL, a cross-modality learning framework for intracranial abnormality detection and localization in head computed tomography (CT) scans by learning from free-text imaging reports. Cross-DL has a discretizer that automatically extracts discrete labels of abnormality types and locations from reports, which are utilized to train an image analyzer by a dynamic multi-instance learning approach. Benefiting from the low annotation cost and a consequent large-scale training set of 28,472 CT scans, Cross-DL achieves accurate performance, with an average area under the receiver operating characteristic curve (AUROC) of 0.956 (95% confidence interval: 0.952-0.959) in detecting 4 abnormality types in 17 regions while accurately localizing abnormalities at the voxel level. An intracranial hemorrhage classification experiment on the external dataset CQ500 achieves an AUROC of 0.928 (0.905-0.951). The model can also help review prioritization.


Subject(s)
Tomography, X-Ray Computed , Area Under Curve , ROC Curve
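
For readers unfamiliar with the AUROC figures quoted in this abstract, the metric can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below is a plain pairwise implementation; the function name `auroc` and the example labels and scores are illustrative and are not data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of model scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Three of the four positive/negative pairs are ranked correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of cases; it is chosen here for clarity rather than speed.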
3.
Patterns (N Y) ; 2(2): 100197, 2021 Feb 12.
Article in English | MEDLINE | ID: mdl-33659913

ABSTRACT

Intracranial aneurysm (IA) is an enormous threat to human health, which often results in nontraumatic subarachnoid hemorrhage or dismal prognosis. Diagnosing IAs on commonly used computed tomographic angiography (CTA) examinations remains laborious and time consuming, leading to error-prone results in clinical practice, especially for small targets. In this study, we propose a fully automatic deep-learning model for IA segmentation that can be applied to CTA images. Our model, called Global Localization-based IA Network (GLIA-Net), can incorporate the global localization prior and generates the fine-grain three-dimensional segmentation. GLIA-Net is trained and evaluated on a big internal dataset (1,338 scans from six institutions) and two external datasets. Evaluations show that our model exhibits good tolerance to different settings and achieves superior performance to other models. A clinical experiment further demonstrates the clinical utility of our technique, which helps radiologists in the diagnosis of IAs.

4.
IEEE Trans Vis Comput Graph ; 24(4): 1545-1553, 2018 04.
Article in English | MEDLINE | ID: mdl-29543172

ABSTRACT

We propose a novel 360° scene representation for converting real scenes into stereoscopic 3D virtual reality content with head-motion parallax. Our image-based scene representation enables efficient synthesis of novel views with six degrees-of-freedom (6-DoF) by fusing motion fields at two scales: (1) disparity motion fields carry implicit depth information and are robustly estimated from multiple laterally displaced auxiliary viewpoints, and (2) pairwise motion fields enable real-time flow-based blending, which improves the visual fidelity of results by minimizing ghosting and view transition artifacts. Based on our scene representation, we present an end-to-end system that captures real scenes with a robotic camera arm, processes the recorded data, and finally renders the scene in a head-mounted display in real time (more than 40 Hz). Our approach is the first to support head-motion parallax when viewing real 360° scenes. We demonstrate compelling results that illustrate the enhanced visual experience, and hence sense of immersion, achieved with our approach compared to widely used stereoscopic panoramas.


Subject(s)
Depth Perception/physiology , Head Movements/physiology , Imaging, Three-Dimensional/methods , User-Computer Interface , Virtual Reality , Humans , Video Recording
5.
IEEE Trans Vis Comput Graph ; 23(12): 2586-2598, 2017 12.
Article in English | MEDLINE | ID: mdl-28026772

ABSTRACT

This paper proposes a real-time method for 3D eye performance reconstruction using a single RGBD sensor. Combined with facial surface tracking, our method generates more pleasing facial performance with vivid eye motions. In our method, a novel scheme is proposed to estimate eyeball motions by minimizing the differences between a rendered eyeball and the recorded image. Our method considers and handles different appearances of human irises, lighting variations and highlights on images via the proposed eyeball model and the -based optimization. Robustness and real-time optimization are achieved through the novel 3D Taylor expansion-based linearization. Furthermore, we propose an online bidirectional regression method to handle occlusions and other tracking failures on either of the two eyes from the information of the opposite eye. Experiments demonstrate that our technique achieves robust and accurate eye performance reconstruction for different iris appearances, with various head/face/eye motions, and under different lighting conditions.


Subject(s)
Computer Graphics , Eye/anatomy & histology , Fixation, Ocular/physiology , Imaging, Three-Dimensional/methods , Ocular Physiological Phenomena , Face/physiology , Facial Expression , Humans
6.
IEEE Trans Image Process ; 24(2): 655-66, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25494510

ABSTRACT

Graph cut has proven to be an effective scheme for solving a wide variety of segmentation problems in the vision and graphics community. The main limitation of conventional graph-cut implementations is that they can hardly handle large images or videos because of their high computational complexity. Even though there are some parallelization solutions, they commonly suffer from low parallelism (on CPU) or slow convergence (on GPU). In this paper, we present a novel graph-cut algorithm that leverages a parallelized jump flooding technique and a heuristic push-relabel scheme to enhance the graph-cut process, namely, back-and-forth relabel, convergence detection, and block-wise push-relabel. The entire process is parallelizable on GPU, and outperforms the existing GPU-based implementations in terms of global convergence, information propagation, and performance. We design an intuitive user interface for specifying regions of interest in cases of occlusions when handling video sequences. Experiments on a variety of data sets, including images (up to 15 K × 10 K), videos (up to 2.5 K × 1.5 K × 50), and volumetric data, achieve high-quality results and a maximum 40-fold (139-fold) speedup over conventional GPU-based (CPU-based) approaches.
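
The jump flooding technique mentioned in this abstract is easiest to see on its textbook use case, an approximate nearest-seed map on a grid. The sketch below is a serial Python emulation of that generic pattern, not the paper's GPU graph-cut pipeline; the function names `jump_flood` and `squared_distance` and the grid layout are illustrative assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two (x, y) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def jump_flood(width, height, seeds):
    """Approximate nearest-seed map via jump flooding.

    Returns a grid indexed as result[y][x] holding the (x, y) seed that each
    cell ended up with. Every pass reads the previous grid and writes a new
    one, which is what makes each pass trivially parallel on a GPU.
    """
    nearest = [[None] * width for _ in range(height)]
    for x, y in seeds:
        nearest[y][x] = (x, y)

    # Start from the largest power of two below the grid size, then halve.
    step = 1
    while step * 2 < max(width, height):
        step *= 2

    while step >= 1:
        updated = [row[:] for row in nearest]
        for y in range(height):
            for x in range(width):
                best = nearest[y][x]
                # Inspect the 3 x 3 neighbourhood at the current jump distance.
                for dy in (-step, 0, step):
                    for dx in (-step, 0, step):
                        qx, qy = x + dx, y + dy
                        if not (0 <= qx < width and 0 <= qy < height):
                            continue
                        candidate = nearest[qy][qx]
                        if candidate is None:
                            continue
                        if best is None or squared_distance((x, y), candidate) < squared_distance((x, y), best):
                            best = candidate
                updated[y][x] = best
        nearest = updated
        step //= 2
    return nearest
```

Information crosses the whole grid in a logarithmic number of passes, which is the propagation property a parallel relabeling scheme can exploit. With several seeds the result is an approximation and can differ from the exact nearest-seed map in a small number of cells.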

7.
IEEE Trans Vis Comput Graph ; 20(10): 1451-60, 2014 Oct.
Article in English | MEDLINE | ID: mdl-26357391

ABSTRACT

We present a novel artistic-verisimilitude driven system for watercolor rendering of images and photos. Our system achieves realistic simulation of a set of important characteristics of watercolor paintings that have not been well implemented before. Specifically, we designed several image filters to achieve: 1) watercolor-specified color transferring; 2) saliency-based level-of-detail drawing; 3) hand tremor effect due to human neural noise; and 4) an artistically controlled wet-in-wet effect in the border regions of different wet pigments. A user study indicates that our method can produce watercolor results of artistic verisimilitude better than previous filter-based or physical-based methods. Furthermore, our algorithm is efficient and can easily be parallelized, making it suitable for interactive image watercolorization.

8.
IEEE Trans Pattern Anal Mach Intell ; 32(2): 231-41, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20075455

ABSTRACT

A wide range of applications in computer intelligence and computer graphics require computing geodesics accurately and efficiently. The fast marching method (FMM) is widely used to solve this problem; its complexity is O(N log N), where N is the total number of nodes on the manifold. A fast sweeping method (FSM) is proposed and applied to arbitrary triangular manifolds, reducing the complexity to O(N). By traversing the undirected graph, four orderings are built to produce two groups of interfering waves, which cover all directions of characteristics. The correctness of this method is proved by analyzing the coverage of characteristics. The convergence and error estimation are also presented.
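
The role of the four sweep orderings can be sketched on a regular 2-D grid, where fast sweeping solves the eikonal equation |∇u| = 1 with Gauss-Seidel updates. This is a simplified illustration on a rectangular grid, not the triangular-manifold method of the paper; the function name `fast_sweep`, the grid spacing `h`, and the fixed number of sweep rounds are assumptions made for the example.

```python
import math


def fast_sweep(n_rows, n_cols, sources, h=1.0, rounds=4):
    """First-order fast sweeping for |grad u| = 1 on a regular grid.

    sources: list of (row, col) cells where the distance is fixed to 0.
    Each round applies four Gauss-Seidel sweeps, one per diagonal direction,
    so that characteristics travelling in any direction are covered.
    """
    inf = float("inf")
    u = [[inf] * n_cols for _ in range(n_rows)]
    fixed = set(sources)
    for i, j in fixed:
        u[i][j] = 0.0

    up, down = range(n_rows), range(n_rows - 1, -1, -1)
    right, left = range(n_cols), range(n_cols - 1, -1, -1)
    orderings = [(up, right), (down, right), (down, left), (up, left)]

    for _ in range(rounds):
        for row_order, col_order in orderings:
            for i in row_order:
                for j in col_order:
                    if (i, j) in fixed:
                        continue
                    # Smallest neighbour value along each grid axis.
                    a = min(u[i - 1][j] if i > 0 else inf,
                            u[i + 1][j] if i < n_rows - 1 else inf)
                    b = min(u[i][j - 1] if j > 0 else inf,
                            u[i][j + 1] if j < n_cols - 1 else inf)
                    if a == inf and b == inf:
                        continue  # no information has reached this cell yet
                    if abs(a - b) >= h:
                        # The front arrives along a single axis.
                        candidate = min(a, b) + h
                    else:
                        # Two-sided update: solve the local quadratic.
                        candidate = 0.5 * (a + b + math.sqrt(2.0 * h * h - (a - b) ** 2))
                    u[i][j] = min(u[i][j], candidate)
    return u
```

Each sweep touches every node once, so the cost per sweep is linear in the number of nodes. On this grid version the number of rounds needed depends on the geometry, and the fixed `rounds` value here is simply chosen generously for small examples.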

9.
IEEE Trans Image Process ; 17(8): 1421-30, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18632350

ABSTRACT

The wavelet transform as an important multiresolution analysis tool has already been commonly applied to texture analysis and classification. Nevertheless, it ignores the structural information while capturing the spectral information of the texture image at different scales. In this paper, we propose a texture analysis and classification approach with the linear regression model based on the wavelet transform. This method is motivated by the observation that there exists a distinctive correlation between the sample images, belonging to the same kind of texture, at different frequency regions obtained by 2-D wavelet packet transform. Experimentally, it was observed that this correlation varies from texture to texture. The linear regression model is employed to analyze this correlation and extract texture features that characterize the samples. Therefore, our method considers not only the frequency regions but also the correlation between these regions. In contrast, the pyramid-structured wavelet transform (PSWT) and the tree-structured wavelet transform (TSWT) do not consider the correlation between different frequency regions. Experiments show that our method significantly improves the texture classification rate in comparison with the multiresolution methods, including PSWT, TSWT, the Gabor transform, and some recently proposed methods derived from these.


Subject(s)
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Linear Models , Models, Statistical , Regression Analysis , Reproducibility of Results , Sensitivity and Specificity
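
The idea of relating wavelet sub-bands through a linear regression can be sketched with a one-level Haar transform and an ordinary least-squares line. This is only an illustration under stated assumptions: the abstract describes a 2-D wavelet packet transform, whereas the sketch uses a single Haar level as a stand-in, and the names `haar_level`, `band_energy`, and `fit_line` are hypothetical helpers, not the authors' code.

```python
def haar_level(image):
    """One level of an orthonormal 2-D Haar transform.

    image: list of equal-length rows with even height and width.
    Returns a dict of coefficient lists, one entry per 2 x 2 block.
    """
    bands = {"ll": [], "lh": [], "hl": [], "hh": []}
    for r in range(0, len(image), 2):
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            bands["ll"].append((a + b + d + e) / 2.0)  # local average
            bands["lh"].append((a - b + d - e) / 2.0)  # change across columns
            bands["hl"].append((a + b - d - e) / 2.0)  # change across rows
            bands["hh"].append((a - b - d + e) / 2.0)  # diagonal change
    return bands


def band_energy(coefficients):
    """Mean absolute coefficient value, a simple sub-band energy measure."""
    return sum(abs(v) for v in coefficients) / len(coefficients)


def fit_line(xs, ys):
    """Ordinary least-squares fit ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x
```

A possible use is to compute `band_energy` for two sub-bands over several samples of one texture, fit a line between the two energy sequences with `fit_line`, and treat the fitted parameters as features describing how those frequency regions co-vary for that texture.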