Results 1 - 20 of 22
1.
Med Image Anal ; 97: 103249, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38963972

ABSTRACT

Image registration is an essential step in many medical image analysis tasks. Traditional methods for image registration are primarily optimization-driven, finding the optimal deformations that maximize the similarity between two images. Recent learning-based methods, trained to directly predict transformations between two images, run much faster but suffer from performance deficiencies due to domain shift. Here we present a new neural-network-based image registration framework, called NIR (Neural Image Registration), which is optimization-based but uses deep neural networks to model deformations between image pairs. NIR represents the transformation between two images with a continuous function implemented via neural fields, which takes a 3D coordinate as input and outputs the corresponding deformation vector. NIR provides two ways of generating a deformation field: directly outputting a displacement vector field for general deformable registration, or outputting a velocity vector field and integrating it over time to derive the deformation field for diffeomorphic image registration. The optimal registration is discovered by updating the parameters of the neural field via stochastic mini-batch gradient descent. We describe several design choices that facilitate model optimization, including coordinate encoding, sinusoidal activation, coordinate sampling, and intensity sampling. NIR is evaluated on two 3D MR brain scan datasets, demonstrating highly competitive performance in terms of both registration accuracy and regularity. Compared to traditional optimization-based methods, our approach achieves better results in shorter computation times, and it exhibits superior performance on a cross-dataset registration task compared with pre-trained learning-based methods.
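
To make the neural-field formulation concrete, here is a minimal sketch in PyTorch of a sinusoidally activated coordinate network that maps a 3D coordinate to a displacement vector and is optimized on random coordinate mini-batches. The layer sizes, the SIREN-style frequency factor, and the loss shown are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal neural-field deformation model in the spirit of NIR (assumed details).
import torch
import torch.nn as nn

class SineLayer(nn.Module):
    """Linear layer followed by a sinusoidal activation (SIREN-style)."""
    def __init__(self, in_dim, out_dim, omega_0=30.0):
        super().__init__()
        self.omega_0 = omega_0
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))

class DeformationField(nn.Module):
    """Maps a 3D coordinate to a 3D displacement vector."""
    def __init__(self, hidden=256, depth=4):
        super().__init__()
        layers = [SineLayer(3, hidden)]
        layers += [SineLayer(hidden, hidden) for _ in range(depth - 1)]
        self.body = nn.Sequential(*layers)
        self.head = nn.Linear(hidden, 3)  # displacement vector

    def forward(self, coords):            # coords: (N, 3) in [-1, 1]
        return self.head(self.body(coords))

# One optimization step on a stochastic mini-batch of coordinates:
field = DeformationField()
opt = torch.optim.Adam(field.parameters(), lr=1e-4)
coords = torch.rand(4096, 3) * 2 - 1      # stochastic coordinate sampling
disp = field(coords)                      # (4096, 3) displacements
loss = disp.pow(2).mean()                 # placeholder for similarity + regularization terms
loss.backward(); opt.step()
```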

2.
Nat Commun ; 15(1): 654, 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38253604

ABSTRACT

Medical image segmentation is a critical component in clinical practice, facilitating accurate diagnosis, treatment planning, and disease monitoring. However, existing methods, often tailored to specific modalities or disease types, lack generalizability across the diverse spectrum of medical image segmentation tasks. Here we present MedSAM, a foundation model designed to bridge this gap by enabling universal medical image segmentation. The model is developed on a large-scale medical image dataset with 1,570,263 image-mask pairs, covering 10 imaging modalities and over 30 cancer types. We conduct a comprehensive evaluation on 86 internal validation tasks and 60 external validation tasks, demonstrating better accuracy and robustness than modality-wise specialist models. By delivering accurate and efficient segmentation across a wide spectrum of tasks, MedSAM holds significant potential to expedite the evolution of diagnostic tools and the personalization of treatment plans.


Subjects
Diagnostic Imaging; Image Processing, Computer-Assisted; Image Processing, Computer-Assisted/methods
3.
NPJ Digit Med ; 6(1): 226, 2023 Dec 02.
Article in English | MEDLINE | ID: mdl-38042919

ABSTRACT

Deep neural networks have been integrated into the whole clinical decision-making procedure, where they can improve the efficiency of diagnosis and alleviate the heavy workload of physicians. Since most neural networks are supervised, their performance depends heavily on the volume and quality of available labels. However, few such labels exist for rare diseases (e.g., new pandemics). Here we report a medical multimodal large language model (Med-MLLM) for radiograph representation learning, which can learn broad medical knowledge (e.g., image understanding, text semantics, and clinical phenotypes) from unlabelled data. As a result, when encountering a rare disease, our Med-MLLM can be rapidly deployed and easily adapted to it with limited labels. Furthermore, our model supports medical data across the visual modality (e.g., chest X-ray and CT) and the textual modality (e.g., medical reports and free-text clinical notes); therefore, it can be used for clinical tasks that involve both visual and textual data. We demonstrate the effectiveness of our Med-MLLM by showing how it would have performed by "replaying" the COVID-19 pandemic. In the retrospective setting, we test the model on early COVID-19 datasets; in the prospective setting, we test it on the new COVID-19 Omicron variant. The experiments are conducted on 1) three kinds of input data; 2) three kinds of downstream tasks, including disease reporting, diagnosis, and prognosis; 3) five COVID-19 datasets; and 4) three different languages, including English, Chinese, and Spanish. All experiments show that our model can provide accurate and robust COVID-19 decision support with little labelled data.

4.
Phys Med Biol ; 68(20)2023 10 02.
Article in English | MEDLINE | ID: mdl-37696272

ABSTRACT

Objective. Metal artifact reduction (MAR) has been a key issue in CT imaging. Recently, MAR methods based on deep learning have achieved promising results. However, when deploying deep learning-based MAR in real-world clinical scenarios, two prominent challenges arise. One limitation is the lack of paired training data in real applications, which limits the practicality of supervised methods. Another is that image-domain methods, which suit a wider range of application scenarios, fall short in performance, while better-performing end-to-end approaches are applicable only to fan-beam CT because of their large memory consumption. Approach. We propose a novel image-domain MAR method based on a generative adversarial network with variable constraints (MARGANVAC) to improve MAR performance. The proposed variable constraint is a time-varying cost function that relaxes the fidelity constraint at the beginning of training and gradually strengthens it as training progresses. To better deploy our image-domain supervised method in practical scenarios, we develop a transfer method that mimics real metal artifacts by first extracting real metal traces and then adding them to artifact-free images to generate paired training data. Main results. The effectiveness of the proposed method is validated in simulated fan-beam experiments and real cone-beam experiments. All quantitative and qualitative results demonstrate that the proposed method achieves superior performance compared with the competing methods. Significance. The MARGANVAC model is an image-domain model that can be conveniently applied to various scenarios such as fan-beam and cone-beam CT, while its performance is on par with cutting-edge dual-domain MAR approaches. In addition, the proposed metal artifact transfer method can easily generate paired data with realistic artifact features, which can be better used for model training in real scenarios.
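
The "variable constraint" is essentially a fidelity term whose weight grows during training. A hedged sketch of such a time-varying generator loss follows; the ramp shape and weight bounds are assumptions, not the paper's exact schedule.

```python
# Time-varying fidelity weight in a GAN generator loss (illustrative assumptions).
import torch
import torch.nn.functional as F

def fidelity_weight(epoch, total_epochs, w_min=0.1, w_max=10.0):
    """Linearly ramp the fidelity weight from w_min to w_max over training."""
    t = min(epoch / max(total_epochs - 1, 1), 1.0)
    return w_min + t * (w_max - w_min)

def generator_loss(fake, target, d_fake_logits, epoch, total_epochs):
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))  # fool the discriminator
    fid = F.l1_loss(fake, target)                       # fidelity to the artifact-free image
    return adv + fidelity_weight(epoch, total_epochs) * fid
```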


Subjects
Artifacts; Tomography, X-Ray Computed; Tomography, X-Ray Computed/methods; Cone-Beam Computed Tomography; Algorithms; Metals; Image Processing, Computer-Assisted/methods; Phantoms, Imaging
5.
Inf Process Med Imaging ; 13939: 641-653, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37409056

ABSTRACT

Contrastive learning has shown great promise for addressing annotation scarcity in medical image segmentation. Existing approaches typically assume a balanced class distribution for both labeled and unlabeled medical images. However, medical image data in reality are commonly imbalanced (i.e., multi-class label imbalance), which naturally yields blurry contours and often mislabels rare objects. Moreover, it remains unclear whether all negative samples are equally negative. In this work, we present ACTION, an Anatomical-aware ConTrastive dIstillatiON framework, for semi-supervised medical image segmentation. Specifically, we first develop an iterative contrastive distillation algorithm that softly labels the negatives rather than applying binary supervision between positive and negative pairs. We also capture more semantically similar features from the randomly chosen negative set, compared to the positives, to enforce the diversity of the sampled data. Second, we raise a more important question: can we really handle imbalanced samples to yield better performance? Hence, the key innovation in ACTION is to learn global semantic relationships across the entire dataset and local anatomical features among neighbouring pixels with minimal additional memory footprint. During training, we introduce anatomical contrast by actively sampling a sparse set of hard negative pixels, which yields smoother segmentation boundaries and more accurate predictions. Extensive experiments across two benchmark datasets and different unlabeled settings show that ACTION significantly outperforms the current state-of-the-art semi-supervised methods.
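
A minimal sketch of the soft-labeling idea: instead of binary positive/negative supervision, the student's similarity distribution over a sampled negative set is matched to a teacher's. Everything here (names, temperatures, the KL form) is an illustrative assumption in the spirit of the abstract, not the authors' exact loss.

```python
# Contrastive distillation with soft negative labels (illustrative assumptions).
import torch
import torch.nn.functional as F

def soft_contrastive_distillation(student_feat, teacher_feat, negatives,
                                  tau_s=0.1, tau_t=0.05):
    """student_feat/teacher_feat: (B, D); negatives: (K, D). Returns a KL loss."""
    s = F.normalize(student_feat, dim=1)
    t = F.normalize(teacher_feat, dim=1)
    n = F.normalize(negatives, dim=1)
    logits_s = s @ n.t() / tau_s           # (B, K) student similarities
    logits_t = t @ n.t() / tau_t           # (B, K) teacher similarities (soft labels)
    return F.kl_div(F.log_softmax(logits_s, dim=1),
                    F.softmax(logits_t.detach(), dim=1),
                    reduction="batchmean")
```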

6.
Invest Radiol ; 58(12): 882-893, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37493348

ABSTRACT

OBJECTIVES: The aim of this study was to evaluate the disease severity of COVID-19 patients by comparing a multiclass lung lesion model to a single-class lung lesion model and to radiologists' assessments in chest computed tomography scans. MATERIALS AND METHODS: The proposed method, AssessNet-19, was developed in 2 stages in this retrospective study. Four types of COVID-19-induced tissue lesions were manually segmented to train a 2D U-Net network for a multiclass segmentation task, followed by extensive extraction of radiomic features from the lung lesions. LASSO regression was used to reduce the feature set, and the XGBoost algorithm was trained to classify disease severity based on the World Health Organization Clinical Progression Scale. The model was evaluated using 2 multicenter cohorts: a development cohort of 145 COVID-19-positive patients from 3 centers to train and test the severity prediction model using manually segmented lung lesions, and an evaluation set of 90 COVID-19-positive patients collected from 2 centers to evaluate AssessNet-19 in a fully automated fashion. RESULTS: AssessNet-19 achieved an F1-score of 0.76 ± 0.02 for severity classification in the evaluation set, which was superior to the 3 expert thoracic radiologists (F1 = 0.63 ± 0.02) and the single-class lesion segmentation model (F1 = 0.64 ± 0.02). In addition, AssessNet-19 automated multiclass lesion segmentation obtained a mean Dice score of 0.70 for ground-glass opacity, 0.68 for consolidation, 0.65 for pleural effusion, and 0.30 for band-like structures compared with ground truth. Moreover, it achieved high agreement with radiologists for quantifying disease extent, with Cohen κ of 0.94, 0.92, and 0.95. CONCLUSIONS: A novel artificial intelligence multiclass radiomics model including 4 lung lesions to assess disease severity based on the World Health Organization Clinical Progression Scale more accurately determines the severity of COVID-19 patients than a single-class model and radiologists' assessment.
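
The severity stage described above (radiomic features → LASSO reduction → XGBoost classification) maps naturally onto scikit-learn and xgboost. The sketch below uses placeholder data and hyperparameters; only the pipeline structure follows the abstract.

```python
# LASSO feature reduction followed by XGBoost severity classification (sketch).
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier

X = np.random.rand(145, 400)      # radiomic features per patient (placeholder)
y = np.random.randint(0, 4, 145)  # binned WHO Clinical Progression Scale (placeholder)

model = Pipeline([
    # LASSO (fit as a regressor on the ordinal severity labels) prunes features
    ("select", SelectFromModel(Lasso(alpha=0.01, max_iter=10000))),
    ("clf", XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05)),
])
model.fit(X, y)
print(model.predict(X[:5]))
```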


Subjects
COVID-19; Humans; Artificial Intelligence; Retrospective Studies; Lung/diagnostic imaging; Tomography, X-Ray Computed/methods; Disease Progression
7.
Med Phys ; 50(5): 2759-2774, 2023 May.
Article in English | MEDLINE | ID: mdl-36718546

ABSTRACT

BACKGROUND: Many dedicated cone-beam CT (CBCT) systems have irregular scanning trajectories. Compared with standard CBCT calibration, accurate calibration for CBCT systems with irregular trajectories is a more complex task, since the geometric parameters for each scanning view are variable. Most existing calibration methods assume that the intrinsic geometric relationship of the fiducials in the phantom is precisely known, and rarely delve into whether the phantom accuracy is adapted to the calibration model. PURPOSE: A high-precision phantom and a highly robust calibration model are interdependent and mutually supportive, and both are important for calibration accuracy, especially for high-resolution CBCT. Therefore, we propose a calibration scheme that considers both accurate phantom measurement and robust geometric calibration. METHODS: Our proposed scheme consists of two parts: (1) introducing a measurement model to acquire the accurate intrinsic geometric relationship of the fiducials in the phantom; and (2) developing a highly noise-robust nonconvex model-based calibration method. The measurement model in the first part is achieved by extending our previous high-precision geometric calibration model for CBCT with circular trajectories. In the second part, a novel iterative method with optimization constraints based on a back-projection model is developed to solve the geometric parameters of each view. RESULTS: The simulations and real experiments show that the measurement errors of the fiducial ball bearings (BBs) are within the subpixel level. With the geometric relationship of the BBs obtained by our measurement method, the classic calibration method can achieve good calibration with far fewer BBs. All metrics obtained in simulated experiments, as well as in real experiments on Micro CT systems with resolutions of 9 and 4.5 µm, show that the proposed calibration method has higher calibration accuracy than the competing classic method. Notably, although our measurement model proves to be very accurate, the classic calibration method built on it achieves good results only when the resolution of the measurement system is close to that of the system being calibrated; our calibration scheme, by contrast, enables high-accuracy calibration even when the resolution of the system being calibrated is twice that of the measurement system. CONCLUSIONS: The proposed combined geometric calibration scheme does not rely on a phantom with an intricate pattern of fiducials, so it is applicable to high-resolution Micro CT. The two parts of the scheme, the "measurement model" and the "calibration model," prove to be highly accurate, and their combination can effectively improve calibration accuracy, especially in some extreme cases.


Subjects
Algorithms; Image Processing, Computer-Assisted; Humans; Calibration; Image Processing, Computer-Assisted/methods; Cone-Beam Computed Tomography/methods; X-Ray Microtomography; Phantoms, Imaging
8.
Med Image Comput Comput Assist Interv ; 14229: 710-719, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38174207

ABSTRACT

Head motion correction is an essential component of brain PET imaging, where even motion of small magnitude can greatly degrade image quality and introduce artifacts. Building upon previous work, we propose a new head motion correction framework that takes fast reconstructions as input. The main characteristics of the proposed method are: (i) the adoption of a high-resolution short-frame fast reconstruction workflow; (ii) the development of a novel encoder for extracting PET data representations; and (iii) the implementation of data augmentation techniques. Ablation studies are conducted to assess the individual contribution of each of these design choices. Furthermore, multi-subject studies are conducted on an 18F-FPEB dataset, and the method's performance is qualitatively and quantitatively evaluated through a MOLAR reconstruction study and the corresponding brain region-of-interest (ROI) standardized uptake value (SUV) evaluation. We also compare our method with a conventional intensity-based registration method. Our results demonstrate that the proposed method outperforms the other methods on all subjects and can accurately estimate motion for subjects outside the training set. All code is publicly available on GitHub: https://github.com/OnofreyLab/dl-hmc_fast_recon_miccai2023.

9.
Mach Learn Clin Neuroimaging (2023) ; 14312: 34-45, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38174216

ABSTRACT

Head movement during long scan sessions degrades the quality of reconstruction in positron emission tomography (PET) and introduces artifacts, which limits clinical diagnosis and treatment. Recent deep learning-based motion correction work utilized raw PET list-mode data and hardware motion tracking (HMT) to learn head motion in a supervised manner. However, motion prediction results were not robust for testing subjects outside the training data domain. In this paper, we integrate a cross-attention mechanism into the supervised deep learning network to improve motion correction across test subjects. Specifically, cross-attention learns the spatial correspondence between the reference images and moving images to explicitly focus the model on the information most relevant to motion correction: the head region. We validate our approach on brain PET data from two different scanners: HRRT without time of flight (ToF) and mCT with ToF. Compared with traditional and deep learning benchmarks, our network improved the performance of motion correction by 58% and 26% in translation and rotation, respectively, in multi-subject testing on HRRT studies. In mCT studies, our approach improved performance by 66% and 64% for translation and rotation, respectively. Our results demonstrate that cross-attention has the potential to improve the quality of brain PET image reconstruction without depending on HMT. All code will be released on GitHub: https://github.com/OnofreyLab/dl_hmc_attention_mlcn2023.
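
A minimal sketch of the cross-attention idea, assuming PyTorch: features from the reference frame form the queries and features from the moving frame form the keys and values, so the attention weights encode their spatial correspondence. Token shapes and the rigid-motion head are illustrative assumptions, not the paper's architecture.

```python
# Cross-attention between reference and moving PET frame features (sketch).
import torch
import torch.nn as nn

class CrossAttentionMotionHead(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.to_motion = nn.Linear(dim, 6)  # 3 translation + 3 rotation parameters

    def forward(self, ref_tokens, mov_tokens):
        # ref_tokens, mov_tokens: (B, N, dim) patch features of the two frames
        fused, _ = self.attn(query=ref_tokens, key=mov_tokens, value=mov_tokens)
        return self.to_motion(fused.mean(dim=1))  # one rigid-motion estimate per pair

head = CrossAttentionMotionHead()
motion = head(torch.rand(2, 64, 256), torch.rand(2, 64, 256))  # (2, 6)
```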

10.
Adv Neural Inf Process Syst ; 36: 9984-10021, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38813114

ABSTRACT

For medical image segmentation, contrastive learning is the dominant practice for improving the quality of visual representations by contrasting semantically similar and dissimilar pairs of samples. This is enabled by the observation that, without access to ground-truth labels, negative examples with truly dissimilar anatomical features, if sampled, can significantly improve performance. In reality, however, these samples may come from similar anatomical regions, and models may struggle to distinguish the minority tail-class samples, making the tail classes more prone to misclassification; both issues typically lead to model collapse. In this paper, we propose ARCO, a semi-supervised contrastive learning (CL) framework with stratified group theory for medical image segmentation. In particular, we first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks with extremely limited labels. Furthermore, we theoretically prove that these sampling techniques are universal in variance reduction. Finally, we experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings, and our methods consistently outperform state-of-the-art semi-supervised methods. Additionally, we augment the CL frameworks with these sampling techniques and demonstrate significant gains over previous methods. We believe our work is an important step towards semi-supervised medical image segmentation, as it quantifies the limitation of current self-supervision objectives for such challenging safety-critical tasks.
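
One concrete reading of variance-reduced sampling is stratified (per-class) pixel sampling for the contrastive loss: an equal quota is drawn from each pseudo-label stratum instead of sampling uniformly over all pixels. This is a hedged sketch of that interpretation, with the quota size as an assumption.

```python
# Stratified per-class pixel sampling for a contrastive loss (sketch).
import torch

def stratified_pixel_sample(pseudo_labels, per_class=256):
    """pseudo_labels: (H*W,) int tensor. Returns indices balanced across classes."""
    picks = []
    for c in pseudo_labels.unique():
        idx = (pseudo_labels == c).nonzero(as_tuple=True)[0]
        take = min(per_class, idx.numel())
        picks.append(idx[torch.randperm(idx.numel())[:take]])  # sample within the stratum
    return torch.cat(picks)

labels = torch.randint(0, 4, (128 * 128,))
indices = stratified_pixel_sample(labels)   # balanced pixel indices for the loss
```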

11.
Med Image Comput Comput Assist Interv ; 14223: 194-205, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38813456

ABSTRACT

Medical data often exhibit long-tail distributions with heavy class imbalance, which naturally leads to difficulty in classifying the minority classes (i.e., boundary regions or rare objects). Recent work has significantly improved semi-supervised medical image segmentation in long-tailed scenarios by equipping it with unsupervised contrastive criteria. However, it remains unclear how well these methods perform on the labeled portion of the data, where the class distribution is also highly imbalanced. In this work, we present ACTION++, an improved contrastive learning framework with adaptive anatomical contrast for semi-supervised medical segmentation. Specifically, we propose an adaptive supervised contrastive loss, in which we first compute the optimal locations of class centers uniformly distributed on the embedding space (i.e., offline), and then perform online contrastive matching training by encouraging different class features to adaptively match these distinct and uniformly distributed class centers. Moreover, we argue that blindly adopting a constant temperature τ in the contrastive loss on long-tailed medical data is not optimal, and propose a dynamic τ driven by a simple cosine schedule to yield better separation between majority and minority classes. Empirically, we evaluate ACTION++ on the ACDC and LA benchmarks and show that it achieves state-of-the-art performance in two semi-supervised settings. Theoretically, we analyze the performance of adaptive anatomical contrast and confirm its superiority in label efficiency.
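
The dynamic temperature amounts to a few lines: τ follows a simple cosine schedule between two bounds instead of staying constant. The bounds below are illustrative; the paper's exact values may differ.

```python
# Cosine-scheduled contrastive temperature (illustrative bounds).
import math

def dynamic_tau(step, total_steps, tau_min=0.07, tau_max=1.0):
    """Cosine-anneal the contrastive temperature tau over training."""
    cos = 0.5 * (1 + math.cos(math.pi * step / max(total_steps, 1)))
    return tau_min + (tau_max - tau_min) * cos

# Usage inside a training loop: logits = similarities / dynamic_tau(step, total)
print([round(dynamic_tau(s, 100), 3) for s in (0, 50, 100)])  # 1.0 -> ~0.535 -> 0.07
```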

12.
Med Image Comput Comput Assist Interv ; 14222: 561-571, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38840671

ABSTRACT

Integrating high-level semantically correlated content and low-level anatomical features is of central importance in medical image segmentation. Towards this end, recent deep learning-based medical segmentation methods have shown great promise in better modeling such information. However, convolution operators for medical segmentation typically operate on regular grids, which inherently blur high-frequency regions, i.e., boundary regions. In this work, we propose MORSE, a generic implicit neural rendering framework designed at an anatomical level to assist learning in medical image segmentation. Our method is motivated by the fact that implicit neural representations have been shown to be more effective at fitting complex signals and solving computer graphics problems than discrete grid-based representations. The core of our approach is to formulate medical image segmentation as a rendering problem in an end-to-end manner. Specifically, we continuously align the coarse segmentation prediction with the ambiguous coordinate-based point representations and aggregate these features to adaptively refine the boundary region. To optimize multi-scale pixel-level features in parallel, we leverage the idea of Mixture-of-Experts (MoE) to design and train MORSE with a stochastic gating mechanism. Our experiments demonstrate that MORSE works well with different medical segmentation backbones, consistently achieving competitive performance improvements in both 2D and 3D supervised medical segmentation methods. We also theoretically analyze the superiority of MORSE.

13.
IEEE Trans Med Imaging ; 41(9): 2228-2237, 2022 09.
Article in English | MEDLINE | ID: mdl-35320095

ABSTRACT

Automated segmentation in medical image analysis is a challenging task that requires a large amount of manually labeled data. However, most existing learning-based approaches suffer from limited manually annotated medical data, which poses a major practical problem for accurate and robust medical image segmentation. In addition, most existing semi-supervised approaches are usually less robust than their supervised counterparts, and they lack explicit modeling of geometric structure and semantic information, both of which limit segmentation accuracy. In this work, we present SimCVD, a simple contrastive distillation framework that significantly advances state-of-the-art voxel-wise representation learning. We first describe an unsupervised training strategy that takes two views of an input volume and predicts their signed distance maps of object boundaries under a contrastive objective, with only two independent dropouts as masks. This simple approach works surprisingly well, performing at the same level as previous fully supervised methods with far less labeled data. We hypothesize that dropout can be viewed as a minimal form of data augmentation that makes the network robust to representation collapse. Then, we propose to perform structural distillation by distilling pair-wise similarities. We evaluate SimCVD on two popular datasets: the Left Atrial Segmentation Challenge (LA) dataset and the NIH pancreas CT dataset. On the LA dataset, at two labeled ratios (i.e., 20% and 10%), SimCVD achieves average Dice scores of 90.85% and 89.03%, respectively, improvements of 0.91% and 2.22% over the previous best results. Our method can be trained in an end-to-end fashion, showing promise as a general framework for downstream tasks such as medical image synthesis, enhancement, and registration.
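
A hedged sketch of the "two independent dropouts as masks" idea: the same volume is passed twice through a network kept in training mode, so each pass draws an independent dropout mask, and the two outputs are pulled together. The toy network and consistency term are placeholders, not the SimCVD architecture.

```python
# Two dropout-masked views of the same volume as minimal augmentation (sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

net = nn.Sequential(nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
                    nn.Dropout3d(0.1),                # dropout acts as minimal augmentation
                    nn.Conv3d(16, 1, 3, padding=1))   # stands in for a signed-distance head

volume = torch.rand(2, 1, 32, 64, 64)
net.train()                                  # keep dropout stochastic for both passes
sdm_a, sdm_b = net(volume), net(volume)      # two views from independent dropout masks
consistency = 1 - F.cosine_similarity(sdm_a.flatten(1), sdm_b.flatten(1)).mean()
```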


Subjects
Distillation; Image Processing, Computer-Assisted; Image Processing, Computer-Assisted/methods; Supervised Machine Learning; Tomography, X-Ray Computed/methods
14.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 9255-9268, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34855588

ABSTRACT

Training a supervised video captioning model requires coupled video-caption pairs. However, for many target languages, sufficient paired data are not available. To this end, we introduce the unpaired video captioning task, which aims to train models without coupled video-caption pairs in the target language. A natural way to solve this task is to employ a two-step pipeline system: first use a video-to-pivot captioning model to generate captions in a pivot language, and then use a pivot-to-target translation model to translate the pivot captions into the target language. However, in such a pipeline system, 1) visual information cannot reach the translation model, yielding visually irrelevant target captions; and 2) errors in the generated pivot captions propagate to the translation model, resulting in disfluent target captions. To address these problems, we propose the Unpaired Video Captioning with Visual Injection system (UVC-VI). UVC-VI first introduces the Visual Injection Module (VIM), which aligns the source visual and target language domains to inject the source visual information into the target language domain. Meanwhile, VIM directly connects the encoder of the video-to-pivot model to the decoder of the pivot-to-target model, allowing end-to-end inference that completely skips the generation of pivot captions. To enhance the cross-modality injection of VIM, UVC-VI further introduces a pluggable video encoder, i.e., the Multimodal Collaborative Encoder (MCE). Experiments show that UVC-VI outperforms pipeline systems and exceeds several supervised systems. Furthermore, equipping existing supervised systems with our MCE achieves 4% and 7% relative margins in CIDEr scores over current state-of-the-art models on the benchmark MSVD and MSR-VTT datasets, respectively.

15.
Article in English | MEDLINE | ID: mdl-37415747

ABSTRACT

Many medical datasets have recently been created for medical image segmentation tasks, and it is natural to ask whether we can use them to sequentially train a single model that (1) performs better on all these datasets and (2) generalizes well and transfers better to unknown target site domains. Prior works have pursued this goal by jointly training one model on multi-site datasets, which achieves competitive performance on average, but such methods rely on the assumption that all training data are available, limiting their effectiveness in practical deployment. In this paper, we propose a novel multi-site segmentation framework called incremental-transfer learning (ITL), which learns a model from multi-site datasets in an end-to-end sequential fashion. Specifically, "incremental" refers to training on sequentially arriving datasets, and "transfer" is achieved by leveraging useful information from the linear combination of embedding features on each dataset. In our ITL framework, we train a network consisting of a site-agnostic encoder with pretrained weights and at most two segmentation decoder heads, and we design a novel site-level incremental loss to generalize well to the target domain. We also show, for the first time, that our ITL training scheme alleviates the challenging catastrophic forgetting problem in incremental learning. We conduct experiments on five challenging benchmark datasets to validate the effectiveness of our incremental-transfer learning approach. Our approach makes minimal assumptions about computational resources and domain-specific expertise, and hence constitutes a strong starting point for multi-site medical image segmentation.
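
A toy sketch of the sequential training scheme, assuming PyTorch: sites are visited one at a time, a single shared module stands in for the site-agnostic encoder, and at most two decoder heads are kept. The modules, data, and loss are placeholders; the site-level incremental loss is omitted.

```python
# Sequential (incremental) multi-site training loop (toy placeholders throughout).
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Conv2d(1, 16, 3, padding=1)       # stands in for the site-agnostic encoder
heads = nn.ModuleList([nn.Conv2d(16, 2, 1),    # at most two segmentation decoder heads
                       nn.Conv2d(16, 2, 1)])
opt = torch.optim.Adam(list(encoder.parameters()) + list(heads.parameters()), lr=1e-4)

site_loaders = [[(torch.rand(4, 1, 64, 64), torch.randint(0, 2, (4, 64, 64)))]
                for _ in range(3)]             # three toy "sites", visited in sequence

for site in site_loaders:                      # datasets arrive one site at a time
    for image, mask in site:
        feats = encoder(image)                 # shared features across all sites
        loss = F.cross_entropy(heads[-1](feats), mask)  # current head; incremental loss omitted
        opt.zero_grad(); loss.backward(); opt.step()
```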

16.
Med Image Comput Comput Assist Interv ; 13434: 639-652, 2022 Sep.
Article in English | MEDLINE | ID: mdl-37465615

ABSTRACT

Contrastive learning (CL) aims to learn useful representations without relying on expert annotations in the context of medical image segmentation. Existing approaches mainly contrast a single positive vector (i.e., an augmentation of the same image) against a set of negatives within the entire remainder of the batch, simply mapping all input features onto the same constant vector. Despite impressive empirical performance, these methods have the following shortcomings: (1) it remains a formidable challenge to prevent collapse to trivial solutions; and (2) we argue that not all voxels within the same image are equally positive, since dissimilar anatomical structures exist within the same image. In this work, we present a novel Contrastive Voxel-wise Representation Learning (CVRL) method to effectively learn low-level and high-level features by capturing 3D spatial context and rich anatomical information along both the feature and batch dimensions. Specifically, we first introduce a novel CL strategy to promote feature diversity across the 3D representation dimensions. We train the framework through bi-level contrastive optimization (i.e., low-level and high-level) on 3D images. Experiments on two benchmark datasets and different labeled settings demonstrate the superiority of our proposed framework. More importantly, we also prove that our method inherits the hardness-aware property of standard CL approaches. Code will be available soon.

17.
Adv Neural Inf Process Syst ; 35: 29582-29596, 2022 Dec.
Article in English | MEDLINE | ID: mdl-37533756

ABSTRACT

Transformers have made remarkable progress in modeling long-range dependencies within the medical image analysis domain. However, current transformer-based models suffer from several disadvantages: (1) existing methods fail to capture the important features of images due to naive tokenization schemes; (2) the models suffer from information loss because they consider only single-scale feature representations; and (3) the segmentation label maps generated by the models are not accurate enough without considering rich semantic context and anatomical texture. In this work, we present CASTformer, a novel type of adversarial transformer for 2D medical image segmentation. First, we take advantage of a pyramid structure to construct multi-scale representations and handle multi-scale variations. We then design a novel class-aware transformer module to better learn the discriminative regions of objects with semantic structure. Lastly, we utilize an adversarial training strategy that boosts segmentation accuracy and correspondingly allows a transformer-based discriminator to capture high-level semantically correlated content and low-level anatomical features. Our experiments demonstrate that CASTformer dramatically outperforms previous state-of-the-art transformer-based approaches on three benchmarks, obtaining 2.54%-5.88% absolute improvements in Dice over previous models. Further qualitative experiments provide a more detailed picture of the model's inner workings, shed light on the challenges of improved transparency, and demonstrate that transfer learning can greatly improve performance and reduce the size of medical image datasets required for training, making CASTformer a strong starting point for downstream medical image analysis tasks.

18.
PLoS Comput Biol ; 16(9): e1008193, 2020 09.
Article in English | MEDLINE | ID: mdl-32925919

ABSTRACT

Segmenting cell nuclei within microscopy images is a ubiquitous task in biological research and clinical applications. Unfortunately, segmenting low-contrast overlapping objects that may be tightly packed is a major bottleneck in standard deep learning-based models. We report a Nuclear Segmentation Tool (NuSeT) based on deep learning that accurately segments nuclei across multiple types of fluorescence imaging data. Using a hybrid network consisting of U-Net and Region Proposal Networks (RPN), followed by a watershed step, we have achieved superior performance in detecting and delineating nuclear boundaries in 2D and 3D images of varying complexities. By using foreground normalization and additional training on synthetic images containing non-cellular artifacts, NuSeT improves nuclear detection and reduces false positives. NuSeT addresses common challenges in nuclear segmentation such as variability in nuclear signal and shape, limited training sample size, and sample preparation artifacts. Compared to other segmentation models, NuSeT consistently fares better in generating accurate segmentation masks and assigning boundaries for touching nuclei.
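
The watershed step after the network typically follows the standard scikit-image recipe: threshold the foreground probability, seed markers at local maxima of the distance transform, and flood. The probability map and thresholds below are placeholders, not NuSeT's exact post-processing.

```python
# Watershed splitting of touching nuclei from a foreground probability map (sketch).
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

prob = np.random.rand(256, 256)                  # network foreground probability (placeholder)
fg = prob > 0.5                                  # binarized foreground
distance = ndi.distance_transform_edt(fg)        # distance to background
peaks = peak_local_max(distance, min_distance=5, labels=fg)  # one seed per nucleus
markers = np.zeros_like(prob, dtype=int)
markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
labels = watershed(-distance, markers, mask=fg)  # one integer label per nucleus
```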


Subjects
Cell Nucleus/physiology; Deep Learning; Image Processing, Computer-Assisted/methods; Microscopy/methods; Algorithms; Artifacts; Computational Biology; HeLa Cells; Humans; Software
19.
Article in English | MEDLINE | ID: mdl-32201450

ABSTRACT

Osteoporosis is a common age-related disease characterized by reduced bone density and increased fracture risk. The microstructural quality of trabecular bone (Tb), commonly found at axial skeletal sites and at the ends of long bones, is an important determinant of bone strength and fracture risk. Emerging high-resolution CT scanners enable in vivo measurement of Tb microstructure at peripheral sites. However, the resolution dependence of microstructural measures and wide resolution discrepancies among CT scanners, together with rapid upgrades in technology, warrant data harmonization in CT-based cross-sectional and longitudinal bone studies. This paper presents a deep learning-based method for high-resolution reconstruction of Tb microstructure from low-resolution CT scans using GAN-CIRCLE. A network was developed and evaluated using post-registered ankle CT scans of nineteen volunteers acquired on both low- and high-resolution CT scanners. 9,000 matching pairs of low- and high-resolution patches of size 64×64 were randomly harvested from ten volunteers for training and validation, and another 5,000 matching pairs of patches from nine other volunteers were used for evaluation. Quantitative comparison shows that predicted high-resolution scans have a significantly improved structural similarity index (p < 0.01) with true high-resolution scans, as compared with the same metric for low-resolution data. Different Tb microstructural measures, such as thickness, spacing, and network area density, were also computed from low-resolution and predicted high-resolution images and compared with values derived from true high-resolution scans. Thickness and network area measures from predicted images showed higher agreement with true high-resolution CT-derived values (CCC = [0.95, 0.91]) than the same measures from low-resolution images (CCC = [0.72, 0.88]).
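
The structural-similarity comparison can be reproduced with scikit-image; the sketch below uses placeholder 64×64 patches in place of the study's data.

```python
# SSIM of predicted high-resolution patches vs. true high-resolution patches (sketch).
import numpy as np
from skimage.metrics import structural_similarity

true_hr = np.random.rand(64, 64)   # true high-resolution patch (placeholder)
pred_hr = np.random.rand(64, 64)   # network-predicted high-resolution patch (placeholder)
low_res = np.random.rand(64, 64)   # low-resolution patch (placeholder)

ssim_pred = structural_similarity(true_hr, pred_hr, data_range=1.0)
ssim_low = structural_similarity(true_hr, low_res, data_range=1.0)
print(f"SSIM predicted: {ssim_pred:.3f} vs low-res baseline: {ssim_low:.3f}")
```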

20.
IEEE Trans Med Imaging ; 39(1): 188-203, 2020 01.
Article in English | MEDLINE | ID: mdl-31217097

ABSTRACT

In this paper, we present a semi-supervised deep learning approach to accurately recover high-resolution (HR) CT images from low-resolution (LR) counterparts. Specifically, with a generative adversarial network (GAN) as the building block, we enforce cycle-consistency in terms of the Wasserstein distance to establish a nonlinear end-to-end mapping from noisy LR input images to denoised and deblurred HR outputs. We also include joint constraints in the loss function to facilitate structural preservation. In this process, we incorporate deep convolutional neural networks (CNNs), residual learning, and network-in-network techniques for feature extraction and restoration. In contrast to the current trend of increasing network depth and complexity to boost imaging performance, we apply a parallel 1×1 CNN to compress the output of the hidden layer and optimize the number of layers and the number of filters for each convolutional layer. The quantitative and qualitative evaluation results demonstrate that our proposed model is accurate, efficient, and robust for super-resolution (SR) image restoration from noisy LR input images. In particular, we validate our composite SR networks on three large-scale CT datasets and obtain promising results compared with other state-of-the-art methods.
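
The parallel 1×1 CNN compression mentioned above amounts to a channel-mixing convolution with no spatial extent; a minimal sketch, with illustrative channel counts, follows.

```python
# 1x1 convolution compressing hidden-layer feature maps (illustrative channel counts).
import torch
import torch.nn as nn

compress = nn.Sequential(
    nn.Conv2d(256, 64, kernel_size=1),  # mixes channels only; no spatial footprint
    nn.ReLU(inplace=True),
)
hidden = torch.rand(1, 256, 128, 128)   # hidden-layer feature maps (placeholder)
out = compress(hidden)                  # (1, 64, 128, 128): 4x fewer channels
```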


Subjects
Deep Learning; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Tomography, X-Ray Computed/methods; Abdomen/diagnostic imaging; Aged; Aged, 80 and over; Female; Humans; Male; Tibia/diagnostic imaging