Results 1 - 3 of 3
1.
Quant Imaging Med Surg ; 13(10): 6989-7001, 2023 Oct 01.
Article in English | MEDLINE | ID: mdl-37869278

ABSTRACT

Background: Surgical action recognition is an essential technology for context-aware autonomous surgery, but its accuracy is limited by the scale of available clinical datasets. Leveraging surgical videos from virtual reality (VR) simulations to develop algorithms for clinical application, a process known as domain adaptation, can reduce the cost of data acquisition and annotation and protect patient privacy. Methods: We introduced a surgical domain adaptation method based on the contrastive language-image pretraining model (SDA-CLIP) to recognize cross-domain surgical actions. Specifically, we utilized a Vision Transformer (ViT) and a Transformer to extract video and text embeddings, respectively. The text embedding served as a bridge between the VR and clinical domains. Inter- and intra-modality loss functions were employed to enhance the consistency of embeddings of the same class. We evaluated our method on the MICCAI 2020 EndoVis Challenge SurgVisDom dataset. Results: SDA-CLIP achieved a weighted F1-score of 65.9% (+18.9%) on the hard domain adaptation task (trained only with VR data) and 84.4% (+4.4%) on the soft domain adaptation task (trained with VR and clinical-like data), outperforming the first-place team of the challenge by a significant margin. Conclusions: The proposed SDA-CLIP model effectively extracts video scene information and textual semantic information, which greatly improves cross-domain surgical action recognition. The code is available at https://github.com/Lycus99/SDA-CLIP.
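A minimal PyTorch sketch of the kind of CLIP-style inter-modality contrastive loss described in this abstract, assuming the video and text embeddings have already been produced by the ViT and Transformer encoders; the function name, temperature value, and batch-wise pairing are illustrative assumptions, not taken from the SDA-CLIP code.

import torch
import torch.nn.functional as F

def inter_modality_loss(video_emb, text_emb, temperature=0.07):
    """CLIP-style contrastive loss between a batch of video embeddings and
    the text embeddings of their matching action-class prompts.
    video_emb, text_emb: tensors of shape (batch, dim) from the two encoders.
    (Hypothetical helper; not the authors' implementation.)"""
    video_emb = F.normalize(video_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # similarity matrix: entry (i, j) compares video i with text j
    logits = video_emb @ text_emb.t() / temperature
    targets = torch.arange(video_emb.size(0), device=video_emb.device)
    # symmetric cross-entropy: video-to-text and text-to-video directions
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2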

2.
Int J Comput Assist Radiol Surg ; 18(1): 149-156, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35984606

ABSTRACT

PURPOSE: CycleGAN and its variants are widely used in medical image synthesis because they can be trained on unpaired data. The most common approach is to use a generative adversarial network (GAN) to process 2D slices and then concatenate the slices into a 3D volume. However, these methods often introduce spatial inconsistencies between contiguous slices. We propose a new CycleGAN-based model that addresses this problem and achieves high-quality conversion from magnetic resonance (MR) to computed tomography (CT) images. METHODS: To preserve the spatial consistency of 3D medical images while avoiding memory-heavy 3D convolutions, we reorganize three adjacent slices into a 2.5D slice that serves as the input image. We further propose a U-Net discriminator network that perceives input objects both locally and globally to improve accuracy. The model also uses Content-Aware ReAssembly of FEatures (CARAFE) upsampling, which has a large field of view and adapts to the content instead of applying a fixed kernel to all samples. RESULTS: The mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) of the 3D images synthesized by the double U-Net CycleGAN are 74.56±10.02, 27.12±0.71, and 0.84±0.03, respectively. Our method achieves better results than state-of-the-art methods. CONCLUSION: The experimental results indicate that our method can convert MR to CT images using unpaired data and outperforms state-of-the-art methods. Compared with 3D CycleGAN, it synthesizes better 3D CT images with less computation and memory.
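A minimal sketch of the 2.5D input construction described in this abstract (three adjacent slices stacked as channels of a single 2D sample), assuming the MR volume is stored as a NumPy array of shape (depth, height, width); the function name and the edge-slice handling are illustrative assumptions, not the paper's implementation.

import numpy as np

def volume_to_25d_slices(volume):
    """Turn a 3D volume of shape (depth, H, W) into 2.5D samples of shape
    (3, H, W), where each sample stacks a slice with its two neighbours as
    channels. Boundary slices reuse the edge slice so every slice yields
    one sample. (Hypothetical helper; not the authors' code.)"""
    padded = np.concatenate([volume[:1], volume, volume[-1:]], axis=0)
    samples = [padded[i - 1:i + 2] for i in range(1, padded.shape[0] - 1)]
    return np.stack(samples)  # shape: (depth, 3, H, W)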


Subjects
Deep Learning , Humans , Image Processing, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Magnetic Resonance Imaging , Magnetic Resonance Spectroscopy
3.
Comput Biol Med ; 130: 104183, 2021 03.
Article in English | MEDLINE | ID: mdl-33360107

ABSTRACT

PURPOSE: Multiscale feature fusion is a feasible way to improve tumor segmentation accuracy. However, current multiscale networks have two common problems: (1) some networks only allow feature fusion between encoders and decoders at the same scale, which is clearly insufficient; and (2) some networks have too many dense skip connections and too much nesting between the encoding and decoding layers, so some features are lost and not enough information is learned across scales. To overcome these two problems, we propose a multiscale double-channel convolution U-Net (MDCC-Net) framework for colorectal tumor segmentation. METHODS: In the encoding layer, we design a dual-channel separation and convolution module and add residual connections, performing multiscale feature fusion on the input image and on the feature map produced by the dual-channel separation and convolution. By fusing features at different scales within the same encoding layer, the network can fully extract the detailed information of the original image and learn more tumor boundary information. RESULTS: The segmentation results show that our proposed method is highly accurate, with a Dice similarity coefficient (DSC) of 83.57%, an improvement of 9.59%, 6.42%, and 1.57% over nnU-Net, U-Net, and U-Net++, respectively. CONCLUSION: The experimental results show that our proposed method performs well in colorectal tumor segmentation and is close to expert level. The proposed method has potential clinical applicability.
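For reference, the Dice similarity coefficient reported above is the standard overlap metric between a predicted mask and a ground-truth mask; the sketch below is a generic NumPy computation of that metric, not the authors' evaluation code.

import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    DSC = 2 * |intersection| / (|pred| + |target|).
    pred, target: arrays of the same shape, nonzero where the tumor is."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)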


Subjects
Colorectal Neoplasms , Image Processing, Computer-Assisted , Colorectal Neoplasms/diagnostic imaging , Humans , Learning , Neural Networks, Computer , Tomography, X-Ray Computed