1.
Article in English | MEDLINE | ID: mdl-39019048

ABSTRACT

Precise segmentation of skin cancer lesions at different stages supports early detection and further treatment. Because pixel-level annotations for this task are very costly to obtain, segmentation from less expensive image-level labels has become an active research direction. Most weakly supervised segmentation methods based on image-level labels use class activation mapping (CAM). A common consequence of this approach is incomplete foreground segmentation, under-segmentation, or false negatives. In addition, in weakly supervised segmentation of skin cancer lesions, ulcers, redness, and swelling may appear near the segmented areas of individual disease categories. This co-occurrence problem reduces, to some extent, the model's accuracy in segmenting class-related tissue boundaries. Both issues stem from the loosely constrained nature of image-level labels, which penalize the entire image space. Providing pixel-level constraints for weak supervision with image-level labels is therefore the key to improving performance. To address these problems, this paper proposes a joint unsupervised constraint-assisted weakly supervised segmentation model (UCA-WSS). The weakly supervised part of the model adopts a dual-branch adversarial erasure mechanism to generate higher-quality CAMs. The unsupervised part uses contrastive learning and clustering algorithms to generate foreground labels and fine boundary labels that assist segmentation and, through unsupervised constraints, address the co-occurrence problems common in weakly supervised skin cancer lesion segmentation. The proposed model is compared with other related models on several public dermatology datasets. Experimental results show that it performs better on the skin cancer segmentation task than other weakly supervised segmentation models, showing the potential of combining unsupervised constraints with weakly supervised segmentation.

2.
J Imaging Inform Med ; 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39020158

ABSTRACT

Wound management requires measuring wound parameters such as shape and area. However, computerized wound analysis is hampered by inexact segmentation of wound images due to limited or inaccurate labels. Commonly, the source domain provides abundant labeled data while the target domain provides only limited labels. To overcome this, we propose a novel approach that combines self-training and mixup augmentation. In the first stage, the neural network is trained on the source domain and generates weak labels on the target domain via self-training. In the second stage, the generated labels are mixed up with labels from the source domain to retrain the network and enhance generalization across diverse datasets. The approach was evaluated on the DFUC 2022, FUSeg, and RMIT datasets, demonstrating substantial improvements in segmentation accuracy and robustness across different data distributions. Specifically, in single-domain experiments, the Dice score was 0.711 on the DFUC 2022 dataset and 0.859 on the FUSeg dataset. For domain adaptation, when these datasets were used as target datasets, the Dice scores were 0.714 for DFUC 2022 and 0.561 for FUSeg.
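
The two stages described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: images and masks are flattened to plain lists, and the function names (`pseudo_label`, `mixup`, `dice_score`), the 0.5 threshold, and the Beta parameter `alpha` are assumptions made for illustration only.

```python
import random

def pseudo_label(probabilities, threshold=0.5):
    """Stage 1 (assumed form): turn predicted foreground probabilities
    on a target-domain image into a hard weak label."""
    return [1.0 if p >= threshold else 0.0 for p in probabilities]

def mixup(sample_a, sample_b, lam):
    """Stage 2: convex combination of two samples (images or masks)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(sample_a, sample_b)]

def sample_lambda(alpha=0.4):
    """Mixing coefficient drawn from Beta(alpha, alpha), as in standard mixup.
    The value of alpha is a hypothetical choice."""
    return random.betavariate(alpha, alpha)

def dice_score(pred, truth, eps=1e-7):
    """Dice coefficient between two binary (or soft) masks."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    return (2.0 * intersection + eps) / (sum(pred) + sum(truth) + eps)

# Toy example: a labeled source mask and a pseudo-labeled target mask.
source_mask = [1.0, 1.0, 0.0, 0.0]
target_mask = pseudo_label([0.9, 0.2, 0.7, 0.1])
mixed_mask = mixup(source_mask, target_mask, lam=0.75)
```

In a real pipeline the same `lam` would be applied to the paired images and masks, so that the blended label stays consistent with the blended input.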

3.
Front Bioeng Biotechnol ; 12: 1414605, 2024.
Article in English | MEDLINE | ID: mdl-38994123

ABSTRACT

In recent years, deep convolutional neural network-based segmentation methods have achieved state-of-the-art performance for many medical analysis tasks. However, most of these approaches rely on optimizing the U-Net structure or adding new functional modules, which overlooks the complementation and fusion of coarse-grained and fine-grained semantic information. To address these issues, we propose a 2D medical image segmentation framework called Progressive Learning Network (PL-Net), which comprises Internal Progressive Learning (IPL) and External Progressive Learning (EPL). PL-Net offers the following advantages: 1) IPL divides feature extraction into two steps, allowing for the mixing of different size receptive fields and capturing semantic information from coarse to fine granularity without introducing additional parameters; 2) EPL divides the training process into two stages to optimize parameters and facilitate the fusion of coarse-grained information in the first stage and fine-grained information in the second stage. We conducted comprehensive evaluations of our proposed method on five medical image segmentation datasets, and the experimental results demonstrate that PL-Net achieves competitive segmentation performance. It is worth noting that PL-Net does not introduce any additional learnable parameters compared to other U-Net variants.

4.
Sensors (Basel) ; 24(13)2024 Jun 30.
Article in English | MEDLINE | ID: mdl-39001046

ABSTRACT

Retinal vessel segmentation is crucial for diagnosing and monitoring various eye diseases such as diabetic retinopathy, glaucoma, and hypertension. In this study, we examine how sharpness-aware minimization (SAM) can improve the generalization performance of RF-UNet, a novel model for retinal vessel segmentation. We focused our experiments on the digital retinal images for vessel extraction (DRIVE) dataset, which is a benchmark for retinal vessel segmentation, and our test results show that adding SAM to the training procedure leads to notable improvements. Compared to the non-SAM model (training loss of 0.45709 and validation loss of 0.40266), the SAM-trained RF-UNet model achieved a significant reduction in both training loss (0.094225) and validation loss (0.08053). Furthermore, compared to the non-SAM model (training accuracy of 0.90169 and validation accuracy of 0.93999), the SAM-trained model demonstrated higher training accuracy (0.96225) and validation accuracy (0.96821). Additionally, the model performed better in terms of sensitivity, specificity, AUC, and F1 score, indicating improved generalization to unseen data. Our results corroborate the notion that SAM facilitates the learning of flatter minima, thereby improving generalization, and are consistent with other research highlighting the advantages of advanced optimization methods. With wider implications for other medical imaging tasks, these results imply that SAM can successfully reduce overfitting and enhance the robustness of retinal vessel segmentation models. Future research directions include verifying the model on larger and more diverse datasets and investigating its practical implementation in real-world clinical settings.


Subject(s)
Algorithms , Retinal Vessels , Humans , Retinal Vessels/diagnostic imaging , Image Processing, Computer-Assisted/methods , Diabetic Retinopathy/diagnostic imaging
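
The sharpness-aware minimization (SAM) update mentioned in the abstract above can be sketched on a toy problem. This is a generic illustration of the SAM step (ascend to a nearby worst-case point, then descend using the gradient taken there), not the RF-UNet training code; the quadratic loss, learning rate, and `rho` are hypothetical choices.

```python
import math

def toy_loss(w):
    """Hypothetical stand-in for a training loss: an elongated quadratic bowl."""
    return w[0] ** 2 + 10.0 * w[1] ** 2

def toy_grad(w):
    """Analytic gradient of toy_loss."""
    return [2.0 * w[0], 20.0 * w[1]]

def sam_step(weights, grad_fn, lr=0.05, rho=0.05):
    """One SAM update.

    1. Compute the gradient at the current weights.
    2. Move a distance rho along the normalized gradient (ascent direction).
    3. Compute the gradient at that perturbed point.
    4. Apply that gradient to the ORIGINAL weights.
    """
    grad = grad_fn(weights)
    norm = math.sqrt(sum(g * g for g in grad)) or 1.0  # avoid division by zero
    perturbed = [w + rho * g / norm for w, g in zip(weights, grad)]
    sharp_grad = grad_fn(perturbed)
    return [w - lr * g for w, g in zip(weights, sharp_grad)]

weights = [1.0, 1.0]
initial_loss = toy_loss(weights)
for _ in range(50):
    weights = sam_step(weights, toy_grad)
final_loss = toy_loss(weights)
```

Each SAM step costs two gradient evaluations instead of one, which is the usual trade-off for the flatter minima it aims to find.
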
5.
Sensors (Basel) ; 24(13)2024 Jul 03.
Article in English | MEDLINE | ID: mdl-39001109

ABSTRACT

Elbow computerized tomography (CT) scans have been widely applied for describing elbow morphology. To enhance the objectivity and efficiency of clinical diagnosis, an automatic method to recognize, segment, and reconstruct elbow joint bones is proposed in this study. The method involves three steps: initially, the humerus, ulna, and radius are automatically recognized based on the anatomical features of the elbow joint, and the prompt boxes are generated. Subsequently, elbow MedSAM is obtained through transfer learning, which accurately segments the CT images by integrating the prompt boxes. After that, hole-filling and object reclassification steps are executed to refine the mask. Finally, three-dimensional (3D) reconstruction is conducted seamlessly using the marching cubes algorithm. To validate the reliability and accuracy of the method, the results were compared to the masks labeled by senior surgeons. Quantitative evaluation of segmentation results revealed median intersection over union (IoU) values of 0.963, 0.959, and 0.950 for the humerus, ulna, and radius, respectively. Additionally, the reconstructed surface errors were measured at 1.127, 1.523, and 2.062 mm, respectively. Consequently, the automatic elbow reconstruction method demonstrates promising capabilities in clinical diagnosis, preoperative planning, and intraoperative navigation for elbow joint diseases.


Subject(s)
Algorithms , Elbow Joint , Imaging, Three-Dimensional , Tomography, X-Ray Computed , Humans , Elbow Joint/diagnostic imaging , Tomography, X-Ray Computed/methods , Imaging, Three-Dimensional/methods , Image Processing, Computer-Assisted/methods , Radius/diagnostic imaging , Ulna/diagnostic imaging , Humerus/diagnostic imaging
6.
Neural Netw ; 178: 106489, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38959598

ABSTRACT

Medical image segmentation is crucial for understanding anatomical or pathological changes, playing a key role in computer-aided diagnosis and advancing intelligent healthcare. Currently, important issues in medical image segmentation need to be addressed, particularly the problem of segmenting blurry edge regions and the generalizability of segmentation models. Therefore, this study focuses on different medical image segmentation tasks and the issue of blurriness. By addressing these tasks, the study significantly improves diagnostic efficiency and accuracy, contributing to the overall enhancement of healthcare outcomes. To optimize segmentation performance and leverage feature information, we propose a Neighborhood Fuzzy c-Means Multiscale Pyramid Hybrid Attention Unet (NFMPAtt-Unet) model. NFMPAtt-Unet comprises three core components: the Multiscale Dynamic Weight Feature Pyramid module (MDWFP), the Hybrid Weighted Attention mechanism (HWA), and the Neighborhood Rough Set-based Fuzzy c-Means Feature Extraction module (NFCMFE). The MDWFP dynamically adjusts weights across multiple scales, improving feature information capture. The HWA enhances the network's ability to capture and utilize crucial features, while the NFCMFE, grounded in neighborhood rough set concepts, aids in fuzzy C-means feature extraction, addressing complex structures and uncertainties in medical images, thereby enhancing adaptability. Experimental results demonstrate that NFMPAtt-Unet outperforms state-of-the-art models, highlighting its efficacy in medical image segmentation.

7.
Front Oncol ; 14: 1396887, 2024.
Article in English | MEDLINE | ID: mdl-38962265

ABSTRACT

Pathological images are considered the gold standard for clinical diagnosis and cancer grading. Automatic segmentation of pathological images is a fundamental and crucial step in constructing powerful computer-aided diagnostic systems. Medical microscopic hyperspectral pathological images can provide additional spectral information, further distinguishing different chemical components of biological tissues, offering new insights for accurate segmentation of pathological images. However, hyperspectral pathological images have higher resolution and larger area, and their annotation requires more time and clinical experience. The lack of precise annotations limits the progress of research in pathological image segmentation. In this paper, we propose a novel semi-supervised segmentation method for microscopic hyperspectral pathological images based on multi-consistency learning (MCL-Net), which combines consistency regularization methods with pseudo-labeling techniques. The MCL-Net architecture employs a shared encoder and multiple independent decoders. We introduce a Soft-Hard pseudo-label generation strategy in MCL-Net to generate pseudo-labels that are closer to real labels for pathological images. Furthermore, we propose a multi-consistency learning strategy, treating pseudo-labels generated by the Soft-Hard process as real labels, by promoting consistency between predictions of different decoders, enabling the model to learn more sample features. Extensive experiments in this paper demonstrate the effectiveness of the proposed method, providing new insights for the segmentation of microscopic hyperspectral tissue pathology images.

8.
Cognit Comput ; 16(4): 2063-2077, 2024.
Article in English | MEDLINE | ID: mdl-38974012

ABSTRACT

Automated segmentation of multiple organs and tumors from 3D medical images such as magnetic resonance imaging (MRI) and computed tomography (CT) scans using deep learning methods can aid in diagnosing and treating cancer. However, organs often overlap and are complexly connected, characterized by extensive anatomical variation and low contrast. In addition, the diversity of tumor shape, location, and appearance, coupled with the dominance of background voxels, makes accurate 3D medical image segmentation difficult. In this paper, a novel 3D large-kernel (LK) attention module is proposed to address these problems to achieve accurate multi-organ segmentation and tumor segmentation. The advantages of biologically inspired self-attention and convolution are combined in the proposed LK attention module, including local contextual information, long-range dependencies, and channel adaptation. The module also decomposes the LK convolution to optimize the computational cost and can be easily incorporated into CNNs such as U-Net. Comprehensive ablation experiments demonstrated the feasibility of convolutional decomposition and explored the most efficient and effective network design. Among them, the best Mid-type 3D LK attention-based U-Net network was evaluated on CT-ORG and BraTS 2020 datasets, achieving state-of-the-art segmentation performance when compared to avant-garde CNN and Transformer-based methods for medical image segmentation. The performance improvement due to the proposed 3D LK attention module was statistically validated.

9.
Med Biol Eng Comput ; 2024 Jul 20.
Article in English | MEDLINE | ID: mdl-39031327

ABSTRACT

Data-driven medical image segmentation networks require expert annotations, which are hard to obtain. Non-expert annotations are often used instead, but these can be inaccurate (referred to as "noisy labels"), misleading the network's training and causing a decline in segmentation performance. In this study, we focus on improving the segmentation performance of neural networks trained with noisy annotations. Specifically, we propose a two-stage framework named "G-T correcting," consisting of a "G" stage for recognizing noisy labels and a "T" stage for correcting noisy labels. In the "G" stage, a positive feedback method is proposed to automatically recognize noisy samples, using a Gaussian mixture model to classify clean and noisy labels through the per-sample loss histogram. In the "T" stage, a confident correcting strategy and an early learning strategy are adopted to allow the segmentation network to receive productive guidance from noisy labels. Experiments on simulated and real-world noisy labels show that this method can achieve over 90% accuracy in recognizing noisy labels and improve the network's Dice coefficient to 91%. The results demonstrate that the proposed method can enhance the segmentation performance of the network when trained with noisy labels, indicating good clinical application prospects.
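
The idea behind the "G" stage in the abstract above, separating clean from noisy labels by fitting a two-component Gaussian mixture to per-sample losses, can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: the EM routine, the initialization at the minimum and maximum loss, and the example loss values are all hypothetical, and the positive feedback mechanism is not modeled.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_two_component_gmm(losses, iterations=50, min_var=1e-6):
    """Fit a 1D two-component Gaussian mixture with plain EM."""
    n = len(losses)
    overall_mean = sum(losses) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in losses) / n, min_var)
    # Deterministic initialization: one component at each extreme of the losses.
    means = [min(losses), max(losses)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each loss value.
        resp = []
        for x in losses:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses)) / nk
            variances[k] = max(var, min_var)
    return weights, means, variances

def noisy_probability(x, weights, means, variances):
    """Posterior probability that a loss value belongs to the high-loss component."""
    high = 0 if means[0] > means[1] else 1
    p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
    total = sum(p)
    return p[high] / total if total > 0 else 0.0

# Hypothetical per-sample losses: five low (likely clean), three high (likely noisy).
losses = [0.10, 0.12, 0.08, 0.11, 0.09, 0.95, 1.05, 1.00]
params = fit_two_component_gmm(losses)
flags = [noisy_probability(x, *params) > 0.5 for x in losses]
```

Samples flagged as noisy would then be handed to a correction stage rather than discarded.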

10.
Phys Med Biol ; 69(14)2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38959911

ABSTRACT

Objective. In recent years, convolutional neural networks, which typically focus on extracting spatial domain features, have shown limitations in learning global contextual information. However, frequency domain can offer a global perspective that spatial domain methods often struggle to capture. To address this limitation, we propose FreqSNet, which leverages both frequency and spatial features for medical image segmentation. Approach. To begin, we propose a frequency-space representation aggregation block (FSRAB) to replace conventional convolutions. FSRAB contains three frequency domain branches to capture global frequency information along different axial combinations, while a convolutional branch is designed to interact information across channels in local spatial features. Secondly, the multiplex expansion attention block extracts long-range dependency information using dilated convolutional blocks, while suppressing irrelevant information via attention mechanisms. Finally, the introduced Feature Integration Block enhances feature representation by integrating semantic features that fuse spatial and channel positional information. Main results. We validated our method on 5 public datasets, including BUSI, CVC-ClinicDB, CVC-ColonDB, ISIC-2018, and Luna16. On these datasets, our method achieved Intersection over Union (IoU) scores of 75.46%, 87.81%, 79.08%, 84.04%, and 96.99%, and Hausdorff distance values of 22.22 mm, 13.20 mm, 13.08 mm, 13.51 mm, and 5.22 mm, respectively. Compared to other state-of-the-art methods, our FreqSNet achieves better segmentation results. Significance. Our method can effectively combine frequency domain information with spatial domain features, enhancing the segmentation performance and generalization capability in medical image segmentation tasks.


Subject(s)
Image Processing, Computer-Assisted , Image Processing, Computer-Assisted/methods , Humans , Neural Networks, Computer
11.
Quant Imaging Med Surg ; 14(7): 5176-5204, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39022282

ABSTRACT

Background and Objective: Cervical cancer clinical target volume (CTV) outlining and organs at risk segmentation are crucial steps in the diagnosis and treatment of cervical cancer. Manual segmentation is inefficient and subjective, leading to the development of automated or semi-automated methods. However, limited image quality, organ motion, and individual differences still pose significant challenges. Despite numerous studies on medical image segmentation, a comprehensive review within the field is lacking. The purpose of this paper is to comprehensively review the literature on different types of medical image segmentation regarding cervical cancer and discuss the current level and challenges in the segmentation process. Methods: As of May 31, 2023, we conducted a comprehensive literature search on Google Scholar, PubMed, and Web of Science using the following term combinations: "cervical cancer images", "segmentation", and "outline". The included studies focused on the segmentation of cervical cancer utilizing computed tomography (CT), magnetic resonance (MR), and positron emission tomography (PET) images, with screening for eligibility by two independent investigators. Key Content and Findings: This paper reviews representative papers on CTV and organs at risk segmentation in cervical cancer and classifies the methods into three categories based on image modalities. The traditional and deep learning methods are comprehensively described. The similarities and differences of related methods are analyzed, and their advantages and limitations are discussed in depth. We have also included experimental results using our private datasets to verify the performance of selected methods. The results indicate that the residual module and squeeze-and-excitation blocks module can significantly improve the performance of the model. Additionally, the segmentation method based on improved level set demonstrates better segmentation accuracy than other methods.
Conclusions: The paper provides valuable insights into the current state-of-the-art in cervical cancer CTV outlining and organs at risk segmentation, highlighting areas for future research.

12.
Comput Med Imaging Graph ; 116: 102406, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38824715

ABSTRACT

Lack of data is one of the biggest hurdles for rare disease research using deep learning. Due to the lack of rare-disease images and annotations, training a robust network for automatic rare-disease image segmentation is very challenging. To address this challenge, few-shot domain adaptation (FSDA) has emerged as a practical research direction, aiming to leverage a limited number of annotated images from a target domain to facilitate adaptation of models trained on other large datasets in a source domain. In this paper, we present a novel prototype-based feature mapping network (PFMNet) designed for FSDA in medical image segmentation. PFMNet adopts an encoder-decoder structure for segmentation, with the prototype-based feature mapping (PFM) module positioned at the bottom of the encoder-decoder structure. The PFM module transforms high-level features from the target domain into the source domain-like features that are more easily comprehensible by the decoder. By leveraging these source domain-like features, the decoder can effectively perform few-shot segmentation in the target domain and generate accurate segmentation masks. We evaluate the performance of PFMNet through experiments on three typical yet challenging few-shot medical image segmentation tasks: cross-center optic disc/cup segmentation, cross-center polyp segmentation, and cross-modality cardiac structure segmentation. We consider four different settings: 5-shot, 10-shot, 15-shot, and 20-shot. The experimental results substantiate the efficacy of our proposed approach for few-shot domain adaptation in medical image segmentation.

13.
Front Bioeng Biotechnol ; 12: 1398237, 2024.
Article in English | MEDLINE | ID: mdl-38827037

ABSTRACT

Accurate medical image segmentation is critical for disease quantification and treatment evaluation. While traditional U-Net architectures and their transformer-integrated variants excel in automated segmentation tasks, they lack the ability to harness the image's intrinsic position and channel features. Existing models also struggle with parameter efficiency and computational complexity, often due to the extensive use of Transformers. Research employing Dual Attention mechanisms of position and channel has not been specifically optimized for the high-detail demands of medical images. To address these issues, this study proposes a novel deep medical image segmentation framework, called DA-TransUNet, aiming to integrate the Transformer and dual attention block (DA-Block) into the traditional U-shaped architecture. DA-TransUNet, tailored for the high-detail requirements of medical images, optimizes the intermittent channels of Dual Attention (DA) and employs DA in each skip-connection to effectively filter out irrelevant information. This integration significantly enhances the model's capability to extract features, thereby improving the performance of medical image segmentation. DA-TransUNet is validated in medical image segmentation tasks, consistently outperforming state-of-the-art techniques across 5 datasets. In summary, DA-TransUNet has made significant strides in medical image segmentation, offering new insights into existing techniques. It strengthens model performance from the perspective of image features, thereby advancing the development of high-precision automated medical image diagnosis. The codes and parameters of our model will be publicly available at https://github.com/SUN-1024/DA-TransUnet.

14.
Med Image Anal ; 97: 103241, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38897032

ABSTRACT

Although the U-shape networks have achieved remarkable performances in many medical image segmentation tasks, they rarely model the sequential relationship of hierarchical layers. This weakness makes it difficult for the current layer to effectively utilize the historical information of the previous layer, leading to unsatisfactory segmentation results for lesions with blurred boundaries and irregular shapes. To solve this problem, we propose a novel dual-path U-Net, dubbed I2U-Net. The newly proposed network encourages historical information re-usage and re-exploration through rich information interaction among the dual paths, allowing deep layers to learn more comprehensive features that contain both low-level detail description and high-level semantic abstraction. Specifically, we introduce a multi-functional information interaction module (MFII), which can model cross-path, cross-layer, and cross-path-and-layer information interactions via a unified design, making the proposed I2U-Net behave similarly to an unfolded RNN and enjoying its advantage of modeling time sequence information. Besides, to further selectively and sensitively integrate the information extracted by the encoder of the dual paths, we propose a holistic information fusion and augmentation module (HIFA), which can efficiently bridge the encoder and the decoder. Extensive experiments on four challenging tasks, including skin lesion, polyp, brain tumor, and abdominal multi-organ segmentation, consistently show that the proposed I2U-Net has superior performance and generalization ability over other state-of-the-art methods. The code is available at https://github.com/duweidai/I2U-Net.

15.
Comput Biol Med ; 178: 108780, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38909447

ABSTRACT

Colon adenocarcinoma (COAD) is a type of colon cancer with a high mortality rate. Its early symptoms are not obvious, and its late stage is accompanied by various complications that seriously endanger patients' lives. To assist in the early diagnosis of COAD and improve the detection efficiency of COAD, this paper proposes a multi-level threshold image segmentation (MIS) method based on an enhanced particle swarm algorithm for segmenting COAD images. Firstly, this paper proposes a multi-strategy fusion particle swarm optimization algorithm (DRPSO) with a replacement mechanism. The non-linear inertia weight and sine-cosine learning factors in DRPSO help balance the exploration and exploitation phases of the algorithm. The population reorganization strategy incorporating MGO enhances population diversity and effectively prevents the algorithm from stagnating prematurely. The mutation-based final replacement mechanism enhances the algorithm's ability to escape local optima and helps the algorithm to obtain highly accurate solutions. In addition, comparison experiments on the CEC2020 and CEC2022 test sets show that DRPSO outperforms other state-of-the-art algorithms in terms of convergence accuracy and speed. Secondly, by combining the non-local mean 2D histogram and 2D Renyi entropy, this paper proposes a DRPSO-based MIS method, which is successfully applied to the segmentation of COAD pathology images. The results of segmentation experiments show that the above method obtains relatively higher quality segmented images with superior performance metrics: PSNR = 23.556, SSIM = 0.825, and FSIM = 0.922. In conclusion, the MIS method based on the DRPSO algorithm shows great potential in assisting COAD diagnosis and in pathology image segmentation.
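
Entropy-based multi-level thresholding, the objective that the abstract above optimizes, can be illustrated with a deliberately simplified sketch. Two substitutions are assumed here and should not be read as the paper's method: a 1D gray-level histogram replaces the non-local-mean 2D histogram, and brute-force search replaces DRPSO. The entropy order `alpha`, the function names, and the toy histogram are hypothetical.

```python
import math
from bisect import bisect_right
from itertools import combinations

def renyi_entropy(probs, alpha=0.5):
    """Renyi entropy of order alpha for one class, after renormalizing its bins."""
    total = sum(probs)
    if total <= 0:
        return 0.0
    s = sum((p / total) ** alpha for p in probs if p > 0)
    return math.log(s) / (1.0 - alpha)

def objective(hist, thresholds, alpha=0.5):
    """Sum of per-class Renyi entropies; class k covers gray levels
    [edges[k], edges[k + 1])."""
    n = sum(hist)
    probs = [h / n for h in hist]
    edges = [0] + list(thresholds) + [len(hist)]
    return sum(renyi_entropy(probs[lo:hi], alpha) for lo, hi in zip(edges, edges[1:]))

def best_thresholds(hist, levels=2, alpha=0.5):
    """Exhaustive search over threshold tuples (a swapped-in stand-in for DRPSO;
    only practical for small histograms and few thresholds)."""
    candidates = combinations(range(1, len(hist)), levels)
    return max(candidates, key=lambda t: objective(hist, t, alpha))

def segment(pixels, thresholds):
    """Map each gray level to the index of the class it falls in."""
    return [bisect_right(thresholds, p) for p in pixels]

# Toy 8-level histogram with three loose groups of gray levels.
hist = [30, 25, 2, 20, 22, 1, 15, 18]
thresholds = best_thresholds(hist, levels=2)
labels = segment([0, 3, 5, 6, 7], (3, 6))
```

A metaheuristic such as particle swarm optimization becomes relevant when the number of gray levels and thresholds makes exhaustive search too expensive.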

16.
Comput Biol Med ; 178: 108784, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38941900

ABSTRACT

Characteristics such as low contrast and significant organ shape variations are often exhibited in medical images. The improvement of segmentation performance in medical imaging is limited by the generally insufficient adaptive capabilities of existing attention mechanisms. An efficient Channel Prior Convolutional Attention (CPCA) method is proposed in this paper, supporting the dynamic distribution of attention weights in both channel and spatial dimensions. Spatial relationships are effectively extracted while preserving the channel prior by employing a multi-scale depth-wise convolutional module. The ability to focus on informative channels and important regions is possessed by CPCA. A segmentation network called CPCANet for medical image segmentation is proposed based on CPCA. CPCANet is validated on two publicly available datasets. Improved segmentation performance is achieved by CPCANet while requiring fewer computational resources through comparisons with state-of-the-art algorithms. Our code is publicly available at https://github.com/Cuthbert-Huang/CPCANet.

17.
Article in English | MEDLINE | ID: mdl-38922721

ABSTRACT

OBJECTIVE: Segmentation, the partitioning of patient imaging into multiple, labeled segments, has several potential clinical benefits but when performed manually is tedious and resource intensive. Automated deep learning (DL)-based segmentation methods can streamline the process. The objective of this study was to evaluate a label-efficient DL pipeline that requires only a small number of annotated scans for semantic segmentation of sinonasal structures in CT scans. STUDY DESIGN: Retrospective cohort study. SETTING: Academic institution. METHODS: Forty CT scans were used in this study including 16 scans in which the nasal septum (NS), inferior turbinate (IT), maxillary sinus (MS), and optic nerve (ON) were manually annotated using an open-source software. A label-efficient DL framework was used to train jointly on a few manually labeled scans and the remaining unlabeled scans. Quantitative analysis was then performed to obtain the number of annotated scans needed to achieve submillimeter average surface distances (ASDs). RESULTS: Our findings reveal that merely four labeled scans are necessary to achieve median submillimeter ASDs for large sinonasal structures-NS (0.96 mm), IT (0.74 mm), and MS (0.43 mm), whereas eight scans are required for smaller structures-ON (0.80 mm). CONCLUSION: We have evaluated a label-efficient pipeline for segmentation of sinonasal structures. Empirical results demonstrate that automated DL methods can achieve submillimeter accuracy using a small number of labeled CT scans. Our pipeline has the potential to improve pre-operative planning workflows, robotic- and image-guidance navigation systems, computer-assisted diagnosis, and the construction of statistical shape models to quantify population variations. LEVEL OF EVIDENCE: N/A.

18.
Comput Biol Med ; 178: 108773, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38925090

ABSTRACT

Extracting global and local feature information from retinal blood vessel images remains challenging because of fuzzy edge features, noise, difficulty in distinguishing lesion regions from background information, and loss of low-level feature information, all of which lead to insufficient feature extraction. To better solve these problems and fully extract the global and local feature information of the image, we propose a novel transscale cascade layered transformer network for enhanced retinal blood vessel segmentation. The network consists of an encoder and a decoder connected by a transscale transformer cascade module. The encoder consists of a local-global transscale transformer module, a multi-head layered transscale adaptive embedding module, and a local context (LCNet) module. The transscale transformer cascade module learns local and global feature information and multi-scale dependent features from the first three layers of the encoder, fuses the hierarchical feature information from the skip connection block and the channel-token interaction fusion block, respectively, and inputs it to the decoder. The decoder includes a decoding module for the local context network and a transscale position transformer module; the local and global feature information extracted by the encoder, with key position information retained, is passed to the decoding module and the position embedding transformer module to recover and output prediction results consistent with the input feature information. In addition, we propose an improved cross-entropy loss function based on the difference between the deterministic observation samples and the prediction results with the deviation distance. Combined with the proposed dual-transformer network model, it is validated on the DRIVE and STARE datasets, where the segmentation accuracies are 97.26% and 97.87%, respectively. Compared with other state-of-the-art networks, the results show that the proposed network model has a significant competitive advantage in improving the segmentation performance of retinal blood vessel images.

19.
Diagnostics (Basel) ; 14(12)2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38928629

ABSTRACT

Deep learning has attained state-of-the-art results in general image segmentation problems; however, it requires a substantial number of annotated images to achieve the desired outcomes. In the medical field, the availability of annotated images is often limited. To address this challenge, few-shot learning techniques have been successfully adapted to rapidly generalize to new tasks with only a few samples, leveraging prior knowledge. In this paper, we employ a gradient-based method known as Model-Agnostic Meta-Learning (MAML) for medical image segmentation. MAML is a meta-learning algorithm that quickly adapts to new tasks by updating a model's parameters based on a limited set of training samples. Additionally, we use an enhanced 3D U-Net as the foundational network for our models. The enhanced 3D U-Net is a convolutional neural network specifically designed for medical image segmentation. We evaluate our approach on the TotalSegmentator dataset, considering a few annotated images for four tasks: liver, spleen, right kidney, and left kidney. The results demonstrate that our approach facilitates rapid adaptation to new tasks using only a few annotated images. In 10-shot settings, our approach achieved mean Dice coefficients of 93.70%, 85.98%, 81.20%, and 89.58% for liver, spleen, right kidney, and left kidney segmentation, respectively. In five-shot settings, the approach attained mean Dice coefficients of 90.27%, 83.89%, 77.53%, and 87.01% for liver, spleen, right kidney, and left kidney segmentation, respectively. Finally, we assess the effectiveness of our proposed approach on a dataset collected from a local hospital. Employing five-shot settings, we achieve mean Dice coefficients of 90.62%, 79.86%, 79.87%, and 78.21% for liver, spleen, right kidney, and left kidney segmentation, respectively.
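
The inner-loop and outer-loop structure of MAML described above can be illustrated on a toy problem. The sketch below is not the authors' 3D U-Net pipeline: it uses a one-parameter regression model y = w * x, a first-order approximation of the MAML meta-gradient (second-order terms are dropped), and hypothetical names, tasks, and learning rates chosen only for illustration.

```python
import random

def mse_grad(w, xs, ys):
    """Gradient of the mean squared error for the one-parameter model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def adapt(w, xs, ys, inner_lr=0.1, steps=1):
    """Inner loop: a few gradient steps on one task's small support set."""
    for _ in range(steps):
        w = w - inner_lr * mse_grad(w, xs, ys)
    return w

def sample_task(rng, slope, k):
    """Draw k (x, y) pairs from the task y = slope * x."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(k)]
    return xs, [slope * x for x in xs]

def fomaml(slopes, meta_steps=500, k=5, inner_lr=0.1, outer_lr=0.05, seed=0):
    """Outer loop: move the shared initialisation so that one inner step helps on every task."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(meta_steps):
        meta_grad = 0.0
        for slope in slopes:
            support_x, support_y = sample_task(rng, slope, k)
            query_x, query_y = sample_task(rng, slope, k)
            w_task = adapt(w, support_x, support_y, inner_lr)
            # First-order approximation: query gradient evaluated at the adapted weight.
            meta_grad += mse_grad(w_task, query_x, query_y)
        w -= outer_lr * meta_grad / len(slopes)
    return w

# Meta-train on three tasks, then adapt to an unseen task from five samples ("five-shot").
w_meta = fomaml([1.0, 2.0, 3.0])
new_x, new_y = sample_task(random.Random(1), 2.5, 5)
w_new = adapt(w_meta, new_x, new_y, steps=5)
```

The meta-learned initialisation settles near the middle of the training tasks, and a handful of gradient steps on five samples moves it toward the unseen task. This is the behaviour that few-shot segmentation with MAML relies on, at a much larger scale.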

20.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 41(3): 511-519, 2024 Jun 25.
Article in Chinese | MEDLINE | ID: mdl-38932537

ABSTRACT

In response to the issues of single-scale information loss and large model parameter size during the sampling process in U-Net and its variants for medical image segmentation, this paper proposes a multi-scale medical image segmentation method based on pixel encoding and spatial attention. Firstly, by redesigning the input strategy of the Transformer structure, a pixel encoding module is introduced to enable the model to extract global semantic information from multi-scale image features, obtaining richer feature information. Additionally, deformable convolutions are incorporated into the Transformer module to accelerate convergence and improve module performance. Secondly, a spatial attention module with residual connections is introduced to allow the model to focus on the foreground information of the fused feature maps. Finally, guided by ablation experiments, the network is made more lightweight, which enhances segmentation accuracy and accelerates model convergence. The proposed algorithm achieves satisfactory results on the Synapse dataset, an official public dataset for multi-organ segmentation provided by the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), with Dice similarity coefficient (DSC) and 95% Hausdorff distance (HD95) scores of 77.65 and 18.34, respectively. The experimental results demonstrate that the proposed algorithm can enhance multi-organ segmentation performance, potentially filling the gap in multi-scale medical image segmentation algorithms, and providing assistance for professional physicians in diagnosis.
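
The two metrics reported above can be sketched for 2D binary masks as follows. This is a simplified illustration under stated assumptions, not the evaluation code used in the paper: distances are measured between all foreground pixels rather than extracted surfaces, HD95 is taken as the larger of the two directed 95th-percentile distances (other conventions exist), and the percentile uses the nearest-rank rule. All function names are hypothetical.

```python
import math

def foreground(mask):
    """Coordinates (row, col) of all non-zero pixels in a 2D mask."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]

def dice(pred, target):
    """Dice similarity coefficient: 2 * |A & B| / (|A| + |B|)."""
    p, t = set(foreground(pred)), set(foreground(target))
    if not p and not t:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * len(p & t) / (len(p) + len(t))

def percentile(values, q):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]

def directed_distances(a, b):
    """For each point in a, the Euclidean distance to its nearest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]

def hd95(pred, target):
    """95th-percentile Hausdorff distance in pixel units (simplified, see lead-in)."""
    p, t = foreground(pred), foreground(target)
    if not p or not t:
        raise ValueError("HD95 is undefined when a mask has no foreground")
    return max(percentile(directed_distances(p, t), 95),
               percentile(directed_distances(t, p), 95))

# Toy example: two 2x2 blocks that overlap in one column.
pred = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0]]
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]

dsc = dice(pred, truth)
hd = hd95(pred, truth)
```

Here the masks share two of their four foreground pixels each, giving a Dice coefficient of 0.5, and the farthest unmatched pixels are one pixel away from the other mask, giving an HD95 of 1.0. DSC rewards overlap (higher is better), while HD95 penalises boundary outliers (lower is better).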


Subject(s)
Algorithms , Image Processing, Computer-Assisted , Humans , Image Processing, Computer-Assisted/methods , Diagnostic Imaging/methods , Neural Networks, Computer