Results 1 - 20 of 1,258
1.
Med Dosim ; 2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39389801

ABSTRACT

PURPOSE: Head and neck (H&N) cancer accounts for 3% of cancer cases in the United States. Precise tumor segmentation in H&N is of utmost importance for treatment planning and administering a personalized treatment dose. We aimed to develop an automatic tumor localization and segmentation method to enhance clinical efficiency and ultimately improve treatment outcomes. APPROACH: In this study, a hybrid neural network (HNN) was developed by integrating object localization and segmentation into a unified framework. It consists of 4 stages: preprocessing, HNN training, object localization and segmentation, and postprocessing. We utilized a dataset of PET and CT images from 48 patients and designed the HNN as a YOLOv4 object detection model combined with a U-Net model for image segmentation. YOLOv4 was used to identify regions of interest (ROIs), while the U-Net was employed for precise image segmentation. In our experiments we considered 2 object detection architectures to identify possible tumor regions, namely YOLOv4 and Faster-RCNN, and compared their evaluation metrics. RESULTS: We evaluated the performance of 3 model combinations: YOLOv4 + U-Net, Faster-RCNN + U-Net, and U-Net alone. The models were evaluated based on Sensitivity, Specificity, F-Score, and Intersection over Union (IoU). YOLOv4 + U-Net achieved the best values, with a Sensitivity of 0.89, Specificity of 0.99, F-Score of 0.84, and IoU of 0.72. CONCLUSION: A new hybrid neural network (HNN) for fully automatic tumor localization and segmentation was developed, and the experimental results showcased the HNN's impressive performance, indicating its potential to be a valuable H&N tumor segmentation tool.
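A minimal sketch of the localization-then-segmentation hand-off such a hybrid pipeline implies, in PyTorch. The (x1, y1, x2, y2) box format, the 256x256 ROI size, and the `unet` callable are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical glue for a detect-then-segment pipeline: crop each detected
# ROI, segment it with a U-Net, and paste the binary mask back in place.
import torch
import torch.nn.functional as F

def segment_rois(image: torch.Tensor, boxes: torch.Tensor, unet) -> torch.Tensor:
    """image: (C, H, W) float tensor; boxes: (N, 4) pixel coords (x1, y1, x2, y2)."""
    _, H, W = image.shape
    full_mask = torch.zeros(1, H, W)
    for x1, y1, x2, y2 in boxes.round().long():
        roi = image[:, y1:y2, x1:x2].unsqueeze(0)                   # (1, C, h, w)
        roi = F.interpolate(roi, size=(256, 256), mode="bilinear")  # assumed ROI size
        with torch.no_grad():
            prob = torch.sigmoid(unet(roi))                         # (1, 1, 256, 256) logits -> probs
        prob = F.interpolate(prob, size=(int(y2 - y1), int(x2 - x1)), mode="bilinear")
        full_mask[:, y1:y2, x1:x2] = torch.maximum(
            full_mask[:, y1:y2, x1:x2], (prob[0] > 0.5).float())
    return full_mask
```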

2.
Sci Rep ; 14(1): 23729, 2024 10 10.
Article in English | MEDLINE | ID: mdl-39390053

ABSTRACT

Accurate segmentation of COVID-19 lesions from medical images is essential for achieving precise diagnosis and developing effective treatment strategies. Unfortunately, this task presents significant challenges, owing to the complex and diverse characteristics of opaque areas, subtle differences between infected and healthy tissue, and the presence of noise in CT images. To address these difficulties, this paper designs a new deep-learning architecture (named MD-Net) based on multi-scale input layers and a dense decoder aggregation network for COVID-19 lesion segmentation. In our framework, the U-shaped structure serves as the cornerstone, facilitating the complex hierarchical representations essential for accurate segmentation. Then, by introducing the multi-scale input layers (MIL), the network can effectively analyze both fine-grained details and contextual information in the original image. Furthermore, we introduce an SE-Conv module in the encoder network, which enhances the ability to identify relevant information while suppressing the transmission of extraneous or non-lesion information. Additionally, we design a dense decoder aggregation (DDA) module to integrate feature distributions and important COVID-19 lesion information from adjacent encoder layers. Finally, we conducted a comprehensive quantitative analysis and comparison on two publicly available datasets, namely Vid-QU-EX and QaTa-COV19-v2, to assess the robustness and versatility of MD-Net in segmenting COVID-19 lesions. The experimental results show that the proposed MD-Net outperforms its competitors, achieving higher Dice, Matthews correlation coefficient (Mcc), and Jaccard index scores. In addition, we conducted ablation studies on the Vid-QU-EX dataset to evaluate the contribution of each key component of the proposed architecture.
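As a hedged illustration of the multi-scale input idea, the stage below re-injects a downsampled copy of the raw image into the encoder at its current feature resolution; the channel counts and single-channel CT input are placeholder assumptions, not MD-Net's actual layers:

```python
# Illustrative multi-scale input (MIL) stage: the raw image is resized to the
# current feature resolution and concatenated, so fine detail re-enters the
# encoder at every scale.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MILStage(nn.Module):
    def __init__(self, feat_ch: int, out_ch: int, img_ch: int = 1):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(feat_ch + img_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, feats: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        img = F.interpolate(image, size=feats.shape[-2:], mode="bilinear")
        return self.conv(torch.cat([feats, img], dim=1))
```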


Subject(s)
COVID-19 , Deep Learning , SARS-CoV-2 , Tomography, X-Ray Computed , COVID-19/diagnostic imaging , COVID-19/virology , Humans , Tomography, X-Ray Computed/methods , Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Algorithms
3.
Sci Rep ; 14(1): 23641, 2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39384820

ABSTRACT

In low-level image processing, where the main goal is to reconstruct a clean image from a noise-corrupted version, image denoising continues to be a critical challenge. Although recent developments have led to the introduction of complex architectures to improve denoising performance, these models frequently have more parameters and higher computational demands. Here, we propose a new, simplified architecture called KU-Net, which is intended to achieve better denoising performance while requiring less complexity. KU-Net is an extension of the basic U-Net architecture that incorporates gradient information and noise residue from a Kalman filter. The network's ability to learn is improved by this deliberate incorporation, which also helps it better preserve minute details in the denoised images. Without using image augmentation, the proposed model is trained on a limited dataset to show its resilience in restricted training settings. Three essential inputs are processed by the architecture: gradient estimations, the predicted noisy image, and the original noisy grey image. These inputs work together to steer the U-Net's encoding and decoding stages to generate high-quality denoised outputs. According to our experimental results, KU-Net performs better than traditional models, as demonstrated by its superiority on common metrics like the Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR). KU-Net notably attains a PSNR of 26.60 dB at a noise level of 50, highlighting its efficacy and potential for more widespread use in image denoising.
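A rough sketch of assembling the three inputs named above into one tensor. The finite-difference gradients are simple stand-ins for the paper's gradient estimations, and `kalman_pred` is assumed to come from a separately implemented Kalman filter:

```python
# Stack the three KU-Net inputs: noisy grey image, Kalman-predicted image,
# and a gradient-magnitude estimate, yielding a (1, 3, H, W) network input.
import torch
import torch.nn.functional as F

def build_kunet_input(noisy: torch.Tensor, kalman_pred: torch.Tensor) -> torch.Tensor:
    """noisy, kalman_pred: (1, H, W) grey images; returns (1, 3, H, W)."""
    gx = F.pad(noisy[:, :, 1:] - noisy[:, :, :-1], (0, 1))        # horizontal gradient
    gy = F.pad(noisy[:, 1:, :] - noisy[:, :-1, :], (0, 0, 0, 1))  # vertical gradient
    grad_mag = torch.sqrt(gx ** 2 + gy ** 2)                      # gradient estimation
    return torch.cat([noisy, kalman_pred, grad_mag], dim=0).unsqueeze(0)
```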

4.
Sci Rep ; 14(1): 23489, 2024 10 08.
Article in English | MEDLINE | ID: mdl-39379448

ABSTRACT

Automated segmentation of biomedical images has been recognized as an important step in computer-aided diagnosis systems for the detection of abnormalities. Despite its importance, the segmentation process remains an open challenge due to variations in color, texture, shape diversity, and boundaries. Semantic segmentation often requires deeper neural networks to achieve higher accuracy, making the segmentation model more complex and slower. Due to the need to process a large number of biomedical images, more efficient and cheaper image processing techniques for accurate segmentation are needed. In this article, we present a modified deep semantic segmentation model that utilizes an EfficientNet-B3 backbone along with UNet for reliable segmentation. We trained our model on a non-melanoma skin cancer histopathology segmentation dataset to divide the image into 12 different classes. Our method outperforms the existing literature with an increase in average class accuracy from 79% to 83%. Our approach also shows an increase in overall accuracy from 85% to 94%.
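One plausible way to instantiate this encoder/decoder pairing is with the segmentation_models_pytorch library (assuming its standard API; the hyperparameters below merely follow the abstract):

```python
# Sketch: a U-Net decoder on an ImageNet-pretrained EfficientNet-B3 encoder.
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="efficientnet-b3",   # EfficientNet-B3 backbone as the encoder
    encoder_weights="imagenet",       # ImageNet pre-training (an assumption)
    in_channels=3,                    # RGB histopathology tiles
    classes=12,                       # 12 tissue classes per the abstract
)
```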


Subject(s)
Image Processing, Computer-Assisted , Neural Networks, Computer , Semantics , Skin Neoplasms , Skin , Humans , Skin Neoplasms/diagnostic imaging , Skin Neoplasms/pathology , Image Processing, Computer-Assisted/methods , Skin/diagnostic imaging , Skin/pathology , Deep Learning , Algorithms
5.
Sci Rep ; 14(1): 23237, 2024 10 05.
Article in English | MEDLINE | ID: mdl-39369017

ABSTRACT

In the domain of medical imaging, the advent of deep learning has marked a significant progression, particularly in the nuanced area of periodontal disease diagnosis. This study specifically targets the prevalent issue of scarce labeled data in medical imaging. We introduce a novel unsupervised few-shot learning algorithm, meticulously crafted for classifying periodontal diseases using a limited collection of dental panoramic radiographs. Our method leverages the UNet architecture for generating regions of interest (RoI) from radiographs, which are then processed through a Convolutional Variational Autoencoder (CVAE). This approach is pivotal in extracting critical latent features, subsequently clustered using an advanced algorithm. This clustering is key in our methodology, enabling the assignment of labels to images indicative of periodontal diseases, thus circumventing the challenges posed by limited datasets. Our validation process, involving a comparative analysis with traditional supervised learning and standard autoencoder-based clustering, demonstrates a marked improvement in both diagnostic accuracy and efficiency. For three real-world validation datasets, our UNet-CVAE architecture achieved up to an average of 14% higher accuracy compared to state-of-the-art supervised models, including the vision transformer model, when trained with 100 labeled images. This study not only highlights the capability of unsupervised learning in overcoming data limitations but also sets a new benchmark for diagnostic methodologies in medical AI, potentially transforming practices in data-constrained scenarios.
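A hedged sketch of the "cluster latent features, then assign pseudo-labels" step. The abstract does not name its advanced clustering algorithm, so scikit-learn's KMeans stands in here, and the cluster count is an arbitrary assumption:

```python
# Pseudo-labeling via clustering of CVAE latent vectors (one per RoI image).
import numpy as np
from sklearn.cluster import KMeans

def pseudo_label(latents: np.ndarray, n_clusters: int = 4) -> np.ndarray:
    """latents: (N, D) array of CVAE latent means; returns a label per image."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return km.fit_predict(latents)   # cluster index serves as the pseudo-label
```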


Subject(s)
Deep Learning , Periodontal Diseases , Radiography, Panoramic , Humans , Periodontal Diseases/diagnostic imaging , Radiography, Panoramic/methods , Algorithms , Unsupervised Machine Learning , Image Processing, Computer-Assisted/methods
6.
Radiat Oncol J ; 42(3): 181-191, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39354821

ABSTRACT

PURPOSE: To generate and investigate a supervised deep learning algorithm for creating synthetic computed tomography (sCT) images from kilovoltage cone-beam computed tomography (kV-CBCT) images for adaptive radiation therapy (ART) in head and neck cancer (HNC). MATERIALS AND METHODS: This study generated the supervised U-Net deep learning model using 3,491 image pairs from planning computed tomography (pCT) and kV-CBCT datasets obtained from 40 HNC patients. The dataset was split into 80% for training and 20% for testing. The evaluation of the sCT images compared to pCT images focused on three aspects: Hounsfield unit accuracy, assessed using mean absolute error (MAE) and root mean square error (RMSE); image quality, evaluated using the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) between sCT and pCT images; and dosimetric accuracy, encompassing 3D gamma passing rates for dose distribution and percentage dose difference. RESULTS: MAE, RMSE, PSNR, and SSIM showed improvements from their initial values of 53.15 ± 40.09, 153.99 ± 79.78, 47.91 ± 4.98 dB, and 0.97 ± 0.02 to 41.47 ± 30.59, 130.39 ± 78.06, 49.93 ± 6.00 dB, and 0.98 ± 0.02, respectively. Regarding dose evaluation, 3D gamma passing rates for dose distribution within sCT images under 2%/2 mm, 3%/2 mm, and 3%/3 mm criteria were 92.1% ± 3.8%, 93.8% ± 3.0%, and 96.9% ± 2.0%, respectively. The sCT images exhibited minor variations in the percentage dose distribution of the investigated target and structure volumes. However, it is worth noting that the sCT images exhibited anatomical variations when compared to the pCT images. CONCLUSION: These findings highlight the potential of the supervised U-Net deep learning model in generating kV-CBCT-based sCT images for ART in patients with HNC.
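The four reported image metrics can be computed as below with NumPy and scikit-image; the data range is an assumption (a 12-bit CT scale), not a value from the study:

```python
# MAE, RMSE, PSNR, and SSIM between a synthetic CT and the planning CT.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def sct_metrics(sct: np.ndarray, pct: np.ndarray, data_range: float = 4095.0):
    """sct, pct: 2D float arrays on the same intensity scale."""
    diff = sct - pct
    mae = np.abs(diff).mean()                # Hounsfield-unit accuracy
    rmse = np.sqrt((diff ** 2).mean())
    psnr = peak_signal_noise_ratio(pct, sct, data_range=data_range)
    ssim = structural_similarity(pct, sct, data_range=data_range)
    return mae, rmse, psnr, ssim
```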

7.
Ultrasonics ; 145: 107479, 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39366205

ABSTRACT

In ultrasound image diagnosis, single plane-wave imaging (SPWI), which can acquire ultrasound images at more than 1,000 fps, has been used to observe detailed tissue and evaluate blood flow. SPWI achieves high temporal resolution by sacrificing the spatial resolution and contrast of ultrasound images. To improve spatial resolution and contrast in SPWI, coherent plane-wave compounding (CPWC) is used to obtain high-quality ultrasound images, i.e., compound images, by the coherent addition of radio frequency (RF) signals acquired by transmitting plane waves in different directions. Although CPWC produces high-quality ultrasound images, its temporal resolution is lower than that of SPWI. To address this problem, some methods have been proposed to reconstruct an ultrasound image comparable to a compound image from RF signals obtained by transmitting a small number of plane waves in different directions. These methods do not fully consider the properties of RF signals, resulting in lower image quality compared to a compound image. In this paper, we propose methods to reconstruct high-quality ultrasound images in SPWI by considering the characteristics of the RF signal of a single plane wave, obtaining ultrasound images with image quality comparable to CPWC. The proposed methods employ encoder-decoder models of 1D U-Net, 2D U-Net, and their combination to generate high-quality ultrasound images by minimizing a loss that considers the point spread effect of plane waves and the frequency spectrum of RF signals during training. We also create a public large-scale SPWI/CPWC dataset for developing and evaluating deep-learning methods. Through a set of experiments using the public dataset and our dataset, we demonstrate that the proposed methods can reconstruct higher-quality ultrasound images from RF signals in SPWI than conventional methods.

8.
Front Oncol ; 14: 1433225, 2024.
Article in English | MEDLINE | ID: mdl-39351348

ABSTRACT

Purpose: The 3D U-Net deep neural network structure is widely employed for dose prediction in radiotherapy. However, attention to network depth and its impact on the accuracy and robustness of dose prediction remains inadequate. Methods: Data from 92 cervical cancer patients who underwent Volumetric Modulated Arc Therapy (VMAT) were geometrically augmented to investigate the effects of network depth on dose prediction by training and testing three different 3D U-Net structures with depths of 3, 4, and 5. Results: For the planning target volume (PTV), the differences between predicted and true values of D98, D99, and Homogeneity were 1.00 ± 0.23, 0.32 ± 0.72, and -0.02 ± 0.02 for the model with a depth of 5, respectively, better than for the other two models. For most of the organs at risk, the mean and maximum differences between the predicted and true values for the model with a depth of 5 were also better than for the other two models. Conclusions: The results reveal that the network model with a depth of 5 exhibits superior performance, albeit at the expense of the longest training time and largest computational memory of the three models. A small server with two NVIDIA GeForce RTX 3090 GPUs with 24 GB of memory was employed for this training. Since a 3D U-Net model with a depth of more than 5 cannot be supported due to insufficient training memory, the 3D U-Net neural network with a depth of 5 is the commonly used and optimal choice for small servers.

9.
Neural Netw ; 181: 106765, 2024 Sep 28.
Article in English | MEDLINE | ID: mdl-39357269

ABSTRACT

Spiking neural networks (SNNs) are gaining popularity in AI research as a low-power alternative in deep learning due to their sparse properties and biological interpretability. Using SNNs for dense prediction tasks is becoming an important research area. In this paper, we first propose a novel modification of the conventional Spiking U-Net architecture by adjusting the firing positions of neurons. The modified network model, named Analog Spiking U-Net (AS U-Net), is capable of incorporating the Convolutional Block Attention Module (CBAM) into the domain of SNNs. This is the first successful implementation of CBAM in SNNs, which has the potential to improve the SNN model's segmentation performance while decreasing information loss. Then, the proposed AS U-Net (with CBAM&ViT) is trained by direct encoding on a comprehensive dataset obtained by merging several diabetic retinal vessel segmentation datasets. Based on the experimental results, the provided SNN model achieves the highest segmentation accuracy in retinal vessel segmentation for diabetes mellitus, surpassing other SNN-based models and most ANN-based related models. In addition, under the same structure, our model demonstrates performance comparable to the ANN model, and it achieves state-of-the-art (SOTA) results in comparative experiments when both accuracy and energy consumption are considered. The ablative analysis of CBAM further confirms its feasibility and effectiveness in SNNs, meaning that a novel approach could be provided for subsequent deployment and hardware chip application. Finally, we conduct extensive generalization experiments on the same type of segmentation task (ISBI and ISIC), the more complex multi-segmentation task (Synapse), and a series of image generation tasks (MNIST, Day2night, Maps, Facades) to visually demonstrate the generality of the proposed method.

10.
Neural Netw ; 181: 106754, 2024 Sep 22.
Article in English | MEDLINE | ID: mdl-39362185

ABSTRACT

Accurate segmentation of thyroid nodules is essential for early screening and diagnosis, but it can be challenging due to the nodules' varying sizes and positions. To address this issue, we propose a multi-attention guided UNet (MAUNet) for thyroid nodule segmentation. We use a multi-scale cross attention (MSCA) module for initial image feature extraction. By integrating interactions between features at different scales, the impact of thyroid nodule shape and size on the segmentation results is reduced. Additionally, we incorporate a dual attention (DA) module into the skip-connection step of the UNet network, which promotes information exchange and fusion between the encoder and decoder. To test the model's robustness and effectiveness, we conduct extensive experiments on multi-center ultrasound images provided by 17 local hospitals. The model is trained using a federated learning mechanism to ensure privacy protection. The experimental results show that the Dice scores of the model on the datasets from the three centers are 0.908, 0.912, and 0.887, respectively. Compared to existing methods, our method demonstrates higher generalization ability on multi-center datasets and achieves better segmentation results.

11.
Int J Geogr Inf Sci ; 38(10): 2061-2082, 2024.
Article in English | MEDLINE | ID: mdl-39318700

ABSTRACT

Cartographic map generalization involves complex rules, and full automation has still not been achieved, despite many efforts over the past few decades. Pioneering studies show that some map generalization tasks can be partially automated by deep neural networks (DNNs). However, DNNs were still used as black-box models in previous studies. We argue that integrating explainable AI (XAI) into a DL-based map generalization process can give more insights for developing and refining the DNNs by understanding what cartographic knowledge is actually learned. Following an XAI framework for an empirical case study, visual analytics and quantitative experiments were applied to explain the importance of input features for the predictions of a pre-trained ResU-Net model. This experimental case study finds that the XAI-based visualization results can easily be interpreted by human experts. With the proposed XAI workflow, we further find that the DNN pays more attention to the building boundaries than to the interior parts of the buildings. We thus suggest that boundary intersection over union is a better evaluation metric than the commonly used intersection over union for qualifying raster-based map generalization results. Overall, this study shows the necessity and feasibility of integrating XAI into future DL-based map generalization development frameworks.
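A small sketch of the suggested boundary-IoU idea: compare only a thin band around each mask's edge. The 3-pixel band width and 8-connected structuring element are arbitrary choices, not the authors' exact definition:

```python
# Boundary IoU between a predicted and a reference building mask.
import numpy as np
from scipy.ndimage import binary_erosion

def boundary_iou(pred: np.ndarray, gt: np.ndarray, width: int = 3) -> float:
    """pred, gt: boolean (H, W) masks; compares only thin boundary bands."""
    struct = np.ones((3, 3), dtype=bool)
    pb = pred & ~binary_erosion(pred, struct, iterations=width)  # predicted boundary band
    gb = gt & ~binary_erosion(gt, struct, iterations=width)      # reference boundary band
    inter = (pb & gb).sum()
    union = (pb | gb).sum()
    return inter / union if union else 1.0
```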

12.
Sensors (Basel) ; 24(17)2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39275364

ABSTRACT

Different types of rural settlement agglomerations have formed and mixed in space during the implementation of the rural revitalization strategy in China. Discriminating them from remote sensing images is of great significance for rural land planning and living environment improvement. Currently, there is a lack of automatic methods for obtaining information on rural settlement differentiation. In this paper, an improved encoder-decoder network structure, ASCEND-UNet, was designed based on the original UNet. It was implemented to segment and classify dispersed and clustered rural settlement buildings from high-resolution satellite images. The ASCEND-UNet model incorporated three components: first, the atrous spatial pyramid pooling (ASPP) multi-scale feature fusion module was added to the encoder; second, the spatial and channel squeeze and excitation (scSE) block was embedded at the skip connections; third, the hybrid dilated convolution (HDC) block was utilized in the decoder. In our proposed framework, the ASPP and HDC serve as multiple dilated convolution blocks that expand the receptive field by introducing a series of convolutions with different dilation rates. The scSE is an attention mechanism block focusing on features in both the spatial and channel dimensions. A series of model comparisons and accuracy assessments against the original UNet, PSPNet, DeepLabV3+, and SegNet verified the effectiveness of our proposed model. Compared with the original UNet model, ASCEND-UNet achieved improvements of 4.67%, 2.80%, 3.73%, and 6.28% in precision, recall, F1-score, and MIoU, respectively. The contributions of the HDC, ASPP, and scSE modules were discussed in ablation experiments. Our proposed model obtained more accurate and stable results by integrating multiple dilated convolution blocks with an attention mechanism. This novel model enriches the automatic methods for semantic segmentation of different rural settlements from remote sensing images.
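For reference, a common formulation of the scSE block named above (concurrent spatial and channel squeeze-and-excitation); this is the standard design from the literature, not necessarily the authors' exact variant:

```python
# scSE: re-weight features along channels (cSE) and spatial positions (sSE),
# then sum the two attention branches.
import torch
import torch.nn as nn

class SCSEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.cse = nn.Sequential(                     # channel squeeze-and-excitation
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        self.sse = nn.Sequential(                     # spatial squeeze-and-excitation
            nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.cse(x) + x * self.sse(x)
```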

13.
Neuroimage ; 300: 120872, 2024 Oct 15.
Article in English | MEDLINE | ID: mdl-39349149

ABSTRACT

In this study, we introduce MGA-Net, a novel mask-guided attention neural network, which extends the U-net model for precision neonatal brain imaging. MGA-Net is designed to extract the brain from other structures and reconstruct high-quality brain images. The network employs a common encoder and two decoders: one for brain mask extraction and the other for brain region reconstruction. A key feature of MGA-Net is its high-level mask-guided attention module, which leverages features from the brain mask decoder to enhance image reconstruction. To enable the same encoder and decoder to process both MRI and ultrasound (US) images, MGA-Net integrates sinusoidal positional encoding. This encoding assigns distinct positional values to MRI and US images, allowing the model to effectively learn from both modalities. Consequently, features learned from a single modality can aid in learning a modality with less available data, such as US. We extensively validated the proposed MGA-Net on diverse and independent datasets from varied clinical settings and neonatal age groups. The metrics used for assessment included the Dice similarity coefficient, recall, and accuracy for image segmentation; structural similarity for image reconstruction; and root mean squared error for total brain volume estimation from 3D ultrasound images. Our results demonstrate that MGA-Net significantly outperforms traditional methods, offering superior performance in brain extraction and segmentation while achieving high precision in image reconstruction and volumetric analysis. Thus, MGA-Net represents a robust and effective preprocessing tool for MRI and 3D ultrasound images, marking a significant advance in neuroimaging that enhances both research and clinical diagnostics in the neonatal period and beyond.
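A loose sketch of one way such modality-aware sinusoidal encoding could work: the same coordinate grid is encoded with a modality-dependent phase shift, so MRI and US inputs receive distinct positional values. The frequencies and offsets below are invented for illustration and are not taken from the paper:

```python
# Build a positional map that differs per modality, to concatenate to the
# input channels before the shared encoder.
import math
import torch

def modality_pos_encoding(h: int, w: int, modality_id: int) -> torch.Tensor:
    """modality_id: e.g. 0 for MRI, 1 for US (assumed); returns (2, h, w)."""
    ys = torch.arange(h, dtype=torch.float32).view(h, 1).expand(h, w)
    xs = torch.arange(w, dtype=torch.float32).view(1, w).expand(h, w)
    phase = modality_id * math.pi / 2          # modality-dependent phase shift
    return torch.stack([torch.sin(ys / 64.0 + phase),
                        torch.sin(xs / 64.0 + phase)])
```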


Subject(s)
Brain , Magnetic Resonance Imaging , Neural Networks, Computer , Neuroimaging , Humans , Infant, Newborn , Magnetic Resonance Imaging/methods , Brain/diagnostic imaging , Neuroimaging/methods , Female , Image Processing, Computer-Assisted/methods , Male
14.
Sci Rep ; 14(1): 22422, 2024 09 28.
Article in English | MEDLINE | ID: mdl-39341859

ABSTRACT

Breast cancer, a prevalent and life-threatening disease, necessitates early detection for effective intervention and improved patient outcomes. This paper focuses on the critical problem of identifying breast cancer using a model called Attention U-Net. The model is applied to the Breast Ultrasound Image Dataset (BUSI), comprising 780 breast images categorized into three distinct groups: 437 benign, 210 malignant, and 133 normal cases. The proposed model leverages the attention-driven U-Net's encoder blocks to capture hierarchical features effectively. The model comprises four decoder blocks, a pivotal component of the U-Net architecture responsible for expanding the encoded feature representation obtained from the encoder blocks and reconstructing spatial information. Four attention gates are strategically incorporated to enhance feature localization during decoding, a design that facilitates accurate segmentation of breast tumors in ultrasound images and accurately delineates and separates tumor borders. The experimental findings demonstrate outstanding performance, achieving an overall accuracy of 0.98, precision of 0.97, recall of 0.90, and a Dice score of 0.92. This research aims to advance automated breast cancer segmentation algorithms, emphasizing the importance of early detection in boosting diagnostic capabilities and enabling prompt and targeted medical interventions.
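For context, a standard additive attention gate of the kind Attention U-Net places on its skip connections; the layer shapes are illustrative rather than the paper's configuration:

```python
# Additive attention gate: the coarser decoder signal g gates the skip
# features x, producing per-pixel attention weights in [0, 1].
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, skip_ch: int, gate_ch: int, inter_ch: int):
        super().__init__()
        self.wx = nn.Conv2d(skip_ch, inter_ch, 1)    # transform skip features
        self.wg = nn.Conv2d(gate_ch, inter_ch, 1)    # transform gating signal
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        # x: skip connection; g: decoder features already upsampled to x's size.
        a = self.psi(torch.relu(self.wx(x) + self.wg(g)))  # (B, 1, H, W) weights
        return x * a                                       # re-weighted skip features
```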


Subject(s)
Breast Neoplasms , Humans , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology , Female , Ultrasonography, Mammary/methods , Algorithms , Image Interpretation, Computer-Assisted/methods , Databases, Factual , Image Processing, Computer-Assisted/methods
15.
Front Artif Intell ; 7: 1376546, 2024.
Article in English | MEDLINE | ID: mdl-39315244

ABSTRACT

Background: This study delves into the crucial domain of sperm segmentation, a pivotal component of male infertility diagnosis. It explores the efficacy of diverse architectural configurations coupled with various encoders, leveraging frames from the VISEM dataset for evaluation. Methods: The pursuit of automated sperm segmentation led to the examination of multiple deep learning architectures, each paired with distinct encoders. Extensive experimentation was conducted on the VISEM dataset to assess their performance. Results: Our study evaluated various deep learning architectures with different encoders for sperm segmentation using the VISEM dataset. While each model configuration exhibited distinct strengths and weaknesses, UNet++ with ResNet34 emerged as a top-performing model, demonstrating exceptional accuracy in distinguishing sperm cells from non-sperm cells. However, challenges persist in accurately identifying closely adjacent sperm cells. These findings provide valuable insights for improving automated sperm segmentation in male infertility diagnosis. Discussion: The study underscores the significance of selecting appropriate model combinations based on specific diagnostic requirements. It also highlights the challenges related to distinguishing closely adjacent sperm cells. Conclusion: This research advances the field of automated sperm segmentation for male infertility diagnosis, showcasing the potential of deep learning techniques. Future work should aim to enhance accuracy in scenarios involving close proximity between sperm cells, ultimately improving clinical sperm analysis.
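The top-performing pairing can be reproduced as a baseline with the segmentation_models_pytorch library (assuming its standard API; the single-channel binary-mask head is an assumption based on the sperm/non-sperm task):

```python
# Sketch: UNet++ decoder on an ImageNet-pretrained ResNet34 encoder.
import segmentation_models_pytorch as smp

model = smp.UnetPlusPlus(
    encoder_name="resnet34",      # ResNet34 backbone, as in the best model
    encoder_weights="imagenet",   # pre-training choice is an assumption
    in_channels=3,                # RGB VISEM video frames
    classes=1,                    # binary sperm-vs-background mask
)
```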

16.
Comput Biol Med ; 182: 109207, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39341115

ABSTRACT

Precise estimations of RNA secondary structures have the potential to reveal the various roles that non-coding RNAs play in regulating cellular activity. However, traditional RNA secondary structure prediction methods mainly rely on thermodynamic models via free energy minimization, a laborious process that requires a lot of prior knowledge. Here, we suggest Wfold, an end-to-end deep learning-based approach for RNA secondary structure prediction. Wfold is trained directly on annotated data and base-pairing criteria. It makes use of an image-like representation of RNA sequences, which an enhanced U-Net incorporating a transformer encoder can process effectively. Wfold thereby increases the accuracy of RNA secondary structure prediction by combining the benefits of the self-attention mechanism's mining of long-range information with U-Net's ability to gather local information. We compare Wfold's performance using RNA datasets that are within and across families. When trained and evaluated on different RNA families, it achieves a performance similar to that of traditional methods, but it dramatically outperforms the state-of-the-art methods on within-family datasets. Moreover, Wfold can also reliably forecast pseudoknots. The findings imply that Wfold may be useful for improving sequence alignment, functional annotations, and RNA structure modeling.

17.
Int J Neural Syst ; : 2450068, 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39343431

ABSTRACT

With the rapid advancement of deep learning, computer-aided diagnosis and treatment have become crucial in medicine. UNet is a widely used architecture for medical image segmentation, and various methods for improving UNet have been extensively explored. One popular approach is incorporating transformers, though their quadratic computational complexity poses challenges. Recently, State-Space Models (SSMs), exemplified by Mamba, have gained significant attention as a promising alternative due to their linear computational complexity. Another approach, neural memory Ordinary Differential Equations (nmODEs), exhibits similar principles and achieves good results. In this paper, we explore the respective strengths and weaknesses of nmODEs and SSMs and propose a novel architecture, the nmSSM decoder, which combines the advantages of both approaches. This architecture possesses powerful nonlinear representation capabilities while retaining the ability to preserve input and process global information. We construct nmSSM-UNet using the nmSSM decoder and conduct comprehensive experiments on the PH2, ISIC2018, and BU-COCO datasets to validate its effectiveness in medical image segmentation. The results demonstrate the promising application value of nmSSM-UNet. Additionally, we conducted ablation experiments to verify the effectiveness of our proposed improvements on SSMs and nmODEs.

18.
Diagnostics (Basel) ; 14(18)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39335778

ABSTRACT

Background/Objective: This study aims to utilize advanced artificial intelligence (AI) image recognition technologies to establish a robust system for identifying features in lung computed tomography (CT) scans, thereby detecting respiratory infections such as SARS-CoV-2 pneumonia. Specifically, the research focuses on developing a new model called Residual-Dense-Attention Gates U-Net (RDAG U-Net) to improve accuracy and efficiency in identification. Methods: This study employed Attention U-Net, Attention Res U-Net, and the newly developed RDAG U-Net model. RDAG U-Net extends the U-Net architecture by incorporating ResBlock and DenseBlock modules in the encoder to retain training parameters and reduce computation time. The training dataset includes 3,520 CT scans from an open database, augmented to 10,560 samples through data enhancement techniques. The research also focused on optimizing convolutional architectures, image preprocessing, interpolation methods, data management, and extensive fine-tuning of training parameters and neural network modules. Results: The RDAG U-Net model achieved an outstanding accuracy of 93.29% in identifying pulmonary lesions, with a 45% reduction in computation time compared to other models. The study demonstrated that RDAG U-Net performed stably during training and exhibited good generalization capability, as shown by the loss values, model-predicted lesion annotations, and validation-epoch curves. Furthermore, using ITK-SNAP to convert 2D predictions into 3D lung and lesion segmentation models, the results delineated lesion contours, enhancing interpretability. Conclusion: The RDAG U-Net model showed significant improvements in accuracy and efficiency in the analysis of CT images for SARS-CoV-2 pneumonia, achieving a 93.29% recognition accuracy and reducing computation time by 45% compared to other models. These results indicate the potential of the RDAG U-Net model in clinical applications, as it can accelerate the detection of pulmonary lesions and effectively enhance diagnostic accuracy. Additionally, the 2D and 3D visualization results allow physicians to better understand lesions' morphology and distribution, strengthening decision support capabilities and providing valuable tools for medical diagnosis and treatment planning.

19.
J Pers Med ; 14(9)2024 Sep 15.
Article in English | MEDLINE | ID: mdl-39338233

ABSTRACT

Adaptive radiotherapy (ART) workflows are increasingly adopted to achieve dose escalation and tissue sparing under dynamic anatomical conditions. However, recontouring and time constraints hinder the implementation of real-time ART workflows. Various auto-segmentation methods, including deformable image registration, atlas-based segmentation, and deep learning-based segmentation (DLS), have been developed to address these challenges. Despite the potential of DLS methods, clinical implementation remains difficult due to the need for large, high-quality datasets to ensure model generalizability. This study introduces an InterVision framework for segmentation. The InterVision framework can interpolate or create intermediate visuals between existing images to generate specific patient characteristics. The InterVision model is trained in two steps: (1) generating a general model using the dataset, and (2) tuning the general model using the dataset generated from the InterVision framework. The InterVision framework generates intermediate images between existing patient image slices using deformable vectors, effectively capturing unique patient characteristics. By creating a more comprehensive dataset that reflects these individual characteristics, the InterVision model produces more accurate contours than general models. Models are evaluated using the volumetric Dice similarity coefficient (VDSC) and the 95th-percentile Hausdorff distance (HD95%) for 18 structures in 20 test patients. The Dice score was 0.81 ± 0.05 for the general model, 0.82 ± 0.04 for the general fine-tuning model, and 0.85 ± 0.03 for the InterVision model; the Hausdorff distance was 3.06 ± 1.13 for the general model, 2.81 ± 0.77 for the general fine-tuning model, and 2.52 ± 0.50 for the InterVision model. The InterVision model thus showed the best performance of the three. The InterVision framework presents a versatile approach adaptable to various tasks where prior information is accessible, such as in ART settings. This capability is particularly valuable for accurately predicting complex organs and targets that pose challenges for traditional deep learning algorithms.
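One common implementation of the two reported metrics on binary masks, sketched with SciPy; it assumes an isotropic voxel grid (real use would pass the voxel spacing to distance_transform_edt via `sampling`):

```python
# Volumetric Dice and 95th-percentile Hausdorff distance between two masks.
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """a, b: boolean volumes of the same shape (assumed non-empty)."""
    return 2.0 * (a & b).sum() / (a.sum() + b.sum())

def hd95(a: np.ndarray, b: np.ndarray) -> float:
    sa = a & ~binary_erosion(a)                 # surface voxels of prediction
    sb = b & ~binary_erosion(b)                 # surface voxels of reference
    d_ab = distance_transform_edt(~sb)[sa]      # prediction-surface -> reference-surface
    d_ba = distance_transform_edt(~sa)[sb]      # reference-surface -> prediction-surface
    return float(np.percentile(np.hstack([d_ab, d_ba]), 95))
```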

20.
Sensors (Basel) ; 24(18)2024 Sep 19.
Article in English | MEDLINE | ID: mdl-39338791

ABSTRACT

There are two widely used methods to measure the cardiac cycle and obtain heart rate measurements: the electrocardiogram (ECG) and the photoplethysmogram (PPG). The sensors used in these methods have gained great popularity in wearable devices, which have extended cardiac monitoring beyond the hospital environment. However, the continuous monitoring of ECG signals via mobile devices is challenging, as it requires users to keep their fingers pressed on the device during data collection, making it unfeasible in the long term. The PPG does not have this limitation. However, medical knowledge for diagnosing anomalies from the PPG signal is limited, since the ECG is studied and used in the literature as the gold standard. To minimize this problem, this work proposes a method, PPG2ECG, that uses the correlation between the PPG and ECG signal domains to infer the waveform of the ECG signal from the PPG signal. PPG2ECG performs a mapping between domains by applying a set of convolution filters, learning to transform a PPG input signal into an ECG output signal using a U-Net inception neural network architecture. We assessed our proposed method using two evaluation strategies based on personalized and generalized models and achieved mean error values of 0.015 and 0.026, respectively. Our method overcomes the limitations of previous approaches by providing an accurate and feasible way to continuously monitor ECG signals through PPG signals. The short distances between the inferred ECG and the original ECG demonstrate the feasibility and potential of our method to assist in the early identification of heart diseases.


Subject(s)
Electrocardiography , Heart Rate , Neural Networks, Computer , Photoplethysmography , Signal Processing, Computer-Assisted , Humans , Electrocardiography/methods , Photoplethysmography/methods , Heart Rate/physiology , Algorithms , Wearable Electronic Devices