Results 1 - 20 of 146
1.
Article in English | MEDLINE | ID: mdl-38957573

ABSTRACT

Medical image auto-segmentation techniques are basic and critical for numerous image-based analysis applications that play an important role in developing advanced and personalized medicine. Compared with manual segmentations, auto-segmentations are expected to contribute to a more efficient clinical routine and workflow by requiring fewer human interventions or revisions. However, current auto-segmentation methods are usually developed with the help of popular segmentation metrics that do not directly consider human correction behavior. The Dice Coefficient (DC) focuses on the truly-segmented areas, while the Hausdorff Distance (HD) measures only the maximal distance between the auto-segmentation boundary and the ground truth boundary. Boundary length-based metrics such as surface DC (surDC) and Added Path Length (APL) try to distinguish correctly predicted boundary pixels from incorrect ones. It is uncertain whether these metrics can reliably indicate the required manual mending effort for application in segmentation research. Therefore, in this paper, the potential of the above four metrics, as well as a novel metric called the Mendability Index (MI), to predict human correction effort is studied with linear and support vector regression models. A total of 265 3D computed tomography (CT) samples for 3 objects of interest from 3 institutions, with corresponding auto-segmentations and ground truth segmentations, are utilized to train and test the prediction models. Five-fold cross-validation experiments demonstrate that meaningful human effort prediction can be achieved using segmentation metrics, with prediction errors varying across objects. An improved variant of MI, called MIhd, generally shows the best prediction performance, suggesting its potential to reliably indicate the clinical value of auto-segmentations.
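For reference, the first two of these metrics can be computed directly from binary masks; a minimal Python sketch (the function names and the use of all foreground voxels, rather than extracted surfaces, are illustrative choices, not taken from the paper):

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(pred: np.ndarray, gt: np.ndarray) -> float:
    """DC = 2|A intersect B| / (|A| + |B|) for two binary masks."""
    intersection = np.logical_and(pred, gt).sum()
    return 2.0 * intersection / (pred.sum() + gt.sum())

def hausdorff_distance(pred: np.ndarray, gt: np.ndarray) -> float:
    """Symmetric Hausdorff distance over foreground voxel coordinates
    (in practice, surface voxels are often extracted first)."""
    p, g = np.argwhere(pred), np.argwhere(gt)
    return max(directed_hausdorff(p, g)[0], directed_hausdorff(g, p)[0])
```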

2.
Comput Med Imaging Graph ; 116: 102403, 2024 Jun 02.
Article in English | MEDLINE | ID: mdl-38878632

ABSTRACT

BACKGROUND AND OBJECTIVES: Bio-medical image segmentation models typically attempt to predict one segmentation that resembles a ground-truth structure as closely as possible. However, as medical images are not perfect representations of anatomy, obtaining this ground truth is not possible. A commonly used surrogate is to have multiple expert observers define the same structure for a dataset. When multiple observers define the same structure on the same image, there can be significant differences depending on the structure, the image quality/modality, and the region being defined. It is often desirable to estimate this type of aleatoric uncertainty in a segmentation model to help understand the region in which the true structure is likely to be positioned. Furthermore, obtaining these datasets is resource-intensive, so training such models using limited data may be required. With a small dataset, differing patient anatomy is likely not well represented, causing epistemic uncertainty; this should also be estimated so that it can be determined for which cases the model is effective. METHODS: We use a 3D probabilistic U-Net to train a model from which several segmentations can be sampled to estimate the range of uncertainty seen between multiple observers. To ensure that regions where observers disagree most are emphasised in model training, we expand the Generalised Evidence Lower Bound with Constrained Optimisation (GECO) loss function with an additional contour loss term to give attention to these regions. Ensemble and Monte-Carlo dropout (MCDO) uncertainty quantification methods are used during inference to estimate model confidence on unseen cases. We apply our methodology to two radiotherapy clinical trial datasets, a gastric cancer trial (TOPGEAR, TROG 08.08) and a post-prostatectomy prostate cancer trial (RAVES, TROG 08.03). Each dataset contains only 10 cases for model development to segment the clinical target volume (CTV), which was defined by multiple observers on each case. An additional 50 cases are available as a hold-out dataset for each trial, in which only one observer defined the CTV structure on each case. Up to 50 samples were generated using the probabilistic model for each case in the hold-out dataset. To assess performance, each manually defined structure was matched to the closest sampled segmentation based on commonly used metrics. RESULTS: The TOPGEAR CTV model achieved a Dice Similarity Coefficient (DSC) and Surface DSC (sDSC) of 0.7 and 0.43, respectively, with the RAVES model achieving 0.75 and 0.71, respectively. Segmentation quality across cases in the hold-out datasets was variable; however, both the ensemble and MCDO uncertainty estimation approaches were able to accurately estimate model confidence, with a p-value < 0.001 for both TOPGEAR and RAVES when comparing the DSC using the Pearson correlation coefficient. CONCLUSIONS: We demonstrated that training auto-segmentation models that can estimate aleatoric and epistemic uncertainty using limited datasets is possible. Having the model estimate prediction confidence is important for understanding for which unseen cases a model is likely to be useful.
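Of the two confidence estimators used at inference, Monte-Carlo dropout is the simpler to illustrate: dropout is kept active while repeated forward passes are sampled, and the per-voxel spread of the outputs serves as an uncertainty map. A minimal PyTorch sketch under assumed names (`model` is any segmentation network containing dropout layers; the sigmoid output is an assumption for a binary target such as the CTV):

```python
import torch

def mc_dropout_samples(model: torch.nn.Module, image: torch.Tensor, n: int = 50):
    """Run n stochastic forward passes with dropout kept active."""
    model.eval()
    for m in model.modules():  # re-enable only the dropout layers
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d, torch.nn.Dropout3d)):
            m.train()
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(model(image)) for _ in range(n)])
    mean = probs.mean(dim=0)  # average segmentation probability
    var = probs.var(dim=0)    # per-voxel variance as a model-uncertainty map
    return mean, var
```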

3.
Clin Transl Radiat Oncol ; 47: 100796, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38884004

ABSTRACT

Purpose: The aim of the present study is to characterize a deep learning-based auto-segmentation software (DL) for prostate cone beam computed tomography (CBCT) images and to evaluate its applicability in the clinical adaptive radiation therapy routine. Materials and methods: Ten patients, who received exclusive radiation therapy with definitive intent on the prostate gland and seminal vesicles, were selected. Femoral heads, bladder, rectum, prostate, and seminal vesicles were retrospectively contoured by four different expert radiation oncologists on patients' CBCT images acquired during treatment. Consensus contours (CC) were generated from these data and compared with those created by DL with different algorithms, trained on CBCT (DL-CBCT) or computed tomography (DL-CT). The Dice similarity coefficient (DSC), centre of mass (COM) shift, and volume relative variation (VRV) were chosen as comparison metrics. Since no tolerance limit can be defined, results were also compared with the inter-operator variability (IOV) using the same metrics. Results: The best agreement between DL and CC was observed for the femoral heads (DSC of 0.96 for both DL-CBCT and DL-CT). Performance worsened for low-contrast soft-tissue organs: the worst results were found for the seminal vesicles (DSC of 0.70 and 0.59 for DL-CBCT and DL-CT, respectively). The analysis shows that it is appropriate to use algorithms trained on the specific imaging modality. Furthermore, the statistical analysis showed that, for almost all considered structures, there is no significant difference between DL-CBCT and a human operator in terms of IOV. Conclusions: The accuracy of DL-CBCT is in accordance with CC; its use in clinical practice is justified by the comparison with the inter-operator variability.
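The two volume-level metrics used alongside DSC reduce to a few lines of array arithmetic; a minimal sketch (function names and the voxel-spacing argument are illustrative, not from the paper):

```python
import numpy as np
from scipy.ndimage import center_of_mass

def com_shift(auto: np.ndarray, consensus: np.ndarray,
              spacing=(1.0, 1.0, 1.0)) -> float:
    """Euclidean shift (mm) between the centres of mass of two binary masks."""
    delta = np.array(center_of_mass(auto)) - np.array(center_of_mass(consensus))
    return float(np.linalg.norm(delta * np.array(spacing)))

def volume_relative_variation(auto: np.ndarray, consensus: np.ndarray) -> float:
    """(V_auto - V_consensus) / V_consensus, as a signed fraction."""
    v_auto, v_cc = auto.sum(), consensus.sum()
    return float((v_auto - v_cc) / v_cc)
```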

4.
Dose Response ; 22(2): 15593258241263687, 2024.
Article in English | MEDLINE | ID: mdl-38912333

ABSTRACT

Background and Purpose: Artificial intelligence (AI) encompasses techniques that attempt to reproduce human reasoning and mimic human behaviors. It has been considered as an alternative for many human-dependent steps in radiotherapy (RT), since human participation is a principal source of uncertainty in RT. The aim of this work is to provide a systematic summary of the current literature on AI applications in RT and to clarify their role in RT practice from a clinical perspective. Materials and Methods: A systematic literature search of PubMed and Google Scholar was performed to identify original articles involving AI applications in RT from inception to 2022. Studies were included if they reported original data and explored clinical applications of AI in RT. Results: The selected studies were categorized into three aspects of RT: organ and lesion segmentation, treatment planning, and quality assurance. For each aspect, this review discusses how these AI tools can be involved in the RT protocol. Conclusions: Our study revealed that AI is a potential alternative for the human-dependent steps in the complex process of RT.

5.
Med Phys ; 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38896829

ABSTRACT

BACKGROUND: Head and neck (HN) gross tumor volume (GTV) auto-segmentation is challenging due to the morphological complexity and low image contrast of targets. Multi-modality images, including computed tomography (CT) and positron emission tomography (PET), are used in the routine clinic to assist radiation oncologists in accurate GTV delineation. However, the availability of PET imaging may not always be guaranteed. PURPOSE: To develop a deep learning segmentation framework for automated GTV delineation of HN cancers using a combination of PET/CT images, while addressing the challenge of missing PET data. METHODS: Two datasets were included in this study. Dataset I: 524 (training) and 359 (testing) oropharyngeal cancer patients from different institutions with their PET/CT pairs, provided by the HECKTOR Challenge; Dataset II: 90 HN patients (testing) from a local institution with their planning CT and PET/CT pairs. To handle potentially missing PET images, a model training strategy named the "Blank Channel" method was implemented. To simulate the absence of a PET image, a blank array with the same dimensions as the CT image was generated to meet the dual-channel input requirement of the deep learning model. During training, the model was randomly presented with either a real PET/CT pair or a blank/CT pair. This allowed the model to learn the relationship between the CT image and the corresponding GTV delineation based on the available modalities. As a result, our model can handle flexible inputs during prediction, making it suitable for cases where PET images are missing. To evaluate the performance of our proposed model, we trained it using training patients from Dataset I and tested it with Dataset II. We compared our model (Model 1) with two other models trained for specific modality segmentations: Model 2, trained with only CT images, and Model 3, trained with real PET/CT pairs. The performance of the models was evaluated using quantitative metrics, including the Dice similarity coefficient (DSC), mean surface distance (MSD), and 95% Hausdorff Distance (HD95). In addition, we evaluated Model 1 and Model 3 using the 359 test cases in Dataset I. RESULTS: Our proposed model (Model 1) achieved promising results for GTV auto-segmentation using PET/CT images, with the flexibility of missing PET images. Specifically, when assessed with only CT images in Dataset II, Model 1 achieved a DSC of 0.56 ± 0.16, an MSD of 3.4 ± 2.1 mm, and an HD95 of 13.9 ± 7.6 mm. When the PET images were included, the performance of our model improved to a DSC of 0.62 ± 0.14, an MSD of 2.8 ± 1.7 mm, and an HD95 of 10.5 ± 6.5 mm. These results are comparable to those achieved by Model 2 and Model 3, illustrating Model 1's effectiveness in utilizing flexible input modalities. Further analysis using the test dataset from Dataset I showed that Model 1 achieved an average DSC of 0.77, surpassing the overall average DSC of 0.72 among all participants in the HECKTOR Challenge. CONCLUSIONS: We successfully refined a multi-modal segmentation tool for accurate GTV delineation for HN cancer. Our method addresses the issue of missing PET images by allowing flexible data input, thereby providing a practical solution for clinical settings where access to PET imaging may be limited.
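As described, the "Blank Channel" strategy touches only the data-loading step: with some probability the PET channel is replaced by an all-zero array of the CT's dimensions, so the dual-channel model learns to cope with either input. A minimal sketch (the 50% blanking probability and channel-first layout are assumptions, not taken from the paper):

```python
import numpy as np

def make_dual_channel(ct: np.ndarray, pet: np.ndarray | None,
                      p_blank: float = 0.5, training: bool = True) -> np.ndarray:
    """Stack CT and PET into a 2-channel input; during training, randomly
    (or whenever PET is missing) substitute a blank array for the PET channel."""
    if pet is None or (training and np.random.rand() < p_blank):
        pet = np.zeros_like(ct)         # blank channel with the CT's dimensions
    return np.stack([ct, pet], axis=0)  # shape: (2, D, H, W)
```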

6.
Phys Med ; 123: 103393, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38852363

ABSTRACT

BACKGROUND AND PURPOSE: One of the current roadblocks to the widespread use of Total Marrow Irradiation (TMI) and Total Marrow and Lymphoid Irradiation (TMLI) is the challenging tumor target contouring workflow. This study aims to develop a hybrid neural network model that enables accurate, automatic, and rapid segmentation of multi-class clinical target volumes. MATERIALS AND METHODS: Patients who underwent TMI and TMLI from January 2018 to May 2022 were included. Two independent oncologists manually contoured eight target volumes for each patient on CT images. A novel Dual-Encoder Alignment Network (DEA-Net) was developed and trained using 46 patients from one internal institution and independently evaluated on a total of 39 internal and external patients. Performance was evaluated using accuracy metrics and delineation time. RESULTS: The DEA-Net achieved a mean Dice similarity coefficient of 90.1% ± 1.8% for the internal testing dataset (23 patients) and 91.1% ± 2.5% for the external testing dataset (16 patients). The 95% Hausdorff distance and average symmetric surface distance were 2.04 ± 0.62 mm and 0.57 ± 0.11 mm for the internal testing dataset, and 2.17 ± 0.68 mm and 0.57 ± 0.20 mm for the external testing dataset, respectively, outperforming most existing state-of-the-art methods. In addition, the automatic segmentation workflow reduced delineation time by 98% compared with the conventional manual contouring process (mean 173 ± 29 s vs. 12168 ± 1690 s; P < 0.001). An ablation study validated the effectiveness of the hybrid structure. CONCLUSION: The proposed deep learning framework achieved comparable or superior target volume delineation accuracy, significantly accelerating the radiotherapy planning process.


Subject(s)
Bone Marrow , Deep Learning , Radiotherapy Planning, Computer-Assisted , Humans , Bone Marrow/radiation effects , Bone Marrow/diagnostic imaging , Radiotherapy Planning, Computer-Assisted/methods , Lymphatic Irradiation/methods , Image Processing, Computer-Assisted/methods , Tomography, X-Ray Computed , Male , Female
7.
In Vivo ; 38(4): 1712-1718, 2024.
Article in English | MEDLINE | ID: mdl-38936930

ABSTRACT

BACKGROUND/AIM: Intensity-modulated radiation therapy can deliver a highly conformal dose to a target while minimizing the dose to the organs at risk (OARs). Delineating the contours of OARs is time-consuming, and various automatic contouring software programs have been employed to reduce the delineation time. However, some software operations remain manual, and further time reduction is possible. This study aimed to automate the running of atlas-based auto-segmentation (ABAS) and the associated software operations using a scripting function, thereby reducing work time. MATERIALS AND METHODS: The Dice coefficient and Hausdorff distance were used to determine geometric accuracy. The manual delineation, automatic delineation, and modification times were measured. While modifying the contours, the degree of subjective correction was rated on a four-point scale. RESULTS: The model exhibited generally good geometric accuracy. However, some OARs, such as the chiasm, optic nerve, retina, lens, and brain, require improvement. The average contour delineation time was reduced from 57 to 29 min (p<0.05). The subjective revision ratings indicated that most OARs required only minor modifications; only the submandibular gland, thyroid, and esophagus were rated as needing modification from scratch. CONCLUSION: The ABAS model and scripted automation in head and neck cancer reduced the work time and the number of software operations. The time can be further reduced by improving contour accuracy.


Subject(s)
Head and Neck Neoplasms , Organs at Risk , Radiotherapy Planning, Computer-Assisted , Radiotherapy, Intensity-Modulated , Software , Humans , Head and Neck Neoplasms/radiotherapy , Radiotherapy Planning, Computer-Assisted/methods , Radiotherapy, Intensity-Modulated/methods , Radiotherapy Dosage , Algorithms , Image Processing, Computer-Assisted/methods
9.
Phys Med Biol ; 69(11)2024 May 29.
Article in English | MEDLINE | ID: mdl-38663411

ABSTRACT

Objective: Deep-learning networks for super-resolution (SR) reconstruction enhance the spatial resolution of 3D magnetic resonance imaging (MRI) for MR-guided radiotherapy (MRgRT). However, variations between MRI scanners and patients impact the quality of SR for real-time 3D low-resolution (LR) cine MRI. In this study, we present a personalized super-resolution (psSR) network that incorporates transfer learning to overcome the challenges of inter-scanner SR of 3D cine MRI. Approach: Development of the proposed psSR network comprises two stages: (1) a cohort-specific SR (csSR) network using clinical patient datasets, and (2) a psSR network using transfer learning to target datasets. The csSR network was developed by training on breath-hold and respiratory-gated high-resolution (HR) 3D MRIs and their k-space down-sampled LR MRIs from 53 thoracoabdominal patients scanned at 1.5 T. The psSR network was developed through transfer learning to retrain the csSR network using a single breath-hold HR MRI and a corresponding 3D cine MRI from 5 healthy volunteers scanned at 0.55 T. Image quality was evaluated using the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM). Clinical feasibility was assessed by liver contouring on the psSR MRI using an auto-segmentation network and quantified using the Dice similarity coefficient (DSC). Results: Mean PSNR and SSIM values of psSR MRIs increased by 57.2% (13.8 to 21.7) and 94.7% (0.38 to 0.74) compared to cine MRIs, with the 0.55 T breath-hold HR MRI as reference. In the contour evaluation, DSC increased by 15% (0.79 to 0.91). Average times were 90 s for transfer learning, 4.51 ms per volume for psSR, and 210 ms for auto-segmentation. Significance: The proposed psSR reconstruction substantially increased image and segmentation quality of cine MRI in an average of 215 ms across scanners and patients, with less than 2 min of prerequisite transfer learning. This approach would be effective in overcoming the cohort- and scanner-dependency of deep learning for MRgRT.
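Both image-quality metrics reported here are available off the shelf; a minimal sketch using scikit-image (the random volumes are stand-ins for the HR reference and psSR output, not study data):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

hr = np.random.rand(64, 64, 64)                 # stand-in HR reference volume
sr = hr + 0.05 * np.random.randn(64, 64, 64)    # stand-in super-resolved volume

rng = hr.max() - hr.min()
print("PSNR:", peak_signal_noise_ratio(hr, sr, data_range=rng))
print("SSIM:", structural_similarity(hr, sr, data_range=rng))
```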


Subject(s)
Imaging, Three-Dimensional , Magnetic Resonance Imaging, Cine , Humans , Magnetic Resonance Imaging, Cine/methods , Imaging, Three-Dimensional/methods , Radiotherapy, Image-Guided/methods , Deep Learning
10.
BJR Artif Intell ; 1(1): ubae004, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38476956

ABSTRACT

Objectives: Auto-segmentation promises greater speed and lower inter-reader variability than manual segmentation in radiation oncology clinical practice. This study aims to implement and evaluate the accuracy of the auto-segmentation algorithm "Masked Image modeling using vision Transformers (SMIT)" for neck nodal metastases on longitudinal T2-weighted (T2w) MR images in oropharyngeal squamous cell carcinoma (OPSCC) patients. Methods: This prospective clinical trial study included 123 human papillomavirus-positive (HPV+) OPSCC patients who received concurrent chemoradiotherapy. T2w MR images were acquired on 3 T at pre-treatment (pre-Tx, week 0) and at intra-Tx weeks 1-3. Manual delineations of metastatic neck nodes from the 123 OPSCC patients were used for the SMIT auto-segmentation, and total tumor volumes were calculated. Standard statistical analyses compared contour volumes from SMIT vs manual segmentation (Wilcoxon signed-rank test [WSRT]), and Spearman's rank correlation coefficients (ρ) were computed. Segmentation accuracy was evaluated on the test dataset using the Dice similarity coefficient (DSC). P-values <0.05 were considered significant. Results: There was no significant difference between manual and SMIT-delineated tumor volumes at pre-Tx (8.68 ± 7.15 vs 8.38 ± 7.01 cm3, P = 0.26 [WSRT]), and the Bland-Altman method established the limits of agreement as -1.71 to 2.31 cm3, with a mean difference of 0.30 cm3. SMIT and manually delineated tumor volume estimates were highly correlated (ρ = 0.84-0.96, P < 0.001). The mean DSC values were 0.86, 0.85, 0.77, and 0.79 at pre-Tx and intra-Tx weeks 1-3, respectively. Conclusions: The SMIT algorithm provides sufficient segmentation accuracy for oncological applications in HPV+ OPSCC. Advances in knowledge: First evaluation of auto-segmentation with SMIT using longitudinal T2w MRI in HPV+ OPSCC.
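The statistical comparison combines three standard tests, all available in SciPy/NumPy; a minimal sketch of the volume-agreement analysis (the simulated volumes are stand-ins, not study data):

```python
import numpy as np
from scipy.stats import wilcoxon, spearmanr

rng = np.random.default_rng(0)
manual = rng.gamma(4.0, 2.0, size=123)           # stand-in manual volumes (cm^3)
auto = manual + rng.normal(0.3, 1.0, size=123)   # stand-in auto-segmented volumes

stat, p = wilcoxon(manual, auto)                 # paired non-parametric test
rho, p_rho = spearmanr(manual, auto)             # rank correlation

diff = auto - manual                             # Bland-Altman agreement
mean_diff = diff.mean()
loa = (mean_diff - 1.96 * diff.std(ddof=1),      # 95% limits of agreement
       mean_diff + 1.96 * diff.std(ddof=1))
print(f"p={p:.3f}, rho={rho:.2f}, bias={mean_diff:.2f}, LoA={loa}")
```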

11.
J Appl Clin Med Phys ; 25(3): e14297, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38373289

ABSTRACT

PURPOSE: Deep learning-based auto-segmentation algorithms can improve clinical workflow by defining accurate regions of interest while reducing manual labor. Over the past decade, convolutional neural networks (CNNs) have become prominent in medical image segmentation applications. However, CNNs have limitations in learning long-range spatial dependencies due to the locality of convolutional layers. Transformers were introduced to address this challenge: with the self-attention mechanism, even the first layer of information processing makes connections between distant image locations. Our paper presents a novel framework that bridges these two techniques, CNNs and transformers, to segment the gross tumor volume (GTV) accurately and efficiently in computed tomography (CT) images of non-small cell lung cancer (NSCLC) patients. METHODS: Under this framework, multi-resolution image inputs were used with multi-depth backbones to retain the benefits of both high- and low-resolution images in the deep learning architecture. Furthermore, a deformable transformer was utilized to learn long-range dependencies on the extracted features. To reduce computational complexity and to efficiently process multi-scale, multi-depth, high-resolution 3D images, this transformer attends to a small set of key positions identified by a self-attention mechanism. We evaluated the performance of the proposed framework on an NSCLC dataset containing 563 training images and 113 test images. Our novel deep learning algorithm was benchmarked against five other similar deep learning models. RESULTS: The experimental results indicate that our proposed framework outperforms other CNN-based, transformer-based, and hybrid methods in terms of Dice score (0.92) and Hausdorff Distance (1.33). Therefore, our proposed model could potentially improve the efficiency of auto-segmentation of early-stage NSCLC in the clinical workflow. This type of framework may also facilitate online adaptive radiotherapy, where an efficient auto-segmentation workflow is required. CONCLUSIONS: Our deep learning framework, based on CNNs and transformers, performs auto-segmentation efficiently and could potentially assist the clinical radiotherapy workflow.
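The long-range property noted above is easy to see in code: in a single self-attention layer, every patch token attends to every other token regardless of spatial distance. A minimal PyTorch sketch (the patch count and embedding size are illustrative, not the paper's architecture):

```python
import torch
import torch.nn as nn

tokens = torch.randn(1, 196, 256)            # 14x14 image patches, 256-dim each
attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
out, weights = attn(tokens, tokens, tokens)  # weights: (1, 196, 196)
# Even in this first layer, weights[0, i, j] links patch i to any patch j,
# however far apart the two patches are in the image.
```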


Subject(s)
Carcinoma, Non-Small-Cell Lung , Deep Learning , Lung Neoplasms , Humans , Lung Neoplasms/diagnostic imaging , Lung Neoplasms/radiotherapy , Tomography, X-Ray Computed , Neural Networks, Computer , Algorithms , Carcinoma, Non-Small-Cell Lung/diagnostic imaging , Carcinoma, Non-Small-Cell Lung/radiotherapy , Image Processing, Computer-Assisted/methods
12.
J Appl Clin Med Phys ; : e14296, 2024 Feb 22.
Article in English | MEDLINE | ID: mdl-38386963

ABSTRACT

BACKGROUND AND PURPOSE: In radiotherapy, magnetic resonance (MR) imaging has higher contrast for soft tissues than computed tomography (CT) and does not use ionizing radiation. However, manual annotation for deep learning-based automatic organ-at-risk (OAR) delineation algorithms is expensive, making the collection of large, high-quality annotated datasets a challenge. Therefore, we propose a low-cost semi-supervised OAR segmentation method using a small number of annotated pelvic MR images. METHODS: We trained a deep learning-based segmentation model using 116 sets of MR images from 116 patients. The bladder, femoral heads, rectum, and small intestine were selected as OAR regions. To generate the training set, we utilized a semi-supervised method and ensemble learning techniques. Additionally, we employed a post-processing algorithm to correct the self-annotation data. Both 2D and 3D auto-segmentation networks were evaluated for their performance. Furthermore, we evaluated the performance of the semi-supervised method with 50 labeled cases and with only 10 labeled cases. RESULTS: The Dice similarity coefficients (DSCs) of the bladder, femoral heads, rectum, and small intestine between the segmentation results and reference masks were 0.954, 0.984, 0.908, and 0.852, respectively, using only the self-annotation and post-processing methods with the 2D segmentation model. The corresponding DSCs were 0.871, 0.975, 0.975, 0.783, 0.724 using the 3D segmentation network and 0.896, 0.984, 0.890, 0.828 using the 2D segmentation network with the common supervised method. CONCLUSION: The outcomes of our study demonstrate that it is possible to train a multi-OAR segmentation model using a small number of annotated samples and additional unlabeled data. To effectively annotate the dataset, ensemble learning and post-processing methods were employed. Additionally, when dealing with anisotropy and limited sample sizes, the 2D model outperformed the 3D model.
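The ensemble self-annotation step can be reduced to a simple rule: several models predict masks for an unlabeled image, and voxels on which they agree become the pseudo-label. A minimal sketch assuming majority voting (one common choice; the paper's exact rule and post-processing may differ):

```python
import numpy as np

def ensemble_pseudo_label(predictions: list[np.ndarray],
                          threshold: float = 0.5) -> np.ndarray:
    """Majority-vote binary masks from an ensemble into a pseudo-label mask."""
    votes = np.mean([p.astype(np.float32) for p in predictions], axis=0)
    return (votes >= threshold).astype(np.uint8)  # 1 where most models agree
```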

13.
J Appl Clin Med Phys ; 25(6): e14273, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38263866

ABSTRACT

PURPOSE: Artificial intelligence (AI) based commercial software can be used to automatically delineate organs at risk (OARs), with potential for efficiency savings in the radiotherapy treatment planning pathway and a reduction of inter- and intra-observer variability. There has been little research investigating gross failure rates and failure modes of such systems. METHOD: Fifty head and neck (H&N) patient datasets with "gold standard" contours were compared to AI-generated contours to produce expected mean and standard deviation values of the Dice Similarity Coefficient (DSC) for four common H&N OARs (brainstem, mandible, left and right parotid). An AI-based commercial system was then applied to 500 H&N patients. AI-generated contours were compared to manual contours, outlined by an expert human, and a gross failure was set at three standard deviations below the expected mean DSC. Failures were inspected to assess the reason for failure of the AI-based system, with failures relating to suboptimal manual contouring censored. True failures were classified into four sub-types (setup position, anatomy, image artefacts, and unknown). RESULTS: There were 24 true failures of the AI-based commercial software, a gross failure rate of 1.2%. Fifteen failures were due to patient anatomy, four were due to dental image artefacts, three were due to patient position, and two were unknown. True failure rates by OAR were 0.4% (brainstem), 2.2% (mandible), 1.4% (left parotid), and 0.8% (right parotid). CONCLUSION: True failures of the AI-based system were predominantly associated with a non-standard element within the CT scan. It is likely that these non-standard elements were the reason for the gross failures, suggesting that the patient datasets used to train the AI model did not contain sufficiently heterogeneous data. Regardless of the reasons for failure, the true failure rate for the AI-based system in the H&N region for the OARs investigated was low (∼1%).
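The failure criterion itself is a one-line threshold once the expected DSC statistics have been established; a minimal sketch (variable names and the example numbers in the comment are hypothetical):

```python
import numpy as np

def flag_gross_failures(dsc: np.ndarray, expected_mean: float, expected_sd: float):
    """Flag cases whose DSC falls 3 standard deviations below the expected mean."""
    threshold = expected_mean - 3.0 * expected_sd
    return np.flatnonzero(dsc < threshold), threshold

# e.g., for an OAR with a hypothetical expected DSC of 0.90 +/- 0.04,
# any case below 0.78 would be flagged for inspection.
```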


Subject(s)
Algorithms , Artificial Intelligence , Head and Neck Neoplasms , Organs at Risk , Radiotherapy Dosage , Radiotherapy Planning, Computer-Assisted , Radiotherapy, Intensity-Modulated , Humans , Head and Neck Neoplasms/radiotherapy , Head and Neck Neoplasms/diagnostic imaging , Radiotherapy Planning, Computer-Assisted/methods , Organs at Risk/radiation effects , Radiotherapy, Intensity-Modulated/methods , Software , Image Processing, Computer-Assisted/methods , Tomography, X-Ray Computed/methods
14.
Radiat Oncol ; 19(1): 3, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38191431

ABSTRACT

OBJECTIVES: Deep learning-based auto-segmentation of head and neck cancer (HNC) tumors is expected to have better reproducibility than manual delineation. Positron emission tomography (PET) and computed tomography (CT) are commonly used in tumor segmentation. However, current methods still face challenges in handling whole-body scans, where manual selection of a bounding box may be required. Moreover, different institutions might apply different guidelines for tumor delineation. This study aimed to explore the auto-localization and segmentation of HNC tumors from entire PET/CT scans and to investigate the transferability of trained baseline models to external real-world cohorts. METHODS: We employed a 2D Retina Unet to find HNC tumors in whole-body PET/CT and utilized a regular Unet to segment the union of the tumor and involved lymph nodes. In comparison, 2D/3D Retina Unets were also implemented to localize and segment the same target in an end-to-end manner. Segmentation performance was evaluated via the Dice similarity coefficient (DSC) and the 95th-percentile Hausdorff distance (HD95). Delineated PET/CT scans from the HECKTOR challenge were used to train the baseline models by 5-fold cross-validation. Another 271 delineated PET/CTs from three different institutions (MAASTRO, CRO, BERLIN) were used for external testing. Finally, facility-specific transfer learning was applied to investigate the improvement in segmentation performance over the baseline models. RESULTS: Encouraging localization results were observed, with a maximum omnidirectional tumor center difference lower than 6.8 cm in external testing. The three baseline models yielded similar averaged cross-validation (CV) results, with a DSC in the range 0.71-0.75, while the averaged CV HD95 was 8.6, 10.7, and 9.8 mm for the regular Unet, 2D, and 3D Retina Unets, respectively. More than a 10% drop in DSC and a 40% increase in HD95 were observed when the baseline models were tested directly on the three external cohorts. After facility-specific training, an improvement in external testing was observed for all models. The regular Unet had the best DSC (0.70) for the MAASTRO cohort and the best HD95 (7.8 and 7.9 mm) in the MAASTRO and CRO cohorts. The 2D Retina Unet had the best DSC (0.76 and 0.67) for the CRO and BERLIN cohorts and the best HD95 (12.4 mm) for the BERLIN cohort. CONCLUSION: The regular Unet outperformed the other two baseline models in CV and in most external testing cohorts. Facility-specific transfer learning can potentially improve HNC segmentation performance for individual institutions, where the 2D Retina Unets could achieve comparable or even better results than the regular Unet.


Subject(s)
Deep Learning , Head and Neck Neoplasms , Humans , Positron Emission Tomography Computed Tomography , Reproducibility of Results , Head and Neck Neoplasms/diagnostic imaging , Positron-Emission Tomography
15.
Theranostics ; 14(3): 973-987, 2024.
Article in English | MEDLINE | ID: mdl-38250039

ABSTRACT

Rationale: Multimodal imaging provides important pharmacokinetic and dosimetry information during nanomedicine development and optimization. However, accurate quantitation is time-consuming, resource intensive, and requires anatomical expertise. Methods: We present NanoMASK: a 3D U-Net adapted deep learning tool capable of rapid, automatic organ segmentation of multimodal imaging data that can output key clinical dosimetry metrics without manual intervention. This model was trained on 355 manually-contoured PET/CT data volumes of mice injected with a variety of nanomaterials and imaged over 48 hours. Results: NanoMASK produced 3-dimensional contours of the heart, lungs, liver, spleen, kidneys, and tumor with high volumetric accuracy (pan-organ average %DSC of 92.5). Pharmacokinetic metrics including %ID/cc, %ID, and SUVmax achieved correlation coefficients exceeding R = 0.987 and relative mean errors below 0.2%. NanoMASK was applied to novel datasets of lipid nanoparticles and antibody-drug conjugates with a minimal drop in accuracy, illustrating its generalizability to different classes of nanomedicines. Furthermore, 20 additional auto-segmentation models were developed using training data subsets based on image modality, experimental imaging timepoint, and tumor status. These were used to explore the fundamental biases and dependencies of auto-segmentation models built on a 3D U-Net architecture, revealing significant differential impacts on organ segmentation accuracy. Conclusions: NanoMASK is an easy-to-use, adaptable tool for improving accuracy and throughput in imaging-based pharmacokinetic studies of nanomedicine. It has been made publicly available to all readers for automatic segmentation and pharmacokinetic analysis across a diverse array of nanoparticles, expediting agent development.


Subject(s)
Deep Learning , Neoplasms , Animals , Mice , Nanomedicine , Positron Emission Tomography Computed Tomography , Heart
16.
Phys Imaging Radiat Oncol ; 29: 100527, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38222671

ABSTRACT

Background and purpose: Autocontouring for radiotherapy has the potential to save significant time and reduce interobserver variability. We aimed to assess the performance of a commercial autocontouring model for head and neck (H&N) patients in eight orientations relevant to particle therapy with fixed beam lines, focusing on validation and implementation for routine clinical use. Materials and methods: Autocontouring was performed on sixteen organs at risk (OARs) for 98 adult and pediatric patients with 137 H&N CT scans in eight orientations. A geometric comparison of the autocontours and manual segmentations was performed using the 95th-percentile Hausdorff Distance (HD95), Dice Similarity Coefficient (DSC), and surface DSC, and compared to interobserver variability where available. Additional qualitative scoring and dose-volume histogram (DVH) parameter analyses were performed for twenty patients in two positions, consisting of scoring on a 0-3 scale based on clinical usability and comparing the mean (Dmean) and near-maximum (D2%) doses, respectively. Results: For the geometric analysis, the model performance in head-first-supine straight and hyperextended orientations was in the same range as the interobserver variability. HD95, DSC, and surface DSC were heterogeneous in the other orientations. No significant geometric differences were found between pediatric and adult autocontours. The qualitative scoring yielded a median score of ≥ 2 for 13/16 OARs, while 7/32 DVH parameters were significantly different. Conclusions: For head-first-supine straight and hyperextended scans, we found that 13/16 OAR autocontours were suited for use in daily clinical practice and subsequently implemented them. Further development is needed for other patient orientations before implementation.

17.
Med Phys ; 51(4): 2665-2677, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37888789

ABSTRACT

BACKGROUND: Accurate segmentation of the clinical target volume (CTV) corresponding to the prostate with or without proximal seminal vesicles is required on transrectal ultrasound (TRUS) images during prostate brachytherapy procedures. Implanted needles cause artifacts that may make this task difficult and time-consuming. Thus, previous studies have focused on the simpler problem of segmentation in the absence of needles, at the cost of reduced clinical utility. PURPOSE: To use a convolutional neural network (CNN) algorithm for segmentation of the prostatic CTV in TRUS images post-needle insertion obtained from prostate brachytherapy procedures, to better meet the demands of the clinical procedure. METHODS: A dataset consisting of 144 3-dimensional (3D) TRUS images with implanted metal brachytherapy needles and associated manual CTV segmentations was used for training a 2-dimensional (2D) U-Net CNN with a Dice Similarity Coefficient (DSC) loss function. These were split by patient, with 119 used for training and 25 reserved for testing. The 3D TRUS training images were resliced at radial (around the axis normal to the coronal plane) and oblique angles through the center of the 3D image, as well as axial, coronal, and sagittal planes, to obtain 3689 2D TRUS images and masks for training. The network generated boundary predictions on 300 2D TRUS images obtained by reslicing each of the 25 3D TRUS test images into 12 radial slices (15° apart), which were then reconstructed into 3D surfaces. Performance metrics included DSC, recall, precision, unsigned and signed volume percentage differences (VPD/sVPD), mean surface distance (MSD), and Hausdorff distance (HD). In addition, we studied whether providing algorithm-predicted boundaries to the physicians and allowing modifications increased the agreement between physicians. This was performed by providing a subset of 3D TRUS images of five patients to five physicians, who segmented the CTV using clinical software and repeated this at least 1 week apart. The five physicians were then given the algorithm boundary predictions and allowed to modify them, and the resulting inter- and intra-physician variability was evaluated. RESULTS: Median DSC, recall, precision, VPD, sVPD, MSD, and HD of the 3D-reconstructed algorithm segmentations were 87.2 [84.1, 88.8]%, 89.0 [86.3, 92.4]%, 86.6 [78.5, 90.8]%, 10.3 [4.5, 18.4]%, 2.0 [-4.5, 18.4]%, 1.6 [1.2, 2.0] mm, and 6.0 [5.3, 8.0] mm, respectively. Segmentation time for a set of 12 2D radial images was 2.46 [2.44, 2.48] s. With and without U-Net starting points, the intra-physician median DSCs were 97.0 [96.3, 97.8]% and 94.4 [92.5, 95.4]% (p < 0.0001), respectively, while the inter-physician median DSCs were 94.8 [93.3, 96.8]% and 90.2 [88.7, 92.1]%, respectively (p < 0.0001). The median segmentation times for physicians, with and without U-Net-generated CTV boundaries, were 257.5 [211.8, 300.0] s and 288.0 [232.0, 333.5] s, respectively (p = 0.1034). CONCLUSIONS: Our algorithm performed at a level similar to physicians in a fraction of the time. The use of algorithm-generated boundaries as a starting point, with modifications allowed, reduced physician variability, although it did not significantly reduce the time compared to manual segmentations.
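The radial reslicing used to multiply the 2D training data can be approximated by rotating the volume about its central axis and extracting a fixed central plane; a minimal sketch (the rotation axes, 180°/12 = 15° step, and linear interpolation are assumptions about the geometry, not taken from the paper):

```python
import numpy as np
from scipy.ndimage import rotate

def radial_slices(volume: np.ndarray, n_slices: int = 12) -> list[np.ndarray]:
    """Extract n_slices 2D planes, 180/n_slices degrees apart (15° for n=12),
    each passing through the centre of the 3D volume."""
    slices = []
    for k in range(n_slices):
        rotated = rotate(volume, angle=k * 180.0 / n_slices,
                         axes=(1, 2), reshape=False, order=1)
        slices.append(rotated[:, :, rotated.shape[2] // 2])  # central plane
    return slices
```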


Subject(s)
Brachytherapy , Deep Learning , Prostatic Neoplasms , Male , Humans , Prostate/diagnostic imaging , Brachytherapy/methods , Ultrasonography , Algorithms , Image Processing, Computer-Assisted/methods , Prostatic Neoplasms/diagnostic imaging , Prostatic Neoplasms/radiotherapy
18.
J Radiat Res ; 65(1): 1-9, 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-37996085

ABSTRACT

This review provides an overview of the application of artificial intelligence (AI) in radiation therapy (RT) from a radiation oncologist's perspective. Over the years, advances in diagnostic imaging have significantly improved the efficiency and effectiveness of radiotherapy. The introduction of AI has further optimized the segmentation of tumors and organs at risk, thereby saving considerable time for radiation oncologists. AI has also been utilized in treatment planning and optimization, reducing the planning time from several days to minutes or even seconds. Knowledge-based treatment planning and deep learning techniques have been employed to produce treatment plans comparable to those generated by humans. Additionally, AI has potential applications in quality control and assurance of treatment plans, optimization of image-guided RT and monitoring of mobile tumors during treatment. Prognostic evaluation and prediction using AI have been increasingly explored, with radiomics being a prominent area of research. The future of AI in radiation oncology offers the potential to establish treatment standardization by minimizing inter-observer differences in segmentation and improving dose adequacy evaluation. RT standardization through AI may have global implications, providing world-standard treatment even in resource-limited settings. However, there are challenges in accumulating big data, including patient background information and correlating treatment plans with disease outcomes. Although challenges remain, ongoing research and the integration of AI technology hold promise for further advancements in radiation oncology.


Subject(s)
Neoplasms , Radiation Oncology , Radiotherapy, Image-Guided , Humans , Artificial Intelligence , Radiotherapy Planning, Computer-Assisted/methods , Neoplasms/radiotherapy , Radiation Oncology/methods
19.
Med Phys ; 51(4): 2741-2758, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38015793

ABSTRACT

BACKGROUND: For autosegmentation models, the data used to train the model (e.g., public datasets and/or vendor-collected data) and the data on which the model is deployed in the clinic are typically not the same, potentially degrading model performance through a process called domain shift. Tools to routinely monitor and predict segmentation performance are needed for quality assurance. Here, we develop an approach to perform such monitoring and performance prediction for cardiac substructure segmentation. PURPOSE: To develop a quality assurance (QA) framework for routine or continuous monitoring of domain shift and of the performance of cardiac substructure autosegmentation algorithms. METHODS: A benchmark dataset consisting of computed tomography (CT) images along with manual cardiac substructure delineations of 241 breast cancer radiotherapy patients was collected, including one "normal" image domain of clean images and five "abnormal" domains containing images with artifacts (metal, contrast), pathology, or quality variations due to scanner protocol differences (field of view, noise, reconstruction kernel, and slice thickness). The QA framework consisted of an image domain shift detector operating on the input CT images, a shape quality detector operating on the output of an autosegmentation model, and a regression model for predicting autosegmentation model performance. The image domain shift detector was composed of a trained denoising autoencoder (DAE) and two hand-engineered image quality features to detect normal versus abnormal domains in the input CT images. The shape quality detector was a variational autoencoder (VAE) trained to estimate the shape quality of the auto-segmentation results. The output from the image domain shift and shape quality detectors was used to train a regression model to predict per-patient segmentation accuracy, measured by the Dice similarity coefficient (DSC) relative to physician contours. Different regression techniques were investigated, including linear regression, bagging, Gaussian process regression, random forest, and gradient-boosted regression. Of the 241 patients, 60 were used to train the autosegmentation models, 120 to train the QA framework, and the remaining 61 to test the QA framework. A total of 19 autosegmentation models were used to evaluate QA framework performance, including 18 convolutional neural network (CNN)-based models and one transformer-based model. RESULTS: When tested on the benchmark dataset, all abnormal domains resulted in a significant DSC decrease relative to the normal domain for CNN models (p < 0.001), but only for some domains for the transformer model. No significant relationship was found between the performance of an autosegmentation model and scanner protocol parameters (p = 0.42) except noise (p = 0.01). CNN-based autosegmentation models demonstrated a DSC decrease ranging from 0.07 to 0.41 with added noise, while the transformer-based model was not significantly affected (ANOVA, p = 0.99). For the QA framework, linear regression models with bootstrap aggregation performed best, with a mean absolute error (MAE) of 0.041 ± 0.002 in predicted DSC (relative to the true DSC between the autosegmentation and physician contours). MAE was lowest when combining both the input (image) detectors and the output (shape) detectors rather than using the output detectors alone.
CONCLUSIONS: The QA framework was able to predict cardiac substructure autosegmentation model performance for clinically anticipated "abnormal" domain shifts.
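The final stage of the framework, mapping the detector outputs to a predicted per-patient DSC, is ordinary supervised regression; a minimal sketch of the bagged linear regression reported as best (the three features and simulated targets are stand-ins for the DAE, image-quality, and VAE outputs, not study data):

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 3))                        # stand-in detector features
y = np.clip(0.85 + 0.05 * X @ np.array([0.5, -0.3, 0.6])
            + rng.normal(0.0, 0.02, 120), 0.0, 1.0)  # stand-in true DSC values

# Bootstrap aggregation over linear regressors, the combination reported best
model = BaggingRegressor(estimator=LinearRegression(), n_estimators=50,
                         random_state=0).fit(X[:90], y[:90])
print("MAE:", mean_absolute_error(y[90:], model.predict(X[90:])))
```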


Subject(s)
Deep Learning , Humans , Tomography, X-Ray Computed/methods , Neural Networks, Computer , Heart/diagnostic imaging , Breast , Image Processing, Computer-Assisted/methods
20.
Radiother Oncol ; 191: 110061, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38122850

ABSTRACT

PURPOSE: Accurate and comprehensive segmentation of cardiac substructures is crucial for minimizing the risk of radiation-induced heart disease in lung cancer radiotherapy. We sought to develop and validate deep learning-based auto-segmentation models for cardiac substructures. MATERIALS AND METHODS: Nineteen cardiac substructures (whole heart, 4 heart chambers, 6 great vessels, 4 valves, and 4 coronary arteries) in 100 patients treated for non-small cell lung cancer were manually delineated by two radiation oncologists. The valves and coronary arteries were delineated as planning risk volumes. An nnU-Net auto-segmentation model was trained, validated, and tested on this dataset with a split ratio of 75:5:20. The auto-segmented contours were evaluated by comparing them with manually drawn contours in terms of the Dice similarity coefficient (DSC) and dose metrics extracted from clinical plans. An independent dataset of 42 patients was used for subjective evaluation of the auto-segmentation model by 4 physicians. RESULTS: The average DSCs were 0.95 (± 0.01) for the whole heart, 0.91 (± 0.02) for the 4 chambers, 0.86 (± 0.09) for the 6 great vessels, 0.81 (± 0.09) for the 4 valves, and 0.60 (± 0.14) for the 4 coronary arteries. The average absolute errors in mean/max doses to all substructures were 1.04 (± 1.99) Gy and 2.20 (± 4.37) Gy. The subjective evaluation revealed that 94% of the auto-segmented contours were clinically acceptable. CONCLUSION: We demonstrated the effectiveness of our nnU-Net model for delineating cardiac substructures, including coronary arteries. Our results indicate that this model holds promise for studies of radiation dose to cardiac substructures.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Deep Learning , Lung Neoplasms , Humans , Lung Neoplasms/diagnostic imaging , Lung Neoplasms/radiotherapy , Carcinoma, Non-Small-Cell Lung/diagnostic imaging , Carcinoma, Non-Small-Cell Lung/radiotherapy , Radiotherapy Planning, Computer-Assisted/methods , Heart/diagnostic imaging , Organs at Risk