Results 1 - 20 of 89
1.
ArXiv ; 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-39010873

ABSTRACT

Lung injuries, such as ventilator-induced lung injury and radiation-induced lung injury, can lead to heterogeneous alterations in the biomechanical behavior of the lungs. While imaging methods, e.g., X-ray and static computed tomography (CT), can point to regional alterations in lung structure between healthy and diseased tissue, they fall short of delineating timewise kinematic variations between the former and the latter. Image registration has gained recent interest as a tool to estimate the displacement experienced by the lungs during respiration via regional deformation metrics such as volumetric expansion and distortion. However, successful image registration commonly relies on a temporal series of image stacks with small displacements in the lungs across succeeding image stacks, which remains limited in static imaging. In this study, we have presented a finite element (FE) method to estimate strains from static images acquired at the end-expiration (EE) and end-inspiration (EI) timepoints, i.e., images with a large deformation between the two distant timepoints. Physiologically realistic loads were applied to the geometry obtained at EE to deform this geometry to match the geometry obtained at EI. The results indicated that the simulation could minimize the error between the two geometries. Using four-dimensional (4D) dynamic CT in a rat, the strain at an isolated transverse plane estimated by our method showed sufficient agreement with that estimated through non-rigid image registration that used all the timepoints. Through the proposed method, we can estimate the lung deformation at any timepoint between EE and EI. The proposed method offers a tool to estimate timewise regional deformation in the lungs using only static images acquired at EE and EI.

2.
bioRxiv ; 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38895261

ABSTRACT

The quantification of cardiac motion using cardiac magnetic resonance imaging (CMR) has shown promise as an early-stage marker for cardiovascular diseases. Despite the growing popularity of CMR-based myocardial strain calculations, measures of complete spatiotemporal strains (i.e., three-dimensional strains over the cardiac cycle) remain elusive. Complete spatiotemporal strain calculations are primarily hampered by poor spatial resolution, with the rapid motion of the cardiac wall also challenging the reproducibility of such strains. We hypothesize that a super-resolution reconstruction (SRR) framework that leverages combined image acquisitions at multiple orientations will enhance the reproducibility of complete spatiotemporal strain estimation. Two sets of CMR acquisitions were obtained for five wild-type mice, combining short-axis scans with radial and orthogonal long-axis scans. Super-resolution reconstruction, integrated with tissue classification, was performed to generate full four-dimensional (4D) images. The resulting enhanced and full 4D images enabled complete quantification of the motion in terms of 4D myocardial strains. Additionally, the effects of SRR in improving accurate strain measurements were evaluated using an in-silico heart phantom. The SRR framework revealed near isotropic spatial resolution, high structural similarity, and minimal loss of contrast, which led to overall improvements in strain accuracy. In essence, a comprehensive methodology was generated to quantify complete and reproducible myocardial deformation, aiding in the much-needed standardization of complete spatiotemporal strain calculations.

3.
ArXiv ; 2024 May 03.
Article in English | MEDLINE | ID: mdl-38745699

ABSTRACT

Background: The findings of the 2023 AAPM Grand Challenge on Deep Generative Modeling for Learning Medical Image Statistics are reported in this Special Report. Purpose: The goal of this challenge was to promote the development of deep generative models (DGMs) for medical imaging and to emphasize the need for their domain-relevant assessment via the analysis of relevant image statistics. Methods: As part of this Grand Challenge, a common training dataset and an evaluation procedure were developed for benchmarking deep generative models for medical image synthesis. To create the training dataset, an established 3D virtual breast phantom was adapted. The resulting dataset comprised about 108,000 images of size 512×512. For the evaluation of submissions to the Challenge, an ensemble of 10,000 DGM-generated images from each submission was employed. The evaluation procedure consisted of two stages. In the first stage, a preliminary check for memorization and image quality (via the Fréchet Inception Distance (FID)) was performed. Submissions that passed the first stage were then evaluated for the reproducibility of image statistics corresponding to several feature families, including texture, morphology, image moments, fractal statistics, and skeleton statistics. A summary measure in this feature space was employed to rank the submissions. Additional analyses of submissions were performed to assess DGM performance specific to individual feature families and the four classes in the training data, and to identify various artifacts. Results: Fifty-eight submissions from 12 unique users were received for this Challenge. Of these, 9 submissions passed the first stage of evaluation and were eligible for ranking. The top-ranked submission employed a conditional latent diffusion model, whereas the joint runners-up employed a generative adversarial network, followed by another network for image super-resolution.
In general, we observed that the overall ranking of the top 9 submissions according to our evaluation method (i) did not match the FID-based ranking, and (ii) differed with respect to individual feature families. Another important finding from our additional analyses was that different DGMs demonstrated similar kinds of artifacts. Conclusions: This Grand Challenge highlighted the need for domain-specific evaluation to further DGM design as well as deployment. It also demonstrated that the specification of a DGM may differ depending on its intended use.
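The FID used in the first-stage check compares Gaussian fits to feature embeddings of the real and generated image ensembles. A minimal sketch of the underlying Fréchet distance, assuming the feature vectors (normally Inception embeddings) have already been computed; `frechet_distance` and its inputs are illustrative names, not the Challenge's actual evaluation code:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussian fits to two feature ensembles.

    feats_*: (n_samples, n_features) arrays of embedding vectors.
    """
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(cov_r @ cov_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny numerical imaginary parts
    diff = mu_r - mu_g
    return diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean)
```

Identical ensembles give a distance of (numerically) zero; the distance grows with any mismatch in the fitted means or covariances.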

4.
J Med Imaging (Bellingham) ; 11(2): 024504, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38576536

ABSTRACT

Purpose: The Medical Imaging and Data Resource Center (MIDRC) was created to facilitate medical imaging machine learning (ML) research for tasks including early detection, diagnosis, prognosis, and assessment of treatment response related to the coronavirus disease 2019 pandemic and beyond. The purpose of this work was to create a publicly available metrology resource to assist researchers in evaluating the performance of their medical image analysis ML algorithms. Approach: An interactive decision tree, called MIDRC-MetricTree, has been developed, organized by the type of task that the ML algorithm was trained to perform. The criteria for this decision tree were that (1) users can select information such as the type of task, the nature of the reference standard, and the type of the algorithm output and (2) based on the user input, recommendations are provided regarding appropriate performance evaluation approaches and metrics, including literature references and, when possible, links to publicly available software/code as well as short tutorial videos. Results: Five types of tasks were identified for the decision tree: (a) classification, (b) detection/localization, (c) segmentation, (d) time-to-event (TTE) analysis, and (e) estimation. As an example, the classification branch of the decision tree includes two-class (binary) and multiclass classification tasks and provides suggestions for methods, metrics, software/code recommendations, and literature references for situations where the algorithm produces either binary or non-binary (e.g., continuous) output and for reference standards with negligible or non-negligible variability and unreliability. Conclusions: The publicly available decision tree is a resource to assist researchers in conducting task-specific performance evaluations, including classification, detection/localization, segmentation, TTE, and estimation tasks.

5.
J Med Imaging (Bellingham) ; 10(6): 064501, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38074627

ABSTRACT

Purpose: The Medical Imaging and Data Resource Center (MIDRC) is a multi-institutional effort to accelerate medical imaging machine intelligence research and create a publicly available image repository/commons as well as a sequestered commons for performance evaluation and benchmarking of algorithms. After de-identification, approximately 80% of the medical images and associated metadata become part of the open commons and 20% are sequestered from the open commons. To ensure that both commons are representative of the available population, we introduced a stratified sampling method to balance the demographic characteristics across the two datasets. Approach: Our method uses multi-dimensional stratified sampling, where several demographic variables of interest are sequentially used to separate the data into individual strata, each representing a unique combination of variables. Within each resulting stratum, patients are assigned to the open or sequestered commons. This algorithm was applied to an example dataset containing 5000 patients, using the variables of race, age, sex at birth, ethnicity, COVID-19 status, and image modality, and the resulting demographic distributions were compared to naïve random sampling of the dataset over 2000 independent trials. Results: The resulting prevalence of each demographic variable matched the prevalence in the input dataset within one standard deviation. Mann-Whitney U test results supported the hypothesis that sequestration by stratified sampling provided more balanced subsets than naïve randomization, except for demographic subcategories with very low prevalence. Conclusions: The developed multi-dimensional stratified sampling algorithm can partition a large dataset while maintaining balance across several variables, achieving balance superior to that obtained from naïve randomization.
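The sequential stratification described above can be sketched as follows: group records by the joint value of the demographic variables, then split each stratum roughly 80/20. This is a simplified illustration of the idea, not the MIDRC implementation, and all names are hypothetical:

```python
import random
from collections import defaultdict

def stratified_split(patients, keys, open_frac=0.8, seed=0):
    """Partition patient records into open/sequestered sets so that each
    unique combination of the stratification variables keeps roughly
    `open_frac` of its members in the open commons.

    patients: list of dicts; keys: stratification variable names.
    """
    # build one stratum per unique combination of variable values
    strata = defaultdict(list)
    for p in patients:
        strata[tuple(p[k] for k in keys)].append(p)
    rng = random.Random(seed)
    open_set, sequestered = [], []
    for members in strata.values():
        rng.shuffle(members)
        cut = round(open_frac * len(members))
        open_set.extend(members[:cut])
        sequestered.extend(members[cut:])
    return open_set, sequestered
```

Because the split is performed within each stratum rather than over the pooled dataset, the prevalence of each variable combination is preserved in both subsets by construction.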

6.
J Med Imaging (Bellingham) ; 10(6): 61105, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37469387

ABSTRACT

Purpose: The Medical Imaging and Data Resource Center (MIDRC) open data commons was launched to accelerate the development of artificial intelligence (AI) algorithms to help address the COVID-19 pandemic. The purpose of this study was to quantify the longitudinal representativeness of the demographic characteristics of the primary MIDRC dataset compared to the United States general population (US Census) and COVID-19 positive case counts from the Centers for Disease Control and Prevention (CDC). Approach: The Jensen-Shannon distance (JSD), a measure of the similarity of two distributions, was used to longitudinally measure the representativeness of the distribution of (1) all unique patients in the MIDRC data to the 2020 US Census and (2) all unique COVID-19 positive patients in the MIDRC data to the case counts reported by the CDC. The distributions were evaluated in the demographic categories of age at index, sex, race, ethnicity, and the combination of race and ethnicity. Results: Representativeness of the MIDRC data by ethnicity and the combination of race and ethnicity was impacted by the percentage of CDC case counts for which these were not reported. The distributions by sex and race have retained their level of representativeness over time. Conclusion: The representativeness of the open medical imaging datasets in the curated public data commons at MIDRC has evolved over time as the number of contributing institutions and the overall number of subjects have grown. The use of metrics such as the JSD to measure representativeness is one step needed for fair and generalizable AI algorithm development.
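The JSD used here is the square root of the Jensen-Shannon divergence between two discrete distributions. A minimal sketch using base-2 logarithms, so the distance lies in [0, 1]; the function name and inputs are illustrative, not the study's code:

```python
import numpy as np

def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions.

    p, q: sequences of non-negative weights (normalized internally).
    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support (base-2 logs).
    """
    p = np.asarray(p, float); p = p / p.sum()
    q = np.asarray(q, float); q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        mask = a > 0  # 0 * log(0) terms contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return np.sqrt(0.5 * kl(p, m) + 0.5 * kl(q, m))
```

For example, comparing the age distribution of a dataset to the census age distribution yields a single number that can be tracked over time as the dataset grows.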

7.
J Med Imaging (Bellingham) ; 10(6): 061104, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37125409

ABSTRACT

Purpose: There is increasing interest in developing medical imaging-based machine learning methods, also known as medical imaging artificial intelligence (AI), for the detection, diagnosis, prognosis, and risk assessment of disease, with the goal of clinical implementation. These tools are intended to improve traditional human decision-making in medical imaging. However, biases introduced in the steps toward clinical deployment may impede their intended function, potentially exacerbating inequities: medical imaging AI can propagate or amplify biases introduced at any of the many steps from model inception to deployment, resulting in systematic differences in the treatment of different groups. Recognizing and addressing these sources of bias is essential for algorithmic fairness and trustworthiness and for a just and equitable deployment of AI in medical imaging. Approach: Our multi-institutional team included medical physicists, medical imaging artificial intelligence/machine learning (AI/ML) researchers, experts in AI/ML bias, statisticians, physicians, and scientists from regulatory bodies. We identified sources of bias in AI/ML and mitigation strategies for these biases, and we developed recommendations for best practices in medical imaging AI/ML development. Results: Five main steps along the roadmap of medical imaging AI/ML were identified: (1) data collection, (2) data preparation and annotation, (3) model development, (4) model evaluation, and (5) model deployment. Within these steps, or bias categories, we identified 29 sources of potential bias, many of which can impact multiple steps, as well as mitigation strategies. Conclusions: Our findings provide a valuable resource to researchers, clinicians, and the public at large.

8.
Vis Comput Ind Biomed Art ; 6(1): 9, 2023 May 18.
Article in English | MEDLINE | ID: mdl-37198498

ABSTRACT

The large language model called ChatGPT has drawn extensive attention because of its human-like expression and reasoning abilities. In this study, we investigate the feasibility of using ChatGPT to translate radiology reports into plain language for patients and healthcare providers, with the aim of supporting education for improved healthcare. Radiology reports from 62 low-dose chest computed tomography lung cancer screening scans and 76 brain magnetic resonance imaging metastases screening scans were collected in the first half of February for this study. According to the evaluation by radiologists, ChatGPT can successfully translate radiology reports into plain language, with an average score of 4.27 on a five-point scale and an average of 0.08 instances of missing information and 0.07 instances of misinformation per report. The suggestions provided by ChatGPT are generally relevant, such as keeping follow-up appointments with doctors and closely monitoring any symptoms, and in about 37% of the 138 cases in total, ChatGPT offered specific suggestions based on findings in the report. ChatGPT also exhibits some randomness in its responses, occasionally over-simplifying or omitting information, which can be mitigated with a more detailed prompt. Furthermore, ChatGPT results are compared with those of the newly released larger model GPT-4, showing that GPT-4 can significantly improve the quality of translated reports. Our results show that it is feasible to utilize large language models in clinical education, and further efforts are needed to address their limitations and maximize their potential.

9.
IEEE Trans Med Imaging ; 42(6): 1799-1808, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37022374

ABSTRACT

In recent years, generative adversarial networks (GANs) have gained tremendous popularity for potential applications in medical imaging, such as medical image synthesis, restoration, reconstruction, translation, as well as objective image quality assessment. Despite the impressive progress in generating high-resolution, perceptually realistic images, it is not clear whether modern GANs reliably learn the statistics that are meaningful to a downstream medical imaging application. In this work, the ability of a state-of-the-art GAN to learn the statistics of canonical stochastic image models (SIMs) that are relevant to objective assessment of image quality is investigated. It is shown that although the employed GAN successfully learned several basic first- and second-order statistics of the specific medical SIMs under consideration and generated images with high perceptual quality, it failed to correctly learn several per-image statistics pertinent to these SIMs, highlighting the urgent need to assess medical image GANs in terms of objective measures of image quality.
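Checking whether a generative model reproduces first- and second-order ensemble statistics amounts to comparing quantities such as the ensemble mean image, the pixelwise variance, and short-lag autocovariances between real and generated ensembles. A toy sketch of such quantities, not the paper's actual statistical battery; the function name is hypothetical:

```python
import numpy as np

def ensemble_stats(images):
    """First- and simple second-order ensemble statistics.

    images: (n, ny, nx) stack of image realizations.
    Returns (mean image, pixelwise variance image, lag-1 horizontal
    autocovariance averaged over the ensemble).
    """
    imgs = np.asarray(images, float)
    mean_img = imgs.mean(axis=0)
    var_img = imgs.var(axis=0)
    centered = imgs - mean_img
    # covariance between horizontally adjacent pixels, averaged over
    # all pixel pairs and all realizations
    lag1 = (centered[:, :, :-1] * centered[:, :, 1:]).mean()
    return mean_img, var_img, lag1
```

Comparing these statistics between a real ensemble and a GAN-generated ensemble gives a first, coarse check; per-image statistics (computed on each realization and compared as distributions) probe the model further.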

10.
Med Phys ; 50(7): 4151-4172, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37057360

ABSTRACT

BACKGROUND: This study reports the results of a set of discrimination experiments using simulated images that represent the appearance of subtle lesions in low-dose computed tomography (CT) of the lungs. Noise in these images has a characteristic ramp-spectrum before apodization by noise control filters. We consider three specific diagnostic features that determine whether a lesion is considered malignant or benign, two system-resolution levels, and four apodization levels for a total of 24 experimental conditions. PURPOSE: The goal of the investigation is to better understand how well human observers perform subtle discrimination tasks like these, and the mechanisms of that performance. We use a forced-choice psychophysical paradigm to estimate observer efficiency and classification images. These measures quantify how effectively subjects can read the images, and how they use images to perform discrimination tasks across the different imaging conditions. MATERIALS AND METHODS: The simulated CT images used as stimuli in the psychophysical experiments are generated from high-resolution objects passed through a modulation transfer function (MTF) before down-sampling to the image-pixel grid. Acquisition noise is then added with a ramp noise-power spectrum (NPS), with subsequent smoothing through apodization filters. The features considered are lesion size, indistinct lesion boundary, and a nonuniform lesion interior. System resolution is implemented by an MTF with resolution (10% max.) of 0.47 or 0.58 cyc/mm. Apodization is implemented by a Shepp-Logan filter (Sinc profile) with various cutoffs. Six medically naïve subjects participated in the psychophysical studies, entailing training and testing components for each condition. Training consisted of staircase procedures to find the 80% correct threshold for each subject, and testing involved 2000 psychophysical trials at the threshold value for each subject. 
Human-observer performance is compared to the Ideal Observer to generate estimates of task efficiency. The significance of imaging factors is assessed using ANOVA. Classification images are used to estimate the linear template weights used by subjects to perform these tasks. Classification-image spectra are used to analyze subject weights in the spatial-frequency domain. RESULTS: Overall, average observer efficiency is relatively low in these experiments (10%-40%) relative to detection and localization studies reported previously. We find significant effects for feature type and apodization level on observer efficiency. Somewhat surprisingly, system resolution is not a significant factor. Efficiency effects of the different features appear to be well explained by the profile of the linear templates in the classification images. Increasingly strong apodization is found to both increase the classification-image weights and to increase the mean-frequency of the classification-image spectra. A secondary analysis of "Unapodized" classification images shows that this is largely due to observers undoing (inverting) the effects of apodization filters. CONCLUSIONS: These studies demonstrate that human observers can be relatively inefficient at feature-discrimination tasks in ramp-spectrum noise. Observers appear to be adapting to frequency suppression implemented in apodization filters, but there are residual effects that are not explained by spatial weighting patterns. The studies also suggest that the mechanisms for improving performance through the application of noise-control filters may require further investigation.
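Observer efficiency is conventionally defined as the squared ratio of the human observer's detectability index d' to that of the Ideal Observer. Under the standard two-alternative forced-choice (2AFC) relation Pc = Φ(d'/√2), efficiency can be estimated from proportion-correct data. A sketch assuming this textbook definition; the study's exact estimator may differ:

```python
from math import sqrt
from statistics import NormalDist

def dprime_2afc(pc):
    """Detectability index d' from proportion correct in a 2AFC task,
    using Pc = Phi(d'/sqrt(2))."""
    return sqrt(2.0) * NormalDist().inv_cdf(pc)

def efficiency(pc_human, pc_ideal):
    """Observer efficiency: squared ratio of human to ideal d'."""
    return (dprime_2afc(pc_human) / dprime_2afc(pc_ideal)) ** 2
```

For instance, a human at 70% correct against an ideal observer at 90% correct on the same images yields an efficiency of roughly 17%, within the 10%-40% range reported here.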


Subject(s)
Image Processing, Computer-Assisted , Tomography, X-Ray Computed , Humans , Image Processing, Computer-Assisted/methods , Phantoms, Imaging , Algorithms
11.
J Med Imaging (Bellingham) ; 9(Suppl 1): S12200, 2022 Feb.
Article in English | MEDLINE | ID: mdl-36247334

ABSTRACT

The article introduces the JMI Special Issue Celebrating 50 Years of SPIE Medical Imaging.

12.
J Med Imaging (Bellingham) ; 9(Suppl 1): 012207, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35761820

ABSTRACT

Purpose: To commemorate the 50th anniversary of the first SPIE Medical Imaging meeting, we highlight some of the important publications published in the conference proceedings. Approach: We determined the top cited and downloaded papers. We also asked members of the editorial board of the Journal of Medical Imaging to select their favorite papers. Results: There was very little overlap between the three methods of highlighting papers. The downloads were mostly recent papers, whereas the favorite papers were mostly older papers. Conclusions: The three different methods combined provide an overview of the highlights of the papers published in the SPIE Medical Imaging conference proceedings over the last 50 years.

13.
Nat Mach Intell ; 4(11): 922-929, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36935774

ABSTRACT

The metaverse integrates physical and virtual realities, enabling humans and their avatars to interact in an environment supported by technologies such as high-speed internet, virtual reality, augmented reality, mixed and extended reality, blockchain, digital twins and artificial intelligence (AI), all enriched by effectively unlimited data. The metaverse has recently emerged in social media and entertainment platforms, but its extension to healthcare could have a profound impact on clinical practice and human health. As a group of academic, industrial, clinical and regulatory researchers, we identify unique opportunities for metaverse approaches in the healthcare domain. A metaverse of 'medical technology and AI' (MeTAI) can facilitate the development, prototyping, evaluation, regulation, translation and refinement of AI-based medical practice, especially medical imaging-guided diagnosis and therapy. Here, we present metaverse use cases, including virtual comparative scanning, raw data sharing, augmented regulatory science and metaversed medical intervention. We discuss relevant issues on the ecosystem of the MeTAI metaverse, including privacy, security and disparity. We also identify specific action items for coordinated efforts to build the MeTAI metaverse for improved healthcare quality, accessibility, cost-effectiveness and patient satisfaction.

14.
Med Phys ; 49(2): 836-853, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34954845

ABSTRACT

PURPOSE: Deep learning (DL) is rapidly finding applications in low-dose CT image denoising. While having the potential to improve the image quality (IQ) over the filtered back projection method (FBP) and produce images quickly, performance generalizability of the data-driven DL methods is not fully understood yet. The main purpose of this work is to investigate the performance generalizability of a low-dose CT image denoising neural network in data acquired under different scan conditions, particularly relating to these three parameters: reconstruction kernel, slice thickness, and dose (noise) level. A secondary goal is to identify any underlying data property associated with the CT scan settings that might help predict the generalizability of the denoising network. METHODS: We select the residual encoder-decoder convolutional neural network (REDCNN) as an example of a low-dose CT image denoising technique in this work. To study how the network generalizes on the three imaging parameters, we grouped the CT volumes in the Low-Dose Grand Challenge (LDGC) data into three pairs of training datasets according to their imaging parameters, changing only one parameter in each pair. We trained REDCNN with them to obtain six denoising models. We test each denoising model on datasets of matching and mismatching parameters with respect to its training sets regarding dose, reconstruction kernel, and slice thickness, respectively, to evaluate the denoising performance changes. Denoising performances are evaluated on patient scans, simulated phantom scans, and physical phantom scans using IQ metrics including mean-squared error (MSE), contrast-dependent modulation transfer function (MTF), pixel-level noise power spectrum (pNPS), and low-contrast lesion detectability (LCD). RESULTS: REDCNN had larger MSE when the testing data were different from the training data in reconstruction kernel, but no significant MSE difference when varying slice thickness in the testing data. 
REDCNN trained with quarter-dose data had slightly worse MSE in denoising higher-dose images than that trained with mixed-dose data (17%-80%). The MTF tests showed that REDCNN trained with the two reconstruction kernels and slice thicknesses yielded images of similar image resolution. However, REDCNN trained with mixed-dose data preserved the low-contrast resolution better compared to REDCNN trained with quarter-dose data. In the pNPS test, it was found that REDCNN trained with smooth-kernel data could not remove high-frequency noise in the test data of the sharp kernel, possibly because the lack of high-frequency noise in the smooth-kernel data limited the ability of the trained model to remove high-frequency noise. Finally, in the LCD test, REDCNN improved the lesion detectability over the original FBP images regardless of whether the training and testing data had matching reconstruction kernels. CONCLUSIONS: REDCNN is observed to be poorly generalizable between reconstruction kernels, more robust in denoising data of arbitrary dose levels when trained with mixed-dose data, and not highly sensitive to slice thickness. It is known that the reconstruction kernel affects the in-plane pNPS shape of a CT image, whereas slice thickness and dose level do not, so it is possible that the generalizability of this CT image denoising network correlates highly with the pNPS similarity between the testing and training data.
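The pixel-level NPS referred to above is typically estimated from an ensemble of repeated noise realizations via the squared magnitude of their Fourier transforms. A simplified 2D sketch that omits the detrending and windowing used in practice; names are illustrative, not the study's code:

```python
import numpy as np

def noise_power_spectrum(rois, pixel_size=1.0):
    """2D noise power spectrum from an ensemble of noise-only ROIs.

    rois: (n, ny, nx) array of repeated noise realizations.
    Returns the ensemble-averaged NPS (same shape as one ROI); for
    unit-variance white noise its mean value equals the noise variance
    times the pixel area.
    """
    rois = np.asarray(rois, float)
    n, ny, nx = rois.shape
    # subtract the ensemble mean to isolate the noise component
    noise = rois - rois.mean(axis=0, keepdims=True)
    spectra = np.abs(np.fft.fft2(noise)) ** 2
    return spectra.mean(axis=0) * pixel_size ** 2 / (nx * ny)
```

Comparing the shape of this spectrum between training and testing conditions (e.g., smooth vs. sharp kernel) is one way to quantify the pNPS similarity that the conclusion points to.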


Subject(s)
Deep Learning , Algorithms , Humans , Image Processing, Computer-Assisted , Neural Networks, Computer , Phantoms, Imaging , Radiation Dosage , Signal-To-Noise Ratio , Tomography, X-Ray Computed
17.
PET Clin ; 16(4): 493-511, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34537127

ABSTRACT

Artificial intelligence-based methods are showing promise in medical imaging applications. There is substantial interest in clinical translation of these methods, requiring that they be evaluated rigorously. We lay out a framework for objective task-based evaluation of artificial intelligence methods. We provide a list of available tools to conduct this evaluation. We outline the important role of physicians in conducting these evaluation studies. The examples in this article are proposed in the context of PET scans with a focus on evaluating neural network-based methods. However, the framework is also applicable to evaluate other medical imaging modalities and other types of artificial intelligence methods.


Subject(s)
Artificial Intelligence , Physicians , Humans , Positron-Emission Tomography
19.
Cureus ; 12(2): e7144, 2020 Feb 29.
Article in English | MEDLINE | ID: mdl-32257689

ABSTRACT

Antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis (AAV) is a systemic necrotizing inflammation of the small vessels. Central nervous system (CNS) ANCA-associated vasculitis is a rare manifestation of AAV. Three mechanisms by which AAV affects the CNS have been reported: contiguous granulomatous invasion from the nasal and paranasal sinuses, remote granulomatous lesions, and vasculitis of small vessels. Chronic hypertrophic pachymeningitis (CHP) is the meningeal-site involvement in AAV caused by granulomatous inflammation in the dura mater. We present a case of pachymeningitis manifesting as slowly progressive cognitive dysfunction, leptomeningeal enhancement on magnetic resonance imaging (MRI), and necrotic vessels with surrounding inflammation on biopsy. This case represents a rare development of subsequent CNS AAV in a patient with ANCA-associated interstitial lung disease treated with rituximab, with resolution of the leptomeningeal enhancement on follow-up MRI.

20.
J Med Imaging (Bellingham) ; 7(1): 012701, 2020 Jan.
Article in English | MEDLINE | ID: mdl-32206681

ABSTRACT

The editorial introduces the Special Section on Evaluation Methodologies for Clinical AI.
