Results 1 - 20 of 39
1.
BJR Artif Intell ; 1(1): ubae006, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38828430

ABSTRACT

Innovation in medical imaging artificial intelligence (AI)/machine learning (ML) demands extensive data collection, algorithmic advancements, and rigorous performance assessments encompassing aspects such as generalizability, uncertainty, bias, fairness, trustworthiness, and interpretability. Achieving widespread integration of AI/ML algorithms into diverse clinical tasks will demand a steadfast commitment to overcoming issues in model design, development, and performance assessment. The complexities of AI/ML clinical translation present substantial challenges, requiring engagement with relevant stakeholders, assessment of cost-effectiveness for user and patient benefit, timely dissemination of information relevant to robust functioning throughout the AI/ML lifecycle, consideration of regulatory compliance, and feedback loops for real-world performance evidence. This commentary addresses several hurdles for the development and adoption of AI/ML technologies in medical imaging. Comprehensive attention to these underlying and often subtle factors is critical not only for tackling the challenges but also for exploring novel opportunities for the advancement of AI in radiology.

2.
J Biopharm Stat ; : 1-19, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38889012

ABSTRACT

BACKGROUND: Positive and negative likelihood ratios (PLR and NLR) are important metrics of accuracy for diagnostic devices with a binary output. However, the properties of Bayesian and frequentist interval estimators of PLR/NLR have not been extensively studied and compared. In this study, we explore the potential use of the Bayesian method for interval estimation of PLR/NLR, and, more broadly, for interval estimation of the ratio of two independent proportions. METHODS: We develop a Bayesian-based approach for interval estimation of PLR/NLR for use as a part of a diagnostic device performance evaluation. Our approach is applicable to a broader setting for interval estimation of any ratio of two independent proportions. We compare score and Bayesian interval estimators for the ratio of two proportions in terms of the coverage probability (CP) and expected interval width (EW) via extensive experiments and applications to two case studies. A supplementary experiment was also conducted to assess the performance of the proposed exact Bayesian method under different priors. RESULTS: Our experimental results show that the overall mean CP for Bayesian interval estimation is consistent with that for the score method (0.950 vs. 0.952), and the overall mean EW for Bayesian is shorter than that for score method (15.929 vs. 19.724). Application to two case studies showed that the intervals estimated using the Bayesian and frequentist approaches are very similar. DISCUSSION: Our numerical results indicate that the proposed Bayesian approach has a comparable CP performance with the score method while yielding higher precision (i.e. a shorter EW).
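
As a rough illustration of Bayesian interval estimation for a ratio of two independent proportions, the sketch below draws from independent Beta posteriors and takes equal-tailed percentiles of the simulated ratio. It is a Monte Carlo approximation, not the exact method proposed in the paper; the Jeffreys Beta(0.5, 0.5) prior, the function name `bayes_ratio_interval`, and the example counts are all assumptions made for illustration.

```python
import random


def bayes_ratio_interval(x1, n1, x2, n2, level=0.95, draws=20000,
                         prior_a=0.5, prior_b=0.5, seed=0):
    """Equal-tailed credible interval for p1 / p2 (Monte Carlo sketch)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(draws):
        # Independent Beta posteriors under an assumed Jeffreys prior.
        p1 = rng.betavariate(x1 + prior_a, n1 - x1 + prior_b)
        p2 = rng.betavariate(x2 + prior_a, n2 - x2 + prior_b)
        ratios.append(p1 / p2)
    ratios.sort()
    lower = ratios[int((1 - level) / 2 * draws)]
    upper = ratios[int((1 + level) / 2 * draws) - 1]
    return lower, upper


# Hypothetical counts: 90 of 100 diseased subjects test positive (sensitivity),
# 10 of 100 non-diseased subjects test positive (1 - specificity).
plr_lower, plr_upper = bayes_ratio_interval(90, 100, 10, 100)
print(f"PLR 95% interval: ({plr_lower:.2f}, {plr_upper:.2f})")
```

The same function gives an NLR interval when called with the false-negative and true-negative counts, since NLR = (1 - sensitivity) / specificity is also a ratio of two independent proportions.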

3.
BJR Artif Intell ; 1(1): ubae003, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38476957

ABSTRACT

The adoption of artificial intelligence (AI) tools in medicine poses challenges to existing clinical workflows. This commentary discusses the necessity of context-specific quality assurance (QA), emphasizing the need for robust QA measures with quality control (QC) procedures that encompass (1) acceptance testing (AT) before clinical use, (2) continuous QC monitoring, and (3) adequate user training. The discussion also covers essential components of AT and QA, illustrated with real-world examples. We also highlight what we see as the shared responsibility of manufacturers or vendors, regulators, healthcare systems, medical physicists, and clinicians to enact appropriate testing and oversight to ensure a safe and equitable transformation of medicine through AI.

4.
J Med Imaging (Bellingham) ; 11(1): 017502, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38370423

ABSTRACT

Purpose: Endometrial cancer (EC) is the most common gynecologic malignancy in the United States, and atypical endometrial hyperplasia (AEH) is considered a high-risk precursor to EC. Hormone therapies and hysterectomy are practical treatment options for AEH and early-stage EC. Some patients prefer hormone therapies for reasons such as fertility preservation or being poor surgical candidates. However, accurate prediction of an individual patient's response to hormonal treatment would allow for personalized and potentially improved recommendations for these conditions. This study aims to explore the feasibility of using deep learning models on whole slide images (WSI) of endometrial tissue samples to predict the patient's response to hormonal treatment. Approach: We curated a clinical WSI dataset of 112 patients from two clinical sites. An expert pathologist annotated these images by outlining AEH/EC regions. We developed an end-to-end machine learning model with mixed supervision. The model is based on image patches extracted from pathologist-annotated AEH/EC regions. Either an unsupervised deep learning architecture (Autoencoder or ResNet50), or non-deep learning (radiomics feature extraction) is used to embed the images into a low-dimensional space, followed by fully connected layers for binary prediction, which was trained with binary responder/non-responder labels established by pathologists. We used stratified sampling to partition the dataset into a development set and a test set for internal validation of the performance of our models. Results: The autoencoder model yielded an AUROC of 0.80 with 95% CI [0.63, 0.95] on the independent test set for the task of predicting a patient with AEH/EC as a responder vs non-responder to hormonal treatment. Conclusions: These findings demonstrate the potential of using mixed supervised machine learning models on WSIs for predicting the response to hormonal treatment in AEH/EC patients.
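
An AUROC with a 95% CI on a small test set, as reported above, can be estimated in several ways. The sketch below assumes a rank-based (Mann-Whitney) AUROC and a percentile bootstrap over cases; the abstract does not state which CI method the authors used, and the function names and data here are hypothetical.

```python
import random


def auroc(scores, labels):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, level=0.95, reps=2000, seed=0):
    """Percentile bootstrap interval, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < reps:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    return stats[int((1 - level) / 2 * reps)], stats[int((1 + level) / 2 * reps) - 1]


# Hypothetical model outputs for ten patients (1 = responder, 0 = non-responder).
scores = [0.9, 0.8, 0.75, 0.6, 0.55, 0.5, 0.4, 0.35, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
ci_low, ci_high = bootstrap_ci(scores, labels)
print(auroc(scores, labels), ci_low, ci_high)
```

With only a few dozen test cases, the interval is wide, which is consistent with the broad CI quoted in the abstract.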

5.
J Med Imaging (Bellingham) ; 11(1): 014501, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38283653

ABSTRACT

Purpose: Understanding an artificial intelligence (AI) model's ability to generalize to its target population is critical to ensuring the safe and effective usage of AI in medical devices. A traditional generalizability assessment relies on the availability of large, diverse datasets, which are difficult to obtain in many medical imaging applications. We present an approach for enhanced generalizability assessment by examining the decision space beyond the available testing data distribution. Approach: Vicinal distributions of virtual samples are generated by interpolating between triplets of test images. The generated virtual samples leverage the characteristics already in the test set, increasing the sample diversity while remaining close to the AI model's data manifold. We demonstrate the generalizability assessment approach on the non-clinical tasks of classifying patient sex, race, COVID status, and age group from chest x-rays. Results: Decision region composition analysis for generalizability indicated that a disproportionately large portion of the decision space belonged to a single "preferred" class for each task, despite comparable performance on the evaluation dataset. Evaluation using cross-reactivity and population shift strategies indicated a tendency to overpredict samples as belonging to the preferred class (e.g., COVID negative) for patients whose subgroup was not represented in the model development data. Conclusions: An analysis of an AI model's decision space has the potential to provide insight into model generalizability. Our approach uses the analysis of composition of the decision space to obtain an improved assessment of model generalizability in the case of limited test data.
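
The idea of generating virtual samples by interpolating between triplets of test images can be sketched as a pixel-wise convex combination with weights drawn from the simplex. The abstract does not specify the interpolation scheme, so the weighting below, the helper names, and the toy images are assumptions for illustration only.

```python
import random


def simplex_weights(rng):
    """Draw three non-negative weights that sum to 1 (uniform on the simplex)."""
    draws = [rng.expovariate(1.0) for _ in range(3)]
    total = sum(draws)
    return [d / total for d in draws]


def virtual_sample(img_a, img_b, img_c, weights):
    """Pixel-wise convex combination of three equally sized images."""
    wa, wb, wc = weights
    return [wa * a + wb * b + wc * c for a, b, c in zip(img_a, img_b, img_c)]


rng = random.Random(0)
# Three tiny hypothetical "images", flattened to lists of pixel intensities.
triplet = ([0.0, 0.2, 0.4], [1.0, 0.8, 0.6], [0.5, 0.5, 0.5])
weights = simplex_weights(rng)
sample = virtual_sample(*triplet, weights)
print(weights, sample)
```

Because the weights are non-negative and sum to 1, every virtual pixel lies between the smallest and largest corresponding pixel of the three source images. Classifying many such samples and tallying the predicted classes gives one possible view of the decision region composition.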

6.
Clin Pharmacol Ther ; 115(4): 745-757, 2024 04.
Article in English | MEDLINE | ID: mdl-37965805

ABSTRACT

In 2020, Novartis Pharmaceuticals Corporation and the U.S. Food and Drug Administration (FDA) started a 4-year scientific collaboration to approach complex new data modalities and advanced analytics. The scientific question was to find novel radio-genomics-based prognostic and predictive factors for HR+/HER- metastatic breast cancer under a Research Collaboration Agreement. This collaboration has been providing valuable insights to help successfully implement future scientific projects, particularly using artificial intelligence and machine learning. This tutorial aims to provide tangible guidelines for a multi-omics project that includes multidisciplinary expert teams, spanning across different institutions. We cover key ideas, such as "maintaining effective communication" and "following good data science practices," followed by the four steps of exploratory projects, namely (1) plan, (2) design, (3) develop, and (4) disseminate. We break each step into smaller concepts with strategies for implementation and provide illustrations from our collaboration to further give the readers actionable guidance.


Subjects
Artificial Intelligence , Multiomics , Humans , Machine Learning , Genomics
7.
Article in English | MEDLINE | ID: mdl-38083445

ABSTRACT

Labeled ECG data in diseased states are relatively scarce due to various concerns including patient privacy and low prevalence. We propose the first study of its kind that synthesizes atrial fibrillation (AF)-like ECG signals from normal ECG signals using the AFE-GAN, a generative adversarial network. Our AFE-GAN adjusts both beat morphology and rhythm variability when generating the atrial fibrillation-like ECG signals. Two publicly available arrhythmia detectors classified 72.4% and 77.2% of our generated signals as AF in a four-class (normal, AF, other abnormal, noisy) classification. This work shows the feasibility of synthesizing abnormal ECG signals from normal ECG signals. Clinical significance: The AF ECG signal generated with our AFE-GAN has the potential to be used as training material for health practitioners or as a class-balance supplement for training automatic AF detectors.


Subjects
Atrial Fibrillation , Humans , Atrial Fibrillation/diagnosis , Electrocardiography , Cardiac Conduction System Disease
8.
Article in English | MEDLINE | ID: mdl-37159719

ABSTRACT

Endometrial cancer (EC) is the most common gynecologic malignancy in the US, and complex atypical hyperplasia (CAH) is considered a high-risk precursor to EC. Treatment options for CAH and early-stage EC include hormone therapies and hysterectomy, with the former preferred by certain patients, e.g., for fertility preservation or poor surgical candidates. Accurate prediction of response to hormonal treatment would allow for personalized and potentially improved recommendations for the treatment of these conditions. In this study, we investigate the feasibility of utilizing weakly supervised deep learning models on whole slide images of endometrial tissue samples for the prediction of patient response to hormonal treatment. We curated a clinical whole-slide-image (WSI) dataset of 112 patients from two clinical sites. We developed an end-to-end machine learning model using WSIs of endometrial specimens for the prediction of hormonal treatment response among women with CAH/EC. The model takes patches extracted from pathologist-annotated CAH/EC regions as input and utilizes an unsupervised deep learning architecture (Autoencoder or ResNet50) to embed the images into a low-dimensional space, followed by fully connected layers for binary prediction. Our autoencoder model yielded an AUC of 0.79 with 95% CI [0.61, 0.98] on a hold-out test set in the task of predicting a patient with CAH/EC as a responder vs non-responder to hormonal treatment. Our results demonstrate the potential for using weakly supervised machine learning models on WSIs for predicting response to hormonal treatment of CAH/EC patients.

9.
JAMA Netw Open ; 6(2): e230524, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36821110

ABSTRACT

Importance: An accurate and robust artificial intelligence (AI) algorithm for detecting cancer in digital breast tomosynthesis (DBT) could significantly improve detection accuracy and reduce health care costs worldwide. Objectives: To make training and evaluation data for the development of AI algorithms for DBT analysis available, to develop well-defined benchmarks, and to create publicly available code for existing methods. Design, Setting, and Participants: This diagnostic study is based on a multi-institutional international grand challenge in which research teams developed algorithms to detect lesions in DBT. A data set of 22 032 reconstructed DBT volumes was made available to research teams. Phase 1, in which teams were provided 700 scans from the training set, 120 from the validation set, and 180 from the test set, took place from December 2020 to January 2021, and phase 2, in which teams were given the full data set, took place from May to July 2021. Main Outcomes and Measures: The overall performance was evaluated by mean sensitivity for biopsied lesions using only DBT volumes with biopsied lesions; ties were broken by including all DBT volumes. Results: A total of 8 teams participated in the challenge. The team with the highest mean sensitivity for biopsied lesions was the NYU B-Team, with 0.957 (95% CI, 0.924-0.984), and the second-place team, ZeDuS, had a mean sensitivity of 0.926 (95% CI, 0.881-0.964). When the results were aggregated, the mean sensitivity for all submitted algorithms was 0.879; for only those who participated in phase 2, it was 0.926. Conclusions and Relevance: In this diagnostic study, an international competition produced algorithms with high sensitivity for using AI to detect lesions on DBT images. 
A standardized performance benchmark for the detection task using publicly available clinical imaging data was released, with detailed descriptions and analyses of submitted algorithms accompanied by a public release of their predictions and code for selected methods. These resources will serve as a foundation for future research on computer-assisted diagnosis methods for DBT, significantly lowering the barrier of entry for new researchers.


Subjects
Artificial Intelligence , Breast Neoplasms , Humans , Female , Benchmarking , Mammography/methods , Algorithms , Computer-Assisted Radiographic Image Interpretation/methods , Breast Neoplasms/diagnostic imaging
10.
Med Phys ; 50(2): e1-e24, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36565447

ABSTRACT

Rapid advances in artificial intelligence (AI) and machine learning, and specifically in deep learning (DL) techniques, have enabled broad application of these methods in health care. The promise of the DL approach has spurred further interest in computer-aided diagnosis (CAD) development and applications using both "traditional" machine learning methods and newer DL-based methods. We use the term CAD-AI to refer to this expanded clinical decision support environment that uses traditional and DL-based AI methods. Numerous studies have been published to date on the development of machine learning tools for computer-aided, or AI-assisted, clinical tasks. However, most of these machine learning models are not ready for clinical deployment. It is of paramount importance to ensure that a clinical decision support tool undergoes proper training and rigorous validation of its generalizability and robustness before adoption for patient care in the clinic. To address these important issues, the American Association of Physicists in Medicine (AAPM) Computer-Aided Image Analysis Subcommittee (CADSC) is charged, in part, to develop recommendations on practices and standards for the development and performance assessment of computer-aided decision support systems. The committee has previously published two opinion papers on the evaluation of CAD systems and issues associated with user training and quality assurance of these systems in the clinic. With machine learning techniques continuing to evolve and CAD applications expanding to new stages of the patient care process, the current task group report considers the broader issues common to the development of most, if not all, CAD-AI applications and their translation from the bench to the clinic. The goal is to bring attention to the proper training and validation of machine learning algorithms that may improve their generalizability and reliability and accelerate the adoption of CAD-AI systems for clinical decision support.


Subjects
Artificial Intelligence , Computer-Assisted Diagnosis , Humans , Reproducibility of Results , Computer-Assisted Diagnosis/methods , Diagnostic Imaging , Machine Learning
11.
BMC Bioinformatics ; 23(1): 544, 2022 Dec 16.
Article in English | MEDLINE | ID: mdl-36526957

ABSTRACT

BACKGROUND: The Basic Local Alignment Search Tool (BLAST) is a suite of commonly used algorithms for identifying matches between biological sequences. The user supplies a database file and query file of sequences for BLAST to find identical sequences between the two. The typical millions of database and query sequences make BLAST computationally challenging but also well suited for parallelization on high-performance computing clusters. The efficacy of parallelization depends on the data partitioning, where the optimal data partitioning relies on an accurate performance model. In previous studies, a BLAST job was sped up by 27 times by partitioning the database and query among thousands of processor nodes. However, the optimality of the partitioning method was not studied. Unlike BLAST performance models proposed in the literature that usually have problem size and hardware configuration as the only variables, the execution time of a BLAST job is a function of database size, query size, and hardware capability. In this work, the nucleotide BLAST application BLASTN was profiled using three methods: shell-level profiling with the Unix "time" command, code-level profiling with the built-in "profiler" module, and system-level profiling with the Unix "gprof" program. The runtimes were measured for six node types, using six different database files and 15 query files, on a heterogeneous HPC cluster with 500+ nodes. The empirical measurement data were fitted with quadratic functions to develop performance models that were used to guide the data parallelization for BLASTN jobs. RESULTS: Profiling results showed that BLASTN contains more than 34,500 different functions, but a single function, RunMTBySplitDB, takes 99.12% of the total runtime. Among its 53 child functions, five core functions were identified to make up 92.12% of the overall BLASTN runtime. 
Based on the performance models, static load balancing algorithms can be applied to the BLASTN input data to minimize the runtime of the longest job on an HPC cluster. Four test cases being run on homogeneous and heterogeneous clusters were tested. Experiment results showed that the runtime can be reduced by 81% on a homogeneous cluster and by 20% on a heterogeneous cluster by re-distributing the workload. DISCUSSION: Optimal data partitioning can improve BLASTN's overall runtime 5.4-fold in comparison with dividing the database and query into the same number of fragments. The proposed methodology can be used in the other applications in the BLAST+ suite or any other application as long as source code is available.
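
The combination of a quadratic performance model with static load balancing can be sketched as follows. This is a greedy heuristic (largest fragment first, assigned to the node predicted to finish it earliest), not necessarily the algorithm used in the paper; the coefficient values, node names, and fragment sizes are hypothetical.

```python
def predicted_runtime(size, coeffs):
    """Quadratic performance model: t = a*size**2 + b*size + c."""
    a, b, c = coeffs
    return a * size ** 2 + b * size + c


def balance(fragment_sizes, node_models):
    """Greedy static assignment: largest fragment first, to the node that
    would finish it earliest. Returns (assignment, makespan)."""
    finish = {node: 0.0 for node in node_models}
    assignment = {node: [] for node in node_models}
    for size in sorted(fragment_sizes, reverse=True):
        best = min(node_models,
                   key=lambda n: finish[n] + predicted_runtime(size, node_models[n]))
        finish[best] += predicted_runtime(size, node_models[best])
        assignment[best].append(size)
    return assignment, max(finish.values())


# Hypothetical coefficients: the "slow" node takes twice as long per unit of work.
models = {"fast": (0.0, 1.0, 0.0), "slow": (0.0, 2.0, 0.0)}
assignment, makespan = balance([4, 2, 2], models)
print(assignment, makespan)
```

The makespan (runtime of the longest job) is the quantity being minimized. A greedy pass like this is not guaranteed to be optimal, but it shows how per-node models let a heterogeneous cluster receive unequal shares of the input.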


Subjects
Computing Methodologies , Software , Algorithms , Computational Biology/methods , Sequence Alignment
12.
Tomography ; 8(2): 644-656, 2022 03 02.
Article in English | MEDLINE | ID: mdl-35314631

ABSTRACT

This observer study investigates the effect of computerized artificial intelligence (AI)-based decision support system (CDSS-T) on physicians' diagnostic accuracy in assessing bladder cancer treatment response. The performance of 17 observers was evaluated when assessing bladder cancer treatment response without and with CDSS-T using pre- and post-chemotherapy CTU scans in 123 patients having 157 pre- and post-treatment cancer pairs. The impact of cancer case difficulty, observers' clinical experience, institution affiliation, specialty, and the assessment times on the observers' diagnostic performance with and without using CDSS-T were analyzed. It was found that the average performance of the 17 observers was significantly improved (p = 0.002) when aided by the CDSS-T. The cancer case difficulty, institution affiliation, specialty, and the assessment times influenced the observers' performance without CDSS-T. The AI-based decision support system has the potential to improve the diagnostic accuracy in assessing bladder cancer treatment response and result in more consistent performance among all physicians.


Subjects
Clinical Decision Support Systems , Urinary Bladder Neoplasms , Artificial Intelligence , Humans , X-Ray Computed Tomography , Urinary Bladder Neoplasms/diagnostic imaging , Urinary Bladder Neoplasms/therapy , Urography
13.
J Med Imaging (Bellingham) ; 8(3): 034501, 2021 May.
Article in English | MEDLINE | ID: mdl-33987451

ABSTRACT

Purpose: The breast pathology quantitative biomarkers (BreastPathQ) challenge was a grand challenge organized jointly by the International Society for Optics and Photonics (SPIE), the American Association of Physicists in Medicine (AAPM), the U.S. National Cancer Institute (NCI), and the U.S. Food and Drug Administration (FDA). The task of the BreastPathQ challenge was computerized estimation of tumor cellularity (TC) in breast cancer histology images following neoadjuvant treatment. Approach: A total of 39 teams developed, validated, and tested their TC estimation algorithms during the challenge. The training, validation, and testing sets consisted of 2394, 185, and 1119 image patches originating from 63, 6, and 27 scanned pathology slides from 33, 4, and 18 patients, respectively. The summary performance metric used for comparing and ranking algorithms was the average prediction probability concordance (PK) using scores from two pathologists as the TC reference standard. Results: Test PK performance ranged from 0.497 to 0.941 across the 100 submitted algorithms. The submitted algorithms generally performed well in estimating TC, with high-performing algorithms obtaining comparable results to the average interrater PK of 0.927 from the two pathologists providing the reference TC scores. Conclusions: The SPIE-AAPM-NCI BreastPathQ challenge was a success, indicating that artificial intelligence/machine learning algorithms may be able to approach human performance for cellularity assessment and may have some utility in clinical practice for improving efficiency and reducing reader variability. The BreastPathQ challenge can be accessed on the Grand Challenge website.
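
For readers unfamiliar with prediction probability concordance, the sketch below follows the commonly cited pairwise definition: over all case pairs whose reference scores differ, PK is the share of concordant pairs plus half the share of pairs tied in the prediction. This is an assumption about the metric's form, not the challenge's official scoring code, and the function name and data are hypothetical.

```python
from itertools import combinations


def prediction_probability(predicted, reference):
    """PK = (concordant + 0.5 * ties in prediction) / (pairs with distinct reference)."""
    concordant = discordant = tied_pred = 0
    for (p1, r1), (p2, r2) in combinations(zip(predicted, reference), 2):
        if r1 == r2:
            continue  # pairs tied in the reference carry no ordering information
        if p1 == p2:
            tied_pred += 1
        elif (p1 - p2) * (r1 - r2) > 0:
            concordant += 1
        else:
            discordant += 1
    usable = concordant + discordant + tied_pred
    return (concordant + 0.5 * tied_pred) / usable


# Hypothetical tumor cellularity scores for four patches.
reference = [0.1, 0.4, 0.4, 0.9]
predicted = [0.2, 0.3, 0.5, 0.8]
print(prediction_probability(predicted, reference))
```

A value of 1 means the predictions order every usable pair the same way as the reference, 0.5 corresponds to chance, and the function raises `ZeroDivisionError` if all reference scores are identical. With two reference pathologists, one option is to compute PK against each and average the two values.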

14.
Tomography ; 7(1): 10-19, 2021 03.
Article in English | MEDLINE | ID: mdl-33681460

ABSTRACT

(1) Purpose: The objective was to evaluate CT perfusion and radiomic features for prediction of one-year disease-free survival in laryngeal and hypopharyngeal cancer. (2) Method and Materials: This retrospective study included pre- and post-therapy CT neck studies in 36 patients with laryngeal/hypopharyngeal cancer. Tumor contouring was performed semi-autonomously by the computer and manually by two radiologists. Twenty-six radiomic features including morphological and gray-level features were extracted by an internally developed and validated computer-aided image analysis system. The five perfusion features analyzed included permeability surface area product (PS), blood flow (flow), blood volume (BV), mean transit time (MTT), and time-to-maximum (Tmax). One-year persistent/recurrent disease data were obtained following the final treatment of definitive chemoradiation or after total laryngectomy. We performed a two-loop leave-one-out feature selection and linear discriminant analysis classifier with generation of receiver operating characteristic (ROC) curves and confidence intervals (CI). (3) Results: Ten patients (28%) had recurrent/persistent disease at 1 year. For prediction, the change in blood flow demonstrated a training AUC of 0.68 (CI 0.47-0.85) and testing AUC of 0.66 (CI 0.47-0.85). The best features selected were a combination of perfusion and radiomic features including blood flow and computer-estimated percent volume changes, with a training AUC of 0.68 (CI 0.5-0.85) and testing AUC of 0.69 (CI 0.5-0.85). The laryngoscopic percent change in volume was a poor predictor with a testing AUC of 0.4 (CI 0.16-0.57). (4) Conclusions: A combination of CT perfusion and radiomic features is a potential predictor of one-year disease-free survival in laryngeal and hypopharyngeal cancer patients.


Subjects
Hypopharyngeal Neoplasms , Disease-Free Survival , Humans , Hypopharyngeal Neoplasms/diagnostic imaging , Hypopharyngeal Neoplasms/surgery , Local Neoplasm Recurrence , Perfusion , Pilot Projects , Retrospective Studies , X-Ray Computed Tomography
15.
Tomography ; 6(2): 194-202, 2020 06.
Article in English | MEDLINE | ID: mdl-32548296

ABSTRACT

We evaluated the intraobserver variability of physicians aided by a computerized decision-support system for treatment response assessment (CDSS-T) to identify patients who show complete response to neoadjuvant chemotherapy for bladder cancer, and the effects of the intraobserver variability on physicians' assessment accuracy. A CDSS-T tool was developed that uses a combination of deep learning neural network and radiomic features from computed tomography (CT) scans to detect bladder cancers that have fully responded to neoadjuvant treatment. Pre- and postchemotherapy CT scans of 157 bladder cancers from 123 patients were collected. In a multireader, multicase observer study, physician-observers estimated the likelihood of pathologic T0 disease by viewing paired pre/posttreatment CT scans placed side by side on an in-house-developed graphical user interface. Five abdominal radiologists, 4 diagnostic radiology residents, 2 oncologists, and 1 urologist participated as observers. They first provided an estimate without CDSS-T and then with CDSS-T. A subset of cases was evaluated twice to study the intraobserver variability and its effects on observer consistency. The mean areas under the curves for assessment of pathologic T0 disease were 0.85 for CDSS-T alone, 0.76 for physicians without CDSS-T and improved to 0.80 for physicians with CDSS-T (P = .001) in the original evaluation, and 0.78 for physicians without CDSS-T and improved to 0.81 for physicians with CDSS-T (P = .010) in the repeated evaluation. The intraobserver variability was significantly reduced with CDSS-T (P < .0001). The CDSS-T can significantly reduce physicians' variability and improve their accuracy for identifying complete response of muscle-invasive bladder cancer to neoadjuvant chemotherapy.


Subjects
Clinical Decision Support Systems , Urinary Bladder Neoplasms , Humans , Observer Variation , Physicians , X-Ray Computed Tomography , Urinary Bladder Neoplasms/diagnostic imaging , Urinary Bladder Neoplasms/drug therapy
16.
J Med Imaging (Bellingham) ; 7(1): 012703, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31763356

ABSTRACT

We evaluated whether using synthetic mammograms for training data augmentation may reduce the effects of overfitting and increase the performance of a deep learning algorithm for breast mass detection. Synthetic mammograms were generated using in silico procedural analytic breast and breast mass modeling algorithms followed by simulated x-ray projections of the breast models into mammographic images. In silico breast phantoms containing masses were modeled across the four BI-RADS breast density categories, and the masses were modeled with different sizes, shapes, and margins. A Monte Carlo-based x-ray transport simulation code, MC-GPU, was used to project the three-dimensional phantoms into realistic synthetic mammograms. 2000 mammograms with 2522 masses were generated to augment a real data set during training. From the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) data set, we used 1111 mammograms (1198 masses) for training, 120 mammograms (120 masses) for validation, and 361 mammograms (378 masses) for testing. We used faster R-CNN for our deep learning network with pretraining from ImageNet using the Resnet-101 architecture. We compared the detection performance when the network was trained using different percentages of the real CBIS-DDSM training set (100%, 50%, and 25%), and when these subsets of the training set were augmented with 250, 500, 1000, and 2000 synthetic mammograms. Free-response receiver operating characteristic (FROC) analysis was performed to compare performance with and without the synthetic mammograms. We generally observed an improved test FROC curve when training with the synthetic images compared to training without them, and the amount of improvement depended on the number of real and synthetic images used in training. Our study shows that enlarging the training data with synthetic samples can increase the performance of deep learning systems.

17.
IEEE Trans Med Imaging ; 38(3): 686-696, 2019 03.
Article in English | MEDLINE | ID: mdl-31622238

ABSTRACT

In this paper, we developed a deep convolutional neural network (CNN) for the classification of malignant and benign masses in digital breast tomosynthesis (DBT) using a multi-stage transfer learning approach that utilized data from similar auxiliary domains for intermediate-stage fine-tuning. Breast imaging data from DBT, digitized screen-film mammography, and digital mammography totaling 4039 unique regions of interest (1797 malignant and 2242 benign) were collected. Using cross validation, we selected the best transfer network from six transfer networks by varying the level up to which the convolutional layers were frozen. In a single-stage transfer learning approach, knowledge from CNN trained on the ImageNet data was fine-tuned directly with the DBT data. In a multi-stage transfer learning approach, knowledge learned from ImageNet was first fine-tuned with the mammography data and then fine-tuned with the DBT data. Two transfer networks were compared for the second-stage transfer learning by freezing most of the CNN structures versus freezing only the first convolutional layer. We studied the dependence of the classification performance on training sample size for various transfer learning and fine-tuning schemes by varying the training data from 1% to 100% of the available sets. The area under the receiver operating characteristic curve (AUC) was used as a performance measure. The view-based AUC on the test set for single-stage transfer learning was 0.85 ± 0.05 and improved significantly (p < 0.05) to 0.91 ± 0.03 for multi-stage learning. This paper demonstrated that, when the training sample size from the target domain is limited, an additional stage of transfer learning using data from a similar auxiliary domain is advantageous.


Subjects
Breast Neoplasms/diagnostic imaging , Machine Learning , Mammography/methods , Computer Neural Networks , Area Under Curve , Humans , Michigan , Sample Size
18.
Tomography ; 5(1): 201-208, 2019 03.
Article in English | MEDLINE | ID: mdl-30854458

ABSTRACT

We compared the performance of different deep learning convolutional neural network (DL-CNN) models for bladder cancer treatment response assessment based on transfer learning by freezing different DL-CNN layers and varying the DL-CNN structure. Pre- and posttreatment computed tomography scans of 123 patients (cancers, 129; pre- and posttreatment cancer pairs, 158) undergoing chemotherapy were collected. After chemotherapy, 33% of patients had T0 stage cancer (complete response). Regions of interest in pre- and posttreatment scans were extracted from the segmented lesions and combined into hybrid pre-post image pairs (h-ROIs). Training (pairs, 94; h-ROIs, 6209), validation (10 pairs), and test sets (54 pairs) were obtained. The DL-CNN consisted of 2 convolution (C1-C2), 2 locally connected (L3-L4), and 1 fully connected layer. The DL-CNN was trained with h-ROIs to classify cancers as fully responding (stage T0) or not fully responding to chemotherapy. Two radiologists provided lesion likelihood of being stage T0 posttreatment. The test area under the ROC curve (AUC) was 0.73 for T0 prediction by the base DL-CNN structure with randomly initialized weights. The base DL-CNN structure with pretrained weights and transfer learning (no frozen layers) achieved test AUC of 0.79. The test AUCs for 3 modified DL-CNN structures (different C1-C2 max pooling filter sizes, strides, and padding, with transfer learning) were 0.72, 0.86, and 0.69. For the base DL-CNN with (C1) frozen, (C1-C2) frozen, and (C1-C2-L3) frozen, the test AUCs were 0.81, 0.78, and 0.71, respectively. The radiologists' AUCs were 0.76 and 0.77. DL-CNN performed better with pretrained than randomly initialized weights.


Subjects
Deep Learning , Urinary Bladder Neoplasms/diagnostic imaging , Urinary Bladder Neoplasms/drug therapy , Antineoplastic Agents/therapeutic use , Cystectomy , Decision Support Systems, Clinical , Drug Monitoring/methods , Humans , Neoadjuvant Therapy/methods , ROC Curve , Radiographic Image Interpretation, Computer-Assisted/methods , Sensitivity and Specificity , Tomography, X-Ray Computed/methods , Transfer, Psychology , Treatment Outcome , Urography/methods
19.
Med Phys ; 46(4): 1752-1765, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30734932

ABSTRACT

OBJECTIVES: To develop a U-Net-based deep learning approach (U-DL) for bladder segmentation in computed tomography urography (CTU) as a part of a computer-assisted bladder cancer detection and treatment response assessment pipeline. MATERIALS AND METHODS: A dataset of 173 cases, including 81 cases in the training/validation set (42 masses, 21 with wall thickening, 18 normal bladders) and 92 cases in the test set (43 masses, 36 with wall thickening, 13 normal bladders), was used with Institutional Review Board approval. An experienced radiologist provided three-dimensional (3D) hand outlines for all cases as the reference standard. We previously developed a bladder segmentation method that used a deep learning convolution neural network and level sets (DCNN-LS) within a user-input bounding box. However, some cases with poor image quality or with advanced bladder cancer spreading into the neighboring organs caused inaccurate segmentation. We have newly developed an automated U-DL method to estimate a likelihood map of the bladder in CTU. The U-DL required neither a user-input box nor level sets for postprocessing. To identify the best model for this task, we compared the following models: (a) two-dimensional (2D) U-DL and 3D U-DL using 2D CT slices and 3D CT volumes, respectively, as input, (b) U-DLs using CT images of different resolutions as input, and (c) U-DLs with and without automated cropping of the bladder as an image preprocessing step. The segmentation accuracy relative to the reference standard was quantified by six measures: average volume intersection ratio (AVI), average percent volume error (AVE), average absolute volume error (AAVE), average minimum distance (AMD), average Hausdorff distance (AHD), and the average Jaccard index (AJI). As a baseline, the results from our previous DCNN-LS method were used. RESULTS: In the test set, the best 2D U-DL model achieved AVI, AVE, AAVE, AMD, AHD, and AJI values of 93.4 ± 9.5%, -4.2 ± 14.2%, 9.2 ± 11.5%, 2.7 ± 2.5 mm, 9.7 ± 7.6 mm, and 85.0 ± 11.3%, respectively, while the corresponding measures by the best 3D U-DL were 90.6 ± 11.9%, -2.3 ± 21.7%, 11.5 ± 18.5%, 3.1 ± 3.2 mm, 11.4 ± 10.0 mm, and 82.6 ± 14.2%, respectively. For comparison, the corresponding values obtained with the baseline method were 81.9 ± 12.1%, 10.2 ± 16.2%, 14.0 ± 13.0%, 3.6 ± 2.0 mm, 12.8 ± 6.1 mm, and 76.2 ± 11.8%, respectively, for the same test set. The improvements for all measures between the best U-DL and the DCNN-LS were statistically significant (P < 0.001). CONCLUSION: Compared to a previous DCNN-LS method, which depended on a user-input bounding box, the U-DL provided more accurate bladder segmentation and was more automated than the previous approach.
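
The abstract names its volume-overlap measures without giving formulas. As a hedged sketch, the Python below computes per-case versions of three of them from voxel sets, assuming commonly used definitions: intersection ratio = |G ∩ U| / |G|, volume error = (V_G − V_U) / V_G, and Jaccard index = |G ∩ U| / |G ∪ U|, where G is the reference and U the segmented object. These definitions, the sign convention for the volume error, and the function name `overlap_measures` are assumptions, not taken from the abstract. The distance-based measures (AMD, AHD) are omitted.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]


def overlap_measures(reference: Set[Voxel], segmented: Set[Voxel]) -> Dict[str, float]:
    """Per-case overlap measures between a reference mask and a segmentation,
    each given as a set of voxel coordinates. Values are fractions, not
    percentages. Definitions are assumed (see lead-in)."""
    if not reference:
        raise ValueError("reference mask must contain at least one voxel")

    intersection = len(reference & segmented)
    union = len(reference | segmented)
    v_ref, v_seg = len(reference), len(segmented)

    # Under this assumed sign convention, a negative value means the
    # segmented volume is larger than the reference volume.
    volume_error = (v_ref - v_seg) / v_ref

    return {
        "volume_intersection_ratio": intersection / v_ref,
        "volume_error": volume_error,
        "absolute_volume_error": abs(volume_error),
        "jaccard_index": intersection / union,
    }


# Toy example: a 4-voxel reference and a 5-voxel segmentation sharing 3 voxels.
toy_reference = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
toy_segmented = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 0, 1)}
toy_measures = overlap_measures(toy_reference, toy_segmented)
```

Averaging each per-case value over the test set would give the "average" versions (AVI, AVE, AAVE, AJI) reported above.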


Subjects
Deep Learning , Image Processing, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Urinary Bladder Neoplasms/diagnostic imaging , Urinary Bladder/diagnostic imaging , Algorithms , Case-Control Studies , Humans , Neural Networks, Computer , Urography/methods
20.
Med Phys ; 46(1): e1-e36, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30367497

ABSTRACT

The goals of this review paper on deep learning (DL) in medical imaging and radiation therapy are to (a) summarize what has been achieved to date; (b) identify common and unique challenges, and strategies that researchers have taken to address these challenges; and (c) identify some of the promising avenues for the future both in terms of applications as well as technical innovations. We introduce the general principles of DL and convolutional neural networks, survey five major areas of application of DL in medical imaging and radiation therapy, identify common themes, discuss methods for dataset expansion, and conclude by summarizing lessons learned, remaining challenges, and future directions.


Subjects
Deep Learning , Diagnostic Imaging/methods , Radiotherapy/methods , Artifacts , Humans , Image Processing, Computer-Assisted , Signal-To-Noise Ratio