1.
Radiol Artif Intell ; 6(3): e230079, 2024 05.
Article in English | MEDLINE | ID: mdl-38477661

ABSTRACT

Purpose To evaluate the impact of an artificial intelligence (AI) assistant for lung cancer screening on multinational clinical workflows. Materials and Methods An AI assistant for lung cancer screening was evaluated on two retrospective randomized multireader multicase studies where 627 (141 cancer-positive cases) low-dose chest CT cases were each read twice (with and without AI assistance) by experienced thoracic radiologists (six U.S.-based or six Japan-based radiologists), resulting in a total of 7524 interpretations. Positive cases were defined as those within 2 years before a pathology-confirmed lung cancer diagnosis. Negative cases were defined as those without any subsequent cancer diagnosis for at least 2 years and were enriched for a spectrum of diverse nodules. The studies measured the readers' level of suspicion (on a 0-100 scale), country-specific screening system scoring categories, and management recommendations. Evaluation metrics included the area under the receiver operating characteristic curve (AUC) for level of suspicion and sensitivity and specificity of recall recommendations. Results With AI assistance, the radiologists' AUC increased by 0.023 (0.70 to 0.72; P = .02) for the U.S. study and by 0.023 (0.93 to 0.96; P = .18) for the Japan study. Scoring system specificity for actionable findings increased 5.5% (57% to 63%; P < .001) for the U.S. study and 6.7% (23% to 30%; P < .001) for the Japan study. There was no evidence of a difference in corresponding sensitivity between unassisted and AI-assisted reads for the U.S. (67.3% to 67.5%; P = .88) and Japan (98% to 100%; P > .99) studies. Corresponding stand-alone AI AUC system performance was 0.75 (95% CI: 0.70, 0.81) and 0.88 (95% CI: 0.78, 0.97) for the U.S.- and Japan-based datasets, respectively. 
Conclusion The concurrent AI interface improved lung cancer screening specificity in both U.S.- and Japan-based reader studies, meriting further study in additional international screening environments. Keywords: Assistive Artificial Intelligence, Lung Cancer Screening, CT Supplemental material is available for this article. Published under a CC BY 4.0 license.
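
The recall metrics named in this abstract can be illustrated with a minimal sketch. The function name and the toy reads below are hypothetical and are not study data; only the interpretation count (627 cases, each read twice by six readers) comes from the abstract.

```python
def sensitivity_specificity(recalled, has_cancer):
    """Return (sensitivity, specificity) for paired lists of booleans."""
    pairs = list(zip(recalled, has_cancer))
    tp = sum(1 for r, c in pairs if r and c)          # recalled, cancer
    fn = sum(1 for r, c in pairs if not r and c)      # missed cancer
    tn = sum(1 for r, c in pairs if not r and not c)  # correctly not recalled
    fp = sum(1 for r, c in pairs if r and not c)      # false recall
    return tp / (tp + fn), tn / (tn + fp)

# Count reported in the abstract: 627 cases x 2 reads x 6 readers per case.
total_interpretations = 627 * 2 * 6

# Toy reads (hypothetical, for illustration only).
recalled = [True, True, False, False, True, False]
has_cancer = [True, False, False, False, True, True]
sens, spec = sensitivity_specificity(recalled, has_cancer)
print(total_interpretations, round(sens, 3), round(spec, 3))
```

The product 627 × 2 × 6 matches the 7524 interpretations reported above.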


Subject(s)
Artificial Intelligence , Early Detection of Cancer , Lung Neoplasms , Tomography, X-Ray Computed , Humans , Lung Neoplasms/diagnosis , Lung Neoplasms/epidemiology , Japan , United States/epidemiology , Retrospective Studies , Early Detection of Cancer/methods , Female , Male , Middle Aged , Aged , Sensitivity and Specificity , Radiographic Image Interpretation, Computer-Assisted/methods
2.
Lancet Digit Health ; 6(2): e126-e130, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38278614

ABSTRACT

Advances in machine learning for health care have brought concerns about bias from the research community; specifically, the introduction, perpetuation, or exacerbation of care disparities. Reinforcing these concerns is the finding that medical images often reveal signals about sensitive attributes in ways that are hard to pinpoint by both algorithms and people. This finding raises a question about how to best design general purpose pretrained embeddings (GPPEs, defined as embeddings meant to support a broad array of use cases) for building downstream models that are free from particular types of bias. The downstream model should be carefully evaluated for bias, and audited and improved as appropriate. However, in our view, well intentioned attempts to prevent the upstream components-GPPEs-from learning sensitive attributes can have unintended consequences on the downstream models. Despite producing a veneer of technical neutrality, the resultant end-to-end system might still be biased or poorly performing. We present reasons, by building on previously published data, to support the reasoning that GPPEs should ideally contain as much information as the original data contain, and highlight the perils of trying to remove sensitive attributes from a GPPE. We also emphasise that downstream prediction models trained for specific tasks and settings, whether developed using GPPEs or not, should be carefully designed and evaluated to avoid bias that makes models vulnerable to issues such as distributional shift. These evaluations should be done by a diverse team, including social scientists, on a diverse cohort representing the full breadth of the patient population for which the final model is intended.


Subject(s)
Delivery of Health Care , Machine Learning , Humans , Bias , Algorithms
3.
Nat Med ; 29(7): 1814-1820, 2023 07.
Article in English | MEDLINE | ID: mdl-37460754

ABSTRACT

Predictive artificial intelligence (AI) systems based on deep learning have been shown to achieve expert-level identification of diseases in multiple medical imaging settings, but can make errors in cases accurately diagnosed by clinicians and vice versa. We developed Complementarity-Driven Deferral to Clinical Workflow (CoDoC), a system that can learn to decide between the opinion of a predictive AI model and a clinical workflow. CoDoC enhances accuracy relative to clinician-only or AI-only baselines in clinical workflows that screen for breast cancer or tuberculosis (TB). For breast cancer screening, compared to double reading with arbitration in a screening program in the UK, CoDoC reduced false positives by 25% at the same false-negative rate, while achieving a 66% reduction in clinician workload. For TB triaging, compared to standalone AI and clinical workflows, CoDoC achieved a 5-15% reduction in false positives at the same false-negative rate for three of five commercially available predictive AI systems. To facilitate the deployment of CoDoC in novel futuristic clinical settings, we present results showing that CoDoC's performance gains are sustained across several axes of variation (imaging modality, clinical setting and predictive AI system) and discuss the limitations of our evaluation and where further validation would be needed. We provide an open-source implementation to encourage further research and application.
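
The idea of deciding between an AI opinion and a clinical workflow can be pictured with a small sketch. The abstract does not describe the mechanism, so everything here is an assumption made for illustration: the function name, the confidence-band rule, and the threshold values are hypothetical and should not be read as the published CoDoC implementation.

```python
def deferral_decision(ai_score, clinician_opinion,
                      defer_low, defer_high, ai_threshold):
    """Hypothetical rule: defer to the clinician when the AI score
    falls inside an uncertain band, otherwise threshold the AI score.

    Returns (positive_call, source), where source is "clinician" or "ai".
    """
    if defer_low <= ai_score <= defer_high:
        return clinician_opinion, "clinician"
    return ai_score >= ai_threshold, "ai"

# Illustrative calls with made-up scores and band limits.
print(deferral_decision(0.50, True, 0.3, 0.7, 0.5))   # inside the band
print(deferral_decision(0.90, False, 0.3, 0.7, 0.5))  # confident AI positive
print(deferral_decision(0.10, True, 0.3, 0.7, 0.5))   # confident AI negative
```

In a learned system the band limits would be fitted on held-out data rather than fixed by hand as they are in this sketch.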


Subject(s)
Artificial Intelligence , Triage , Reproducibility of Results , Workflow , Humans
4.
Radiology ; 306(1): 124-137, 2023 01.
Article in English | MEDLINE | ID: mdl-36066366

ABSTRACT

Background The World Health Organization (WHO) recommends chest radiography to facilitate tuberculosis (TB) screening. However, chest radiograph interpretation expertise remains limited in many regions. Purpose To develop a deep learning system (DLS) to detect active pulmonary TB on chest radiographs and compare its performance to that of radiologists. Materials and Methods A DLS was trained and tested using retrospective chest radiographs (acquired between 1996 and 2020) from 10 countries. To improve generalization, large-scale chest radiograph pretraining, attention pooling, and semisupervised learning ("noisy-student") were incorporated. The DLS was evaluated in a four-country test set (China, India, the United States, and Zambia) and in a mining population in South Africa, with positive TB confirmed with microbiological tests or nucleic acid amplification testing (NAAT). The performance of the DLS was compared with that of 14 radiologists. The authors studied the efficacy of the DLS compared with that of nine radiologists using the Obuchowski-Rockette-Hillis procedure. Given WHO targets of 90% sensitivity and 70% specificity, the operating point of the DLS (0.45) was prespecified to favor sensitivity. Results A total of 165 754 images in 22 284 subjects (mean age, 45 years; 21% female) were used for model development and testing. In the four-country test set (1236 subjects, 17% with active TB), the receiver operating characteristic (ROC) curve of the DLS was higher than those for all nine India-based radiologists, with an area under the ROC curve of 0.89 (95% CI: 0.87, 0.91). Compared with these radiologists, at the prespecified operating point, the DLS sensitivity was higher (88% vs 75%, P < .001) and specificity was noninferior (79% vs 84%, P = .004). Trends were similar within other patient subgroups, in the South Africa data set, and across various TB-specific chest radiograph findings. 
In simulations, the use of the DLS to identify likely TB-positive chest radiographs for NAAT confirmation reduced the cost by 40%-80% per TB-positive patient detected. Conclusion A deep learning method was found to be noninferior to radiologists for the determination of active tuberculosis on digital chest radiographs. © RSNA, 2022 Online supplemental material is available for this article. See also the editorial by van Ginneken in this issue.


Subject(s)
Deep Learning , Tuberculosis, Pulmonary , Humans , Female , Middle Aged , Male , Radiography, Thoracic/methods , Retrospective Studies , Radiography , Tuberculosis, Pulmonary/diagnostic imaging , Radiologists , Sensitivity and Specificity
5.
Sci Rep ; 11(1): 15523, 2021 09 01.
Article in English | MEDLINE | ID: mdl-34471144

ABSTRACT

Chest radiography (CXR) is the most widely-used thoracic clinical imaging modality and is crucial for guiding the management of cardiothoracic conditions. The detection of specific CXR findings has been the main focus of several artificial intelligence (AI) systems. However, the wide range of possible CXR abnormalities makes it impractical to detect every possible condition by building multiple separate systems, each of which detects one or more pre-specified conditions. In this work, we developed and evaluated an AI system to classify CXRs as normal or abnormal. For training and tuning the system, we used a de-identified dataset of 248,445 patients from a multi-city hospital network in India. To assess generalizability, we evaluated our system using 6 international datasets from India, China, and the United States. Of these datasets, 4 focused on diseases that the AI was not trained to detect: 2 datasets with tuberculosis and 2 datasets with coronavirus disease 2019. Our results suggest that the AI system trained using a large dataset containing a diverse array of CXR abnormalities generalizes to new patient populations and unseen diseases. In a simulated workflow where the AI system prioritized abnormal cases, the turnaround time for abnormal cases reduced by 7-28%. These results represent an important step towards evaluating whether AI can be safely used to flag cases in a general setting where previously unseen abnormalities exist. Lastly, to facilitate the continued development of AI models for CXR, we release our collected labels for the publicly available dataset.


Subject(s)
COVID-19/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted/methods , Tuberculosis/diagnostic imaging , Adult , Aged , Algorithms , Case-Control Studies , China , Deep Learning , Female , Humans , India , Male , Middle Aged , Radiography, Thoracic , United States
6.
Nat Med ; 25(8): 1319, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31253948

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

7.
Nat Med ; 25(6): 954-961, 2019 06.
Article in English | MEDLINE | ID: mdl-31110349

ABSTRACT

With an estimated 160,000 deaths in 2018, lung cancer is the most common cause of cancer death in the United States [1]. Lung cancer screening using low-dose computed tomography has been shown to reduce mortality by 20-43% and is now included in US screening guidelines [1-6]. Existing challenges include inter-grader variability and high false-positive and false-negative rates [7-10]. We propose a deep learning algorithm that uses a patient's current and prior computed tomography volumes to predict the risk of lung cancer. Our model achieves a state-of-the-art performance (94.4% area under the curve) on 6,716 National Lung Cancer Screening Trial cases, and performs similarly on an independent clinical validation set of 1,139 cases. We conducted two reader studies. When prior computed tomography imaging was not available, our model outperformed all six radiologists with absolute reductions of 11% in false positives and 5% in false negatives. Where prior computed tomography imaging was available, the model performance was on-par with the same radiologists. This creates an opportunity to optimize the screening process via computer assistance and automation. While the vast majority of patients remain unscreened, we show the potential for deep learning models to increase the accuracy, consistency and adoption of lung cancer screening worldwide.


Subject(s)
Deep Learning , Diagnosis, Computer-Assisted/methods , Lung Neoplasms/diagnostic imaging , Lung Neoplasms/diagnosis , Mass Screening/methods , Tomography, X-Ray Computed , Algorithms , Databases, Factual , Deep Learning/statistics & numerical data , Diagnosis, Computer-Assisted/statistics & numerical data , Humans , Imaging, Three-Dimensional/statistics & numerical data , Mass Screening/statistics & numerical data , Neural Networks, Computer , Retrospective Studies , Risk Factors , Tomography, X-Ray Computed/statistics & numerical data , United States
8.
Radiology ; 266(3): 812-21, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23220891

ABSTRACT

PURPOSE: To compare the inter- and intraobserver variability with manual region of interest (ROI) placement versus that with software-assisted semiautomatic lesion segmentation and histogram analysis with respect to quantitative dynamic contrast material-enhanced (DCE) MR imaging determinations of the volume transfer constant (K(trans)). MATERIALS AND METHODS: The study was approved by the institutional review board and compliant with HIPAA. The requirement to obtain informed consent was waived. Fifteen DCE MR imaging studies of the female pelvis defined the study group. Uterine fibroids were used as a perfusion model. Three varying types of lesion measurements were performed by five readers on each study by using DCE MR imaging perfusion analysis software with manual ROI placement and a semiautomatic lesion segmentation and histogram analysis solution. Intra- and interreader variability of measurements of K(trans) with the different measurement types was calculated. RESULTS: The overall interobserver variability of K(trans) with manual ROI placement (mean, 28.5% ± 9.3) was reduced by 42.5% when the semiautomatic, software-assisted lesion measurement method was used (16.4% ± 6.2). Whole-lesion measurement showed the lowest interobserver variability with both measurement methods (20.1% ± 4.3 with the manual method vs 10.8% ± 2.6 with the semiautomatic method). The overall intrareader variability with the manual ROI method (7.6% ± 10.6) was not significantly different from that with the semiautomatic method (7.3% ± 10.8), but the intraclass correlation coefficient for intrareader reproducibility improved from 0.86 overall with the manual method to 0.99 with the semiautomatic method. 
CONCLUSION: A semiautomatic lesion segmentation and histogram analysis approach can provide a significant reduction in interobserver variability for DCE MR imaging measurements of K(trans) when compared with manual ROI methods, whereas intraobserver reproducibility is improved to some extent.
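
The 42.5% figure in the results is a relative reduction between the two mean interobserver variabilities. A minimal check of that arithmetic follows; the function name is hypothetical, and the two inputs are the means reported in the abstract.

```python
def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100

# Mean interobserver variability: manual ROI vs semiautomatic method.
overall = relative_reduction(28.5, 16.4)
print(round(overall, 1))
```

Rounded to one decimal place, the result agrees with the reduction stated in the abstract.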


Subject(s)
Contrast Media/pharmacokinetics , Leiomyoma/metabolism , Leiomyoma/pathology , Magnetic Resonance Angiography/methods , Pattern Recognition, Automated/methods , Uterine Neoplasms/metabolism , Uterine Neoplasms/pathology , Adult , Artificial Intelligence , Computer Simulation , Female , Humans , Image Interpretation, Computer-Assisted/methods , Middle Aged , Models, Biological , Pelvis/pathology , Reproducibility of Results , Sensitivity and Specificity
9.
Radiology ; 265(3): 790-8, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23175544

ABSTRACT

PURPOSE: To compare histogram analysis of voxel-based whole-lesion (WL) enhancement to qualitative assessment and region-of-interest (ROI)-based enhancement analysis in discriminating the renal cell cancer (RCC) subtype clear cell RCC (ccRCC) from papillary RCC (pRCC). MATERIALS AND METHODS: In this institutional review board-approved, HIPAA-compliant retrospective study, 73 patients underwent magnetic resonance (MR) imaging prior to surgery for RCC between January 2007 and January 2010. Three-dimensional fat-suppressed T1-weighted gradient-echo corticomedullary phase acquisitions, obtained before and after contrast agent administration, were transferred to a workstation at which automated registration followed by semiautomated segmentation of the RCC was performed. Percent enhancement was computed on a per-voxel basis: (SI(post) - SI(pre))/SI(pre) × 100, where SI(pre) and SI(post) indicate signal intensity before and after contrast enhancement, respectively. The WL quantitative parameters of mean, median, and third quartile enhancement and histogram distribution parameters kurtosis and skewness were computed for each lesion. WL enhancement parameters were compared with ROI-based analysis and qualitative assessment with regard to diagnostic accuracy and interreader agreement in differentiating ccRCC from pRCC. RESULTS: There were 19 pRCCs and 55 ccRCCs at pathologic examination. ccRCC had significantly higher WL mean, median, and third quartile enhancement compared with pRCC and had significantly lower kurtosis and skewness (all P < .001). Third quartile enhancement had the highest accuracy (94.6%; area under the curve, 0.980) in discriminating ccRCC from pRCC, which was significantly higher than the accuracy of qualitative assessment (86.0%; P = .04) but not significantly higher than that of ROI enhancement (89.2%; P = .52). 
WL enhancement parameters had higher interreader agreement (κ = 0.91-1.0) compared with ROI enhancement or qualitative assessment (κ = 0.83 and 0.7, respectively) in discriminating ccRCC from pRCC. CONCLUSION: WL enhancement histogram analysis is feasible and can potentially be used to differentiate ccRCC from pRCC with high accuracy. SUPPLEMENTAL MATERIAL: http://radiology.rsna.org/lookup/suppl/doi:10.1148/radiol.12111281/-/DC1.
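
The per-voxel percent enhancement and the whole-lesion histogram parameters listed in this abstract can be sketched as below. This is an illustration under stated assumptions, not the study software: the function names are hypothetical, precontrast intensities are assumed to be nonzero, skewness and kurtosis use population moments (kurtosis is reported as excess kurtosis), and the quartile uses the default convention of Python's `statistics.quantiles`, which may differ from the convention used in the study.

```python
from statistics import mean, median, quantiles

def percent_enhancement(si_pre, si_post):
    """Per-voxel (SI_post - SI_pre) / SI_pre * 100; assumes SI_pre != 0."""
    return [(post - pre) / pre * 100 for pre, post in zip(si_pre, si_post)]

def histogram_summary(values):
    """Mean, median, third quartile, skewness, and excess kurtosis."""
    n = len(values)
    m = mean(values)
    var = sum((v - m) ** 2 for v in values) / n   # population variance
    sd = var ** 0.5
    skew = sum((v - m) ** 3 for v in values) / n / sd ** 3
    kurt = sum((v - m) ** 4 for v in values) / n / sd ** 4 - 3
    third_quartile = quantiles(values, n=4)[2]
    return {"mean": m, "median": median(values),
            "third_quartile": third_quartile,
            "skewness": skew, "kurtosis": kurt}

# Toy voxel intensities (hypothetical, not patient data).
enh = percent_enhancement([100, 200], [150, 300])
summary = histogram_summary([10, 20, 30, 40, 50])
print(enh, summary)
```

A real pipeline would apply this to the voxels inside the segmented lesion after registration of the pre- and postcontrast volumes.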


Subject(s)
Carcinoma, Papillary/diagnosis , Carcinoma, Renal Cell/diagnosis , Kidney Neoplasms/diagnosis , Magnetic Resonance Imaging/methods , Adult , Aged , Aged, 80 and over , Area Under Curve , Carcinoma, Papillary/pathology , Carcinoma, Renal Cell/pathology , Contrast Media , Diagnosis, Differential , Female , Gadolinium DTPA , Humans , Imaging, Three-Dimensional , Kidney Neoplasms/pathology , Logistic Models , Male , Middle Aged , Pattern Recognition, Automated , Reproducibility of Results , Retrospective Studies , Statistics, Nonparametric
10.
Radiology ; 260(3): 752-61, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21771960

ABSTRACT

PURPOSE: To determine the association of early changes in posttreatment apparent diffusion coefficient (ADC) and venous enhancement (VE) with tumor size change after transarterial chemoembolization (TACE) by using an investigational semiautomated software. MATERIALS AND METHODS: This retrospective HIPAA-compliant study was approved by the institutional review board, with waiver of informed consent. Patients underwent magnetic resonance (MR) imaging at 1.5 T before TACE, as well as 1 and 6 months after TACE. Volumetric analysis of change in ADC and VE 1 month after TACE compared with pretreatment values was performed in 48 patients with 71 hepatocellular carcinoma (HCC) lesions. Diagnostic accuracy was evaluated with receiver operating characteristic (ROC) analysis, using tumor response at 6 months according to Response Evaluation Criteria in Solid Tumors (RECIST) and modified RECIST as end points. RESULTS: According to RECIST criteria, 6 months after TACE, 30 HCC lesions showed partial response (PR), 35 showed stable disease (SD), and six showed progressive disease (PD). Increase in ADC and decrease in VE 1 month after TACE were significantly different between PR, SD, and PD. At area under the ROC curve (AUC) analysis of the ADC increase, there was an AUC of 0.78 for distinguishing PR from SD and PD and an AUC of 0.89 for distinguishing PR and SD from PD. The AUC for decrease in VE was 0.73 for discrimination of PR from SD and PD and 0.90 for discrimination of PR and SD from PD. CONCLUSION: Volumetric analysis of increase in ADC and decrease in VE 1 month after TACE can provide an early assessment of response to treatment. Volumetric analysis of multiparametric MR imaging data may have potential as a prognostic biomarker for patients undergoing local-regional treatment of liver cancer.
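
The AUC values reported here summarize how well a continuous measurement, such as the 1-month ADC increase, separates two response groups. One standard way to compute an AUC is the pairwise (Mann-Whitney) formulation sketched below. The function name and the toy values are hypothetical, and the abstract does not state which software or estimator the authors used.

```python
def auc_pairwise(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Toy ADC increases (hypothetical): responders vs non-responders.
responders = [0.9, 0.8, 0.4]
non_responders = [0.5, 0.3]
toy_auc = auc_pairwise(responders, non_responders)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to no discrimination between the groups and 1.0 to perfect separation.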


Subject(s)
Carcinoma, Hepatocellular/diagnosis , Diffusion Magnetic Resonance Imaging/methods , Gadolinium DTPA , Image Enhancement/methods , Imaging, Three-Dimensional/methods , Liver Diseases/diagnosis , Liver Diseases/physiopathology , Liver Function Tests/methods , Liver Neoplasms/diagnosis , Magnetic Resonance Imaging/methods , Aged , Aged, 80 and over , Contrast Media , Female , Humans , Liver Diseases/pathology , Male , Middle Aged , Reproducibility of Results , Sensitivity and Specificity