Search | VHL Regional Portal

Show: 20 | 50 | 100

Results 1 - 10 de 10

Filter

Simultaneous alleviation of verification and reference standard biases in a community-based tuberculosis screening study using Bayesian latent class analysis.

Keter, Alfred Kipyegon; Vanobberghen, Fiona; Lynen, Lutgarde; Van Heerden, Alastair; Fehr, Jana; Olivier, Stephen; Wong, Emily B; Glass, Tracy R; Reither, Klaus; Goetghebeur, Els; Jacobs, Bart K M.

PLoS One ; 19(6): e0305126, 2024.

Article in English | MEDLINE | ID: mdl-38857227

ABSTRACT

BACKGROUND: Estimation of prevalence and diagnostic test accuracy in tuberculosis (TB) prevalence surveys suffer from reference standard and verification biases. The former is attributed to the imperfect reference test used to bacteriologically confirm TB disease. The latter occurs when only the participants screening positive for any TB-compatible symptom or chest X-ray abnormality are selected for bacteriological testing (verification). Bayesian latent class analysis (LCA) alleviates the reference standard bias but suffers verification bias in TB prevalence surveys. This work aims to identify best-practice approaches to simultaneously alleviate the reference standard and verification biases in the estimates of pulmonary TB prevalence and diagnostic test performance in TB prevalence surveys. METHODS: We performed a secondary analysis of 9869 participants aged ≥15 years from a community-based multimorbidity screening study in a rural district of KwaZulu-Natal, South Africa (Vukuzazi study). Participants were eligible for bacteriological testing using Xpert Ultra and culture if they reported any cardinal TB symptom or had an abnormal chest X-ray finding. We conducted Bayesian LCA in five ways to handle the unverified individuals: (i) complete-case analysis, (ii) analysis assuming the unverified individuals would be negative if bacteriologically tested, (iii) analysis of multiply-imputed datasets with imputation of the missing bacteriological test results for the unverified individuals using multivariate imputation via chained equations (MICE), and simultaneous imputation of the missing bacteriological test results in the analysis model assuming the missing bacteriological test results were (iv) missing at random (MAR), and (v) missing not at random (MNAR). We compared the results of (i)-(iii) to the analysis based on a composite reference standard (CRS) of Xpert Ultra and culture. Through simulation with an overall true prevalence of 2.0%, we evaluated the ability of the models to alleviate both biases simultaneously. RESULTS: Based on simulation, Bayesian LCA with simultaneous imputation of the missing bacteriological test results under the assumption that the missing data are MAR and MNAR alleviate the reference standard and verification biases. CRS-based analysis and Bayesian LCA assuming the unverified are negative for TB alleviate the biases only when the true overall prevalence is <3.0%. Complete-case analysis produced biased estimates. In the Vukuzazi study, Bayesian LCA with simultaneous imputation of the missing bacteriological test results under the MAR and MNAR assumptions produced overall PTB prevalence of 0.9% (95% Credible Interval (CrI): 0.6-1.9) and 0.7% (95% CrI: 0.5-1.1) respectively alongside realistic estimates of overall diagnostic test sensitivity and specificity with substantially overlapping 95% CrI. The CRS-based analysis and Bayesian LCA assuming the unverified were negative for TB produced 0.7% (95% CrI: 0.5-0.9) and 0.7% (95% CrI: 0.5-1.2) overall PTB prevalence respectively with realistic estimates of overall diagnostic test sensitivity and specificity. Unlike CRS-based analysis, Bayesian LCA of multiply-imputed data using MICE mitigates both biases. CONCLUSION: The findings demonstrate the efficacy of these advanced techniques in alleviating the reference standard and verification biases, enhancing the robustness of community-based screening programs. Imputing missing values as negative for bacteriological tests is plausible under realistic assumptions.

Subject(s)

Bayes Theorem , Latent Class Analysis , Mass Screening , Reference Standards , Humans , Adult , Female , South Africa/epidemiology , Male , Mass Screening/standards , Mass Screening/methods , Prevalence , Middle Aged , Bias , Tuberculosis, Pulmonary/diagnosis , Tuberculosis, Pulmonary/epidemiology , Adolescent , Young Adult , Aged

The unmet promise of trustworthy AI in healthcare: why we fail at clinical translation.

Bürger, Valerie K; Amann, Julia; Bui, Cathrine K T; Fehr, Jana; Madai, Vince I.

Front Digit Health ; 6: 1279629, 2024.

Article in English | MEDLINE | ID: mdl-38698888

ABSTRACT

Artificial intelligence (AI) has the potential to revolutionize healthcare, for example via decision support systems, computer vision approaches, or AI-based prevention tools. Initial results from AI applications in healthcare show promise but are rarely translated into clinical practice successfully and ethically. This occurs despite an abundance of "Trustworthy AI" guidelines. How can we explain the translational gaps of AI in healthcare? This paper offers a fresh perspective on this problem, showing that failing translation of healthcare AI markedly arises from a lack of an operational definition of "trust" and "trustworthiness". This leads to (a) unintentional misuse concerning what trust (worthiness) is and (b) the risk of intentional abuse by industry stakeholders engaging in ethics washing. By pointing out these issues, we aim to highlight the obstacles that hinder translation of Trustworthy medical AI to practice and prevent it from fulfilling its unmet promises.

Simulating rigid head motion artifacts on brain magnitude MRI data-Outcome on image quality and segmentation of the cerebral cortex.

Olsson, Hampus; Millward, Jason Michael; Starke, Ludger; Gladytz, Thomas; Klein, Tobias; Fehr, Jana; Lai, Wei-Chang; Lippert, Christoph; Niendorf, Thoralf; Waiczies, Sonia.

PLoS One ; 19(4): e0301132, 2024.

Article in English | MEDLINE | ID: mdl-38626138

ABSTRACT

Magnetic Resonance Imaging (MRI) datasets from epidemiological studies often show a lower prevalence of motion artifacts than what is encountered in clinical practice. These artifacts can be unevenly distributed between subject groups and studies which introduces a bias that needs addressing when augmenting data for machine learning purposes. Since unreconstructed multi-channel k-space data is typically not available for population-based MRI datasets, motion simulations must be performed using signal magnitude data. There is thus a need to systematically evaluate how realistic such magnitude-based simulations are. We performed magnitude-based motion simulations on a dataset (MR-ART) from 148 subjects in which real motion-corrupted reference data was also available. The similarity of real and simulated motion was assessed by using image quality metrics (IQMs) including Coefficient of Joint Variation (CJV), Signal-to-Noise-Ratio (SNR), and Contrast-to-Noise-Ratio (CNR). An additional comparison was made by investigating the decrease in the Dice-Sørensen Coefficient (DSC) of automated segmentations with increasing motion severity. Segmentation of the cerebral cortex was performed with 6 freely available tools: FreeSurfer, BrainSuite, ANTs, SAMSEG, FastSurfer, and SynthSeg+. To better mimic the real subject motion, the original motion simulation within an existing data augmentation framework (TorchIO), was modified. This allowed a non-random motion paradigm and phase encoding direction. The mean difference in CJV/SNR/CNR between the real motion-corrupted images and our modified simulations (0.004±0.054/-0.7±1.8/-0.09±0.55) was lower than that of the original simulations (0.015±0.061/0.2±2.0/-0.29±0.62). Further, the mean difference in the DSC between the real motion-corrupted images was lower for our modified simulations (0.03±0.06) compared to the original simulations (-0.15±0.09). SynthSeg+ showed the highest robustness towards all forms of motion, real and simulated. In conclusion, reasonably realistic synthetic motion artifacts can be induced on a large-scale when only magnitude MR images are available to obtain unbiased data sets for the training of machine learning based models.

Subject(s)

Artifacts , Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging/methods , Motion , Brain/diagnostic imaging , Cerebral Cortex , Image Processing, Computer-Assisted/methods

A trustworthy AI reality-check: the lack of transparency of artificial intelligence products in healthcare.

Fehr, Jana; Citro, Brian; Malpani, Rohit; Lippert, Christoph; Madai, Vince I.

Front Digit Health ; 6: 1267290, 2024.

Article in English | MEDLINE | ID: mdl-38455991

ABSTRACT

Trustworthy medical AI requires transparency about the development and testing of underlying algorithms to identify biases and communicate potential risks of harm. Abundant guidance exists on how to achieve transparency for medical AI products, but it is unclear whether publicly available information adequately informs about their risks. To assess this, we retrieved public documentation on the 14 available CE-certified AI-based radiology products of the II b risk category in the EU from vendor websites, scientific publications, and the European EUDAMED database. Using a self-designed survey, we reported on their development, validation, ethical considerations, and deployment caveats, according to trustworthy AI guidelines. We scored each question with either 0, 0.5, or 1, to rate if the required information was "unavailable", "partially available," or "fully available." The transparency of each product was calculated relative to all 55 questions. Transparency scores ranged from 6.4% to 60.9%, with a median of 29.1%. Major transparency gaps included missing documentation on training data, ethical considerations, and limitations for deployment. Ethical aspects like consent, safety monitoring, and GDPR-compliance were rarely documented. Furthermore, deployment caveats for different demographics and medical settings were scarce. In conclusion, public documentation of authorized medical AI products in Europe lacks sufficient public transparency to inform about safety and risks. We call on lawmakers and regulators to establish legally mandated requirements for public and substantive transparency to fulfill the promise of trustworthy AI for health.

Assessing the transportability of clinical prediction models for cognitive impairment using causal models.

Fehr, Jana; Piccininni, Marco; Kurth, Tobias; Konigorski, Stefan.

BMC Med Res Methodol ; 23(1): 187, 2023 08 19.

Article in English | MEDLINE | ID: mdl-37598141

ABSTRACT

BACKGROUND: Machine learning models promise to support diagnostic predictions, but may not perform well in new settings. Selecting the best model for a new setting without available data is challenging. We aimed to investigate the transportability by calibration and discrimination of prediction models for cognitive impairment in simulated external settings with different distributions of demographic and clinical characteristics. METHODS: We mapped and quantified relationships between variables associated with cognitive impairment using causal graphs, structural equation models, and data from the ADNI study. These estimates were then used to generate datasets and evaluate prediction models with different sets of predictors. We measured transportability to external settings under guided interventions on age, APOE Îµ4, and tau-protein, using performance differences between internal and external settings measured by calibration metrics and area under the receiver operating curve (AUC). RESULTS: Calibration differences indicated that models predicting with causes of the outcome were more transportable than those predicting with consequences. AUC differences indicated inconsistent trends of transportability between the different external settings. Models predicting with consequences tended to show higher AUC in the external settings compared to internal settings, while models predicting with parents or all variables showed similar AUC. CONCLUSIONS: We demonstrated with a practical prediction task example that predicting with causes of the outcome results in better transportability compared to anti-causal predictions when considering calibration differences. We conclude that calibration performance is crucial when assessing model transportability to external settings.

Subject(s)

Cognitive Dysfunction , Models, Statistical , Humans , Prognosis , Cognitive Dysfunction/diagnosis , Benchmarking , Calibration

Piloting a Survey-Based Assessment of Transparency and Trustworthiness with Three Medical AI Tools.

Fehr, Jana; Jaramillo-Gutierrez, Giovanna; Oala, Luis; Gröschel, Matthias I; Bierwirth, Manuel; Balachandran, Pradeep; Werneck-Leite, Alixandro; Lippert, Christoph.

Healthcare (Basel) ; 10(10)2022 Sep 30.

Article in English | MEDLINE | ID: mdl-36292369

ABSTRACT

Artificial intelligence (AI) offers the potential to support healthcare delivery, but poorly trained or validated algorithms bear risks of harm. Ethical guidelines stated transparency about model development and validation as a requirement for trustworthy AI. Abundant guidance exists to provide transparency through reporting, but poorly reported medical AI tools are common. To close this transparency gap, we developed and piloted a framework to quantify the transparency of medical AI tools with three use cases. Our framework comprises a survey to report on the intended use, training and validation data and processes, ethical considerations, and deployment recommendations. The transparency of each response was scored with either 0, 0.5, or 1 to reflect if the requested information was not, partially, or fully provided. Additionally, we assessed on an analogous three-point scale if the provided responses fulfilled the transparency requirement for a set of trustworthiness criteria from ethical guidelines. The degree of transparency and trustworthiness was calculated on a scale from 0% to 100%. Our assessment of three medical AI use cases pin-pointed reporting gaps and resulted in transparency scores of 67% for two use cases and one with 59%. We report anecdotal evidence that business constraints and limited information from external datasets were major obstacles to providing transparency for the three use cases. The observed transparency gaps also lowered the degree of trustworthiness, indicating compliance gaps with ethical guidelines. All three pilot use cases faced challenges to provide transparency about medical AI tools, but more studies are needed to investigate those in the wider medical AI sector. Applying this framework for an external assessment of transparency may be infeasible if business constraints prevent the disclosure of information. New strategies may be necessary to enable audits of medical AI tools while preserving business secrets.

Machine Learning for Health: Algorithm Auditing & Quality Control.

Oala, Luis; Murchison, Andrew G; Balachandran, Pradeep; Choudhary, Shruti; Fehr, Jana; Leite, Alixandro Werneck; Goldschmidt, Peter G; Johner, Christian; Schörverth, Elora D M; Nakasi, Rose; Meyer, Martin; Cabitza, Federico; Baird, Pat; Prabhu, Carolin; Weicken, Eva; Liu, Xiaoxuan; Wenzel, Markus; Vogler, Steffen; Akogo, Darlington; Alsalamah, Shada; Kazim, Emre; Koshiyama, Adriano; Piechottka, Sven; Macpherson, Sheena; Shadforth, Ian; Geierhofer, Regina; Matek, Christian; Krois, Joachim; Sanguinetti, Bruno; Arentz, Matthew; Bielik, Pavol; Calderon-Ramirez, Saul; Abbood, Auss; Langer, Nicolas; Haufe, Stefan; Kherif, Ferath; Pujari, Sameer; Samek, Wojciech; Wiegand, Thomas.

J Med Syst ; 45(12): 105, 2021 Nov 02.

Article in English | MEDLINE | ID: mdl-34729675

ABSTRACT

Developers proposing new machine learning for health (ML4H) tools often pledge to match or even surpass the performance of existing tools, yet the reality is usually more complicated. Reliable deployment of ML4H to the real world is challenging as examples from diabetic retinopathy or Covid-19 screening show. We envision an integrated framework of algorithm auditing and quality control that provides a path towards the effective and reliable application of ML systems in healthcare. In this editorial, we give a summary of ongoing work towards that vision and announce a call for participation to the special issue Machine Learning for Health: Algorithm Auditing & Quality Control in this journal to advance the practice of ML4H auditing.

Subject(s)

Algorithms , Machine Learning , Quality Control , Humans

Publisher Correction: Computer-aided interpretation of chest radiography reveals the spectrum of tuberculosis in rural South Africa.

Fehr, Jana; Konigorski, Stefan; Olivier, Stephen; Gunda, Resign; Surujdeen, Ashmika; Gareta, Dickman; Smit, Theresa; Baisley, Kathy; Moodley, Sashen; Moosa, Yumna; Hanekom, Willem; Koole, Olivier; Ndung'u, Thumbi; Pillay, Deenan; Grant, Alison D; Siedner, Mark J; Lippert, Christoph; Wong, Emily B.

NPJ Digit Med ; 4(1): 115, 2021 Jul 16.

Article in English | MEDLINE | ID: mdl-34272478

Computer-aided interpretation of chest radiography reveals the spectrum of tuberculosis in rural South Africa.

NPJ Digit Med ; 4(1): 106, 2021 Jul 02.

Article in English | MEDLINE | ID: mdl-34215836

ABSTRACT

Computer-aided digital chest radiograph interpretation (CAD) can facilitate high-throughput screening for tuberculosis (TB), but its use in population-based active case-finding programs has been limited. In an HIV-endemic area in rural South Africa, we used a CAD algorithm (CAD4TBv5) to interpret digital chest x-rays (CXR) as part of a mobile health screening effort. Participants with TB symptoms or CAD4TBv5 score above the triaging threshold were referred for microbiological sputum assessment. During an initial pilot phase, a low CAD4TBv5 triaging threshold of 25 was selected to maximize TB case finding. We report the performance of CAD4TBv5 in screening 9,914 participants, 99 (1.0%) of whom were found to have microbiologically proven TB. CAD4TBv5 was able to identify TB cases at the same sensitivity but lower specificity as a blinded radiologist, whereas the next generation of the algorithm (CAD4TBv6) achieved comparable sensitivity and specificity to the radiologist. The CXRs of people with microbiologically confirmed TB spanned a range of lung field abnormality, including 19 (19.2%) cases deemed normal by the radiologist. HIV serostatus did not impact CAD4TB's performance. Notably, 78.8% of the TB cases identified during this population-based survey were asymptomatic and therefore triaged for sputum collection on the basis of CAD4TBv5 score alone. While CAD4TBv6 has the potential to replace radiologists for triaging CXRs in TB prevalence surveys, population-specific piloting is necessary to set the appropriate triaging thresholds. Further work on image analysis strategies is needed to identify radiologically subtle active TB.

10.

Experimental and computational analyses reveal that environmental restrictions shape HIV-1 spread in 3D cultures.

Imle, Andrea; Kumberger, Peter; Schnellbächer, Nikolas D; Fehr, Jana; Carrillo-Bustamante, Paola; Ales, Janez; Schmidt, Philip; Ritter, Christian; Godinez, William J; Müller, Barbara; Rohr, Karl; Hamprecht, Fred A; Schwarz, Ulrich S; Graw, Frederik; Fackler, Oliver T.

Nat Commun ; 10(1): 2144, 2019 05 13.

Article in English | MEDLINE | ID: mdl-31086185

ABSTRACT

Pathogens face varying microenvironments in vivo, but suitable experimental systems and analysis tools to dissect how three-dimensional (3D) tissue environments impact pathogen spread are lacking. Here we develop an Integrative method to Study Pathogen spread by Experiment and Computation within Tissue-like 3D cultures (INSPECT-3D), combining quantification of pathogen replication with imaging to study single-cell and cell population dynamics. We apply INSPECT-3D to analyze HIV-1 spread between primary human CD4 T-lymphocytes using collagen as tissue-like 3D-scaffold. Measurements of virus replication, infectivity, diffusion, cellular motility and interactions are combined by mathematical analyses into an integrated spatial infection model to estimate parameters governing HIV-1 spread. This reveals that environmental restrictions limit infection by cell-free virions but promote cell-associated HIV-1 transmission. Experimental validation identifies cell motility and density as essential determinants of efficacy and mode of HIV-1 spread in 3D. INSPECT-3D represents an adaptable method for quantitative time-resolved analyses of 3D pathogen spread.

Subject(s)

CD4-Positive T-Lymphocytes/virology , HIV-1/pathogenicity , Models, Biological , Primary Cell Culture/methods , Virus Physiological Phenomena , CD4-Positive T-Lymphocytes/physiology , Cell Movement , Cells, Cultured , Computer Simulation , HEK293 Cells , HIV-1/physiology , Healthy Volunteers , Humans

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL