Public Covid-19 X-ray datasets and their impact on model bias - A systematic review of a significant problem.
Garcia Santa Cruz, Beatriz; Bossa, Matías Nicolás; Sölter, Jan; Husch, Andreas Dominik.
  • Garcia Santa Cruz B; Centre Hospitalier de Luxembourg, 4, Rue Ernest Barble, Luxembourg L-1210, Luxembourg; Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7, Avenue des Hauts Fourneaux, Esch-sur-Alzette L-4362, Luxembourg. Electronic address: beatriz.garcia@ext.uni.lu.
  • Bossa MN; Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7, Avenue des Hauts Fourneaux, Esch-sur-Alzette L-4362, Luxembourg; Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel (VUB), Pleinlaan 2, Brussels B-1050, Belgium.
  • Sölter J; Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7, Avenue des Hauts Fourneaux, Esch-sur-Alzette L-4362, Luxembourg.
  • Husch AD; Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7, Avenue des Hauts Fourneaux, Esch-sur-Alzette L-4362, Luxembourg.
Med Image Anal ; 74: 102225, 2021 12.
Article in English | MEDLINE | ID: covidwho-1440260
Preprint
This scientific journal article is probably based on a previously available preprint. It was identified through a machine matching algorithm; human confirmation is still pending.
ABSTRACT
Computer-aided diagnosis and stratification of COVID-19 based on chest X-ray suffer from weak bias assessment and limited quality control. Undetected bias induced by inappropriate use of datasets and improper consideration of confounders prevents the translation of prediction models into clinical practice. By adapting established tools for model evaluation to the task of evaluating datasets, this study provides a systematic appraisal of publicly available COVID-19 chest X-ray datasets, determining their potential use and evaluating potential sources of bias. Only 9 out of more than a hundred identified datasets met the minimum criteria needed for a proper assessment of risk of bias and could be analysed in detail. Remarkably, most of the datasets utilised in 201 papers published in peer-reviewed journals are not among these 9 datasets, thus leading to models with high risk of bias. This raises concerns about the suitability of such models for clinical use. This systematic review highlights the limited description of datasets employed for modelling and helps researchers select the most suitable datasets for their task.

Full text: Available | Collection: International databases | Database: MEDLINE | Main subject: COVID-19 | Type of study: Diagnostic study / Experimental Studies / Prognostic study / Reviews / Systematic review/Meta Analysis | Limits: Humans | Language: English | Journal: Med Image Anal | Journal subject: Diagnostic Imaging | Year: 2021 | Document Type: Article
