This article is a Preprint
Preprints are preliminary research reports that have not been certified by peer review. They should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Posting preprints online allows authors to receive rapid feedback, and the entire scientific community can appraise the work for themselves and respond appropriately. Comments are posted alongside the preprint for anyone to read and serve as a post-publication assessment.
RadImageNet: A Large-scale Radiologic Dataset for Enhancing Deep Learning Transfer Learning Research (preprint)
Research Square; 2021.
Preprint
in English
| PREPRINT-RESEARCHSQUARE | ID: ppzbmed-10.21203.rs.3.rs-600803.v1
ABSTRACT
Most current medical imaging Artificial Intelligence (AI) relies on transfer learning using convolutional neural networks (CNNs) trained on ImageNet, a large database of natural-world images such as cats, dogs, and vehicles. The size, diversity, and similarity of the source data determine the success of transfer learning on the target data. ImageNet is large and diverse, but its natural-world images are markedly dissimilar to medical images, leading Cheplygina to ask, “Why do we still use images of cats to help Artificial Intelligence interpret CAT scans?”. We present an equally large and diversified database, RadImageNet, consisting of 5 million annotated CT, MRI, and ultrasound images of musculoskeletal, neurologic, oncologic, gastrointestinal, endocrine, and pulmonary pathologies from over 450,000 patients. The database is unprecedented in scale and breadth in the medical imaging field, constituting a more appropriate basis for medical imaging transfer learning applications. We found that RadImageNet transfer learning outperformed ImageNet in multiple independent applications, including improvements in bone age prediction from hand and wrist x-rays by 1.75 months (p<0.0001), pneumonia detection in ICU chest x-rays by 0.85% (p<0.0001), ACL tear detection on MRI by 10.72% (p<0.0001), SARS-CoV-2 detection on chest CT by 0.25% (p<0.0001), and hemorrhage detection on head CT by 0.13% (p<0.0001). These results indicate that our open-sourced pre-trained models will be a better starting point for transfer learning in radiologic imaging AI applications, including applications involving medical imaging modalities or anatomies not included in the RadImageNet database.
Full text: Available
Collection: Preprints
Database: PREPRINT-RESEARCHSQUARE
Language: English
Year: 2021
Document Type: Preprint