Sequestration of Imaging Studies in MIDRC: A Multi-Institutional Data Commons
Medical Imaging 2022: Image Perception, Observer Performance, and Technology Assessment; 12035, 2022.
Article in English | Scopus | ID: covidwho-1901882
ABSTRACT
The Medical Imaging and Data Resource Center (MIDRC) is a multi-institutional effort to accelerate medical imaging machine intelligence research and to create a publicly available image repository/commons as well as a sequestered database for performance evaluation and benchmarking of algorithms. After de-identification, approximately 80% of the medical images and associated metadata will become part of the open repository, and 20% will be sequestered and kept separate from the open commons. To ensure that both the public, open dataset and the sequestered dataset are representative of the available population, demographic characteristics must be balanced across the two datasets. Our method uses multi-dimensional stratified sampling, in which several demographic variables of interest are used sequentially to separate the data into individual strata, each representing a unique combination of variable values. Within each stratum, patients are randomly assigned to the open set (80%) or the sequestered set (20%). Thus, for p variables of interest, the balance of the p-dimensional distribution of variable combinations can be controlled. This algorithm was applied to an example COVID-19 dataset containing image exams of 4662 patients, using the variables race, age, sex at birth, and ethnicity, which contain 8, 8, 2, and 4 categories, respectively. After stratification of this dataset into the two subsets, the resulting distribution of each variable matched the distribution in the original dataset, with a maximum percent difference from its original fraction of 0.4%. These results demonstrate that the implemented process of multi-dimensional sequential stratified sampling can partition a large database while maintaining balance across several variables. © 2022 SPIE. All rights reserved.
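The stratified assignment described in the abstract can be illustrated with a minimal Python sketch. It is not the MIDRC implementation: the function names `stratified_split` and `max_fraction_difference`, the dictionary-based patient records, and the shuffle-then-cut assignment within each stratum are all assumptions made for illustration. The balance check reads "percent difference" as the absolute difference in category fractions, expressed in percentage points, which is only one possible interpretation of the reported 0.4% figure.

```python
import random
from collections import Counter, defaultdict


def stratified_split(patients, variables, sequestered_fraction=0.2, seed=0):
    """Split patients into (open_set, sequestered_set), stratum by stratum.

    Hypothetical helper: `patients` is a list of dicts and `variables`
    names the keys used for stratification.
    """
    rng = random.Random(seed)

    # Each unique combination of variable values defines one stratum.
    strata = defaultdict(list)
    for patient in patients:
        key = tuple(patient[v] for v in variables)
        strata[key].append(patient)

    open_set, sequestered_set = [], []
    for members in strata.values():
        # Assumption: shuffle the stratum, then cut it at the target fraction.
        rng.shuffle(members)
        n_sequestered = round(len(members) * sequestered_fraction)
        sequestered_set.extend(members[:n_sequestered])
        open_set.extend(members[n_sequestered:])
    return open_set, sequestered_set


def max_fraction_difference(original, subset, variables):
    """Largest absolute gap (in percentage points) between the fraction of
    any category in `original` and in `subset`, over all variables."""
    worst = 0.0
    for v in variables:
        original_counts = Counter(p[v] for p in original)
        subset_counts = Counter(p[v] for p in subset)
        for category, count in original_counts.items():
            gap = abs(count / len(original) - subset_counts[category] / len(subset))
            worst = max(worst, gap)
    return 100.0 * worst
```

With this cut rule, very small strata (for example, one or two patients) contribute no patients to the sequestered set because the count rounds down to zero. How such strata are handled in practice is a design decision that the abstract does not describe.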

Full text: Available Collection: Databases of international organizations Database: Scopus Language: English Journal: Medical Imaging 2022: Image Perception, Observer Performance, and Technology Assessment Year: 2022 Document Type: Article
