Sounds of COVID-19: exploring realistic performance of audio-based digital testing (preprint)

Jing Han; Tong Xia; Dimitris Spathis; Erika Bondareva; Chloë Brown; Jagmohan Chauhan; Ting Dang; Andreas Grammenos; Apinan Hasthanasombat; Andres Floto; Pietro Cicuta; Cecilia Mascolo

This article is a Preprint

arxiv; 2021.

Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2106.15523v1

ABSTRACT

Researchers have been grappling with the question of how to identify Coronavirus disease (COVID-19) cases efficiently, affordably, and at scale. Recent work has shown how audio-based approaches, which collect respiratory audio data (cough, breathing, and voice), can be used for testing; however, there has been little exploration of how biases and methodological decisions affect these tools' performance in practice. In this paper, we explore the realistic performance of audio-based digital testing of COVID-19. To investigate this, we collected a large crowdsourced respiratory audio dataset through a mobile app, alongside recent COVID-19 test results and symptoms intended as ground truth. From the collected dataset, we selected 5,240 samples from 2,478 participants and split them into participant-independent sets for model development and validation. Among these, we controlled for potential confounding factors (such as demographics and language). The unbiased model takes features extracted from breathing, cough, and voice signals as predictors and yields an AUC-ROC of 0.71 (95% CI 0.65-0.77). We further explore different unbalanced distributions to show how biases and participant splits affect performance. Finally, we discuss how the realistic model presented could be integrated in clinical practice to realize continuous, ubiquitous, sustainable and affordable testing at population scale.
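The two methodological ideas named in the abstract, participant-independent splitting and an AUC-ROC reported with a 95% confidence interval, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say how the splits or the interval were computed, so the percentile bootstrap used here is an assumption, and all function names (`participant_independent_split`, `auc_roc`, `bootstrap_ci`) and the `(participant_id, features, label)` tuple layout are hypothetical.

```python
import random


def participant_independent_split(samples, test_fraction=0.2, seed=0):
    """Split samples so that no participant appears in both sets.

    `samples` is a list of (participant_id, features, label) tuples
    (a hypothetical layout chosen for this sketch).
    """
    participants = sorted({pid for pid, _, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(participants)
    n_test = max(1, round(len(participants) * test_fraction))
    test_ids = set(participants[:n_test])
    train = [s for s in samples if s[0] not in test_ids]
    test = [s for s in samples if s[0] in test_ids]
    return train, test


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a positive case outscores a
    negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC-ROC (an assumed method,
    not necessarily the one used in the paper)."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class; draw again
        aucs.append(auc_roc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

Splitting by participant rather than by sample matters because one participant can contribute several recordings; if recordings from the same person land in both the development and validation sets, a model can score well by recognizing the speaker instead of the disease.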

Subject(s)

COVID-19; Coronavirus Infections

Full text: Available Collection: Preprints Database: PREPRINT-ARXIV Main subject: Coronavirus Infections / COVID-19 Language: English Year: 2021 Document Type: Preprint
