A Large Language Modelling Deep Learning Framework for the Next Pandemic (preprint)

Tingting Zhu; Xian Wu; Bang Yang; Chenyu You; Chenyang Wang; Lei Lu; Zhangdaihong Liu; Yefeng Zheng; Xu Sun; Yang Yang; David Clifton; Fenglin Liu

This article is a Preprint

Preprints are preliminary research reports that have not been certified by peer review. They should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Preprints posted online allow authors to receive rapid feedback, and the entire scientific community can appraise the work and respond appropriately. Those comments are posted alongside the preprints for anyone to read and serve as a post-publication assessment.

researchsquare; 2023.

Preprint in English | PREPRINT-RESEARCHSQUARE | ID: ppzbmed-10.21203.rs.3.rs-2777372.v1

ABSTRACT

Deep neural networks have been integrated into the entire clinical decision process, where they can improve diagnostic efficiency and alleviate the heavy workload of physicians. Typical applications include 1) medical report generation, 2) disease classification, and 3) survival prediction. Since most neural networks are supervised, their quality depends heavily on the volume and quality of available labels. However, for novel diseases, e.g., new pandemics or new variants, few labels exist. In addition, acquiring enough new pandemic cases to collect sufficient labels for training is time-consuming, and such labels are typically unavailable at the early stage. To prepare neural networks for the next pandemic, we propose a large language model framework, Unsupervised Learning from Unlabelled Medical Images and Text (ULUMIT), which can learn broad medical knowledge (e.g., image understanding, text semantics, and clinical phenotypes) from unlabelled data. As a result, when a new pandemic emerges, our framework can be rapidly deployed and easily adapted to it with extremely limited labels. Furthermore, ULUMIT supports medical data across the visual modality (e.g., chest X-ray and CT) and the textual modality (e.g., medical reports and free-text clinical notes); therefore, it can be used for any clinical task that involves both visual and textual medical data. We demonstrate the effectiveness of ULUMIT by showing how it would perform using the COVID-19 pandemic "in replay". In particular, in the retrospective setting, we test the model on early COVID-19 datasets, and in the prospective setting, we test the model on the new COVID-19 variant Omicron. The experiments cover 1) three kinds of input medical data: image-only, text-only, and image-text; 2) three kinds of downstream tasks: medical reporting, diagnosis, and prognosis; 3) five public COVID-19 datasets; and 4) three languages: English, Chinese, and Spanish. All experiments consistently show that our framework can provide accurate and robust COVID-19 decision support with little labelled data (for example, using information from only one patient), which could benefit medical data analysis during the early stage of the next pandemic. Besides COVID-19, our framework can be applied to identify 14 common thorax diseases and tuberculosis across five additional public datasets, demonstrating its generalization and transferability. In brief, our framework achieves state-of-the-art performance on ten datasets.

Subject(s)

Language Disorders; Tuberculosis; COVID-19

Full text: Available | Collection: Preprints | Database: PREPRINT-RESEARCHSQUARE | Main subject: Tuberculosis / COVID-19 / Language Disorders | Language: English | Year: 2023 | Document Type: Preprint
