Search | VHL Regional Portal

Deciphering clinical abbreviations with a privacy protecting machine learning system.

Rajkomar, Alvin; Loreaux, Eric; Liu, Yuchen; Kemp, Jonas; Li, Benny; Chen, Ming-Jun; Zhang, Yi; Mohiuddin, Afroz; Gottweis, Juraj.

Nat Commun ; 13(1): 7456, 2022 12 02.

Article in English | MEDLINE | ID: mdl-36460656

ABSTRACT

Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing "HIT" for "heparin-induced thrombocytopenia"), ambiguous terms that require expertise to disambiguate (using "MS" for "multiple sclerosis" or "mental status"), or domain-specific vernacular ("cb" for "complicated by"). Here we train machine learning models on public web data to decode such text by replacing abbreviations with their meanings. We report a single translation model that simultaneously detects and expands thousands of abbreviations in real clinical notes, with accuracies ranging from 92.1% to 97.1% on multiple external test datasets. The model equals or exceeds the performance of board-certified physicians (97.6% vs 88.7% total accuracy). Our results demonstrate a general method to contextually decipher abbreviations and shorthand that is built without any privacy-compromising data.
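To make the task in this abstract concrete, here is a minimal sketch of context-dependent abbreviation expansion. It is not the authors' method: the abstract describes a trained translation model, whereas this toy uses a hand-written lookup with cue words. The `SENSES` table, its cue words, and the `expand` function are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical sense inventory (illustrative only, not from the paper).
# Each abbreviation maps to candidate expansions paired with context cue words.
SENSES = {
    "MS": [
        ("multiple sclerosis", {"relapsing", "demyelinating", "lesions", "neurology"}),
        ("mental status", {"altered", "confused", "oriented", "exam"}),
    ],
    "HIT": [("heparin-induced thrombocytopenia", set())],
    "cb": [("complicated by", set())],
}


def expand(note: str) -> str:
    """Replace known abbreviations, picking the sense whose cue words
    overlap most with the words in the note."""
    context = set(re.findall(r"[a-z]+", note.lower()))

    def pick(match: re.Match) -> str:
        token = match.group(0)
        candidates = SENSES.get(token)  # case-sensitive lookup
        if candidates is None:
            return token  # not a known abbreviation: leave unchanged
        # max() returns the first candidate on ties, so list order is the fallback.
        best, _cues = max(candidates, key=lambda c: len(c[1] & context))
        return best

    return re.sub(r"\b[A-Za-z]+\b", pick, note)


print(expand("Pt with altered MS after surgery"))
print(expand("Neurology follow-up for relapsing MS"))
```

The same token "MS" expands differently in the two notes because the surrounding words differ, which is the ambiguity the abstract says requires expertise to resolve. A learned model replaces the hand-written cue sets with context representations learned from data.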

Subject(s)

Multiple Sclerosis , Physicians , Thrombocytopenia , Humans , Privacy , Machine Learning , Writing

User-centred design for machine learning in health care: a case study from care management.

Seneviratne, Martin G; Li, Ron C; Schreier, Meredith; Lopez-Martinez, Daniel; Patel, Birju S; Yakubovich, Alex; Kemp, Jonas B; Loreaux, Eric; Gamble, Paul; El-Khoury, Kristel; Vardoulakis, Laura; Wong, Doris; Desai, Janjri; Chen, Jonathan H; Morse, Keith E; Downing, N Lance; Finger, Lutz T; Chen, Ming-Jun; Shah, Nigam.

BMJ Health Care Inform ; 29(1), 2022 Oct.

Article in English | MEDLINE | ID: mdl-36220304

ABSTRACT

OBJECTIVES: Few machine learning (ML) models are successfully deployed in clinical practice. One of the common pitfalls across the field is inappropriate problem formulation: designing ML to fit the data rather than to address a real-world clinical pain point. METHODS: We introduce a practical toolkit for user-centred design consisting of four questions covering: (1) solvable pain points, (2) the unique value of ML (eg, automation and augmentation), (3) the actionability pathway and (4) the model's reward function. This toolkit was implemented in a series of six participatory design workshops with care managers in an academic medical centre. RESULTS: Pain points amenable to ML solutions included outpatient risk stratification and risk factor identification. The endpoint definitions, triggering frequency and evaluation metrics of the proposed risk scoring model were directly influenced by care manager workflows and real-world constraints. CONCLUSIONS: Integrating user-centred design early in the ML life cycle is key for configuring models in a clinically actionable way. This toolkit can guide problem selection and influence choices about the technical setup of the ML problem.

Subject(s)

Machine Learning , User-Centered Design , Delivery of Health Care , Humans , Pain , Workflow

Multitask prediction of organ dysfunction in the intensive care unit using sequential subnetwork routing.

Roy, Subhrajit; Mincu, Diana; Loreaux, Eric; Mottram, Anne; Protsyuk, Ivan; Harris, Natalie; Xue, Yuan; Schrouff, Jessica; Montgomery, Hugh; Connell, Alistair; Tomasev, Nenad; Karthikesalingam, Alan; Seneviratne, Martin.

J Am Med Inform Assoc ; 28(9): 1936-1946, 2021 08 13.

Article in English | MEDLINE | ID: mdl-34151965

ABSTRACT

OBJECTIVE: Multitask learning (MTL) using electronic health records allows concurrent prediction of multiple endpoints. MTL has shown promise in improving model performance and training efficiency; however, it often suffers from negative transfer, that is, impaired learning when tasks are not appropriately selected. We introduce a sequential subnetwork routing (SeqSNR) architecture that uses soft parameter sharing to find related tasks and encourage cross-learning between them. MATERIALS AND METHODS: Using the MIMIC-III (Medical Information Mart for Intensive Care-III) dataset, we train deep neural network models to predict the onset of 6 endpoints including specific organ dysfunctions and general clinical outcomes: acute kidney injury, continuous renal replacement therapy, mechanical ventilation, vasoactive medications, mortality, and length of stay. We compare single-task (ST) models with naive multitask and SeqSNR in terms of discriminative performance and label efficiency. RESULTS: SeqSNR showed a modest yet statistically significant performance boost across 4 of 6 tasks compared with ST and naive multitasking. When the size of the training dataset was reduced for a given task (label efficiency), SeqSNR outperformed ST in all cases, showing an average area under the precision-recall curve boost of 2.1%, 2.9%, and 2.1% for tasks using 1%, 5%, and 10% of labels, respectively. CONCLUSIONS: The SeqSNR architecture shows superior label efficiency compared with ST and naive multitasking, suggesting utility in scenarios in which endpoint labels are difficult to ascertain.
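The results above are reported as area under the precision-recall curve. The abstract does not say how that area was computed; the sketch below uses average precision, one common estimator, in plain Python. The function name `average_precision` and the example labels and scores are assumptions made for illustration, and tied scores are not given any special handling.

```python
def average_precision(labels, scores):
    """Estimate the area under the precision-recall curve as average precision:
    the mean of the precision values at the rank of each positive example."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_score, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_pos


# Made-up example: positives ranked 1st and 3rd out of four.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(average_precision(labels, scores))
```

Unlike the area under the ROC curve, this metric depends on the prevalence of the positive class, so it is often preferred for rare endpoints.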

Subject(s)

Machine Learning , Multiple Organ Failure , Electronic Health Records , Humans , Intensive Care Units , Neural Networks, Computer
