Using Primary Care Clinical Text Data and Natural Language Processing to Identify Indicators of COVID-19 in Toronto, Canada.
Meaney, Christopher; Moineddin, Rahim; Kalia, Sumeet; Aliarzadeh, Babak; Greiver, Michelle.
  • Meaney C; Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, Toronto, Canada.
  • Moineddin R; Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, Toronto, Canada.
  • Kalia S; Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, Toronto, Canada.
  • Aliarzadeh B; Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, Toronto, Canada.
  • Greiver M; Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, Toronto, Canada.
PLOS Digit Health ; 1(12): e0000150, 2022 Dec.
Article in English | MEDLINE | ID: covidwho-2271679
ABSTRACT
The objective of this study was to investigate whether a rule-based natural language processing (NLP) system, applied to primary care clinical text data, could be used to monitor COVID-19 viral activity in Toronto, Canada. We employed a retrospective cohort design. We included primary care patients with a clinical encounter between January 1, 2020 and December 31, 2020 at one of 44 participating clinical sites. During the study timeframe, Toronto experienced a first COVID-19 outbreak between March 2020 and June 2020, followed by a second viral resurgence from October 2020 through December 2020. We used an expert-derived dictionary, pattern matching tools and a contextual analyzer to classify primary care documents as 1) COVID-19 positive, 2) COVID-19 negative, or 3) unknown COVID-19 status. We applied the COVID-19 biosurveillance system across three primary care electronic medical record text streams: 1) lab text, 2) health condition diagnosis text and 3) clinical notes. We enumerated COVID-19 entities in the clinical text and estimated the proportion of patients with a positive COVID-19 record. We constructed a primary care COVID-19 NLP-derived time series and investigated its correlation with four independent, external public health series: 1) lab-confirmed COVID-19 cases, 2) COVID-19 hospitalizations, 3) COVID-19 ICU admissions, and 4) COVID-19 intubations. A total of 196,440 unique patients were observed over the study timeframe, of whom 4,580 (2.3%) had at least one positive COVID-19 document in their primary care electronic medical record. Our NLP-derived COVID-19 time series, which describes the temporal dynamics of COVID-19 positivity status over the study timeframe, showed a trend that strongly mirrored those of the external public health series under investigation.
We conclude that primary care text data passively collected from electronic medical record systems represent a high-quality, low-cost source of information for monitoring the impact of COVID-19 on community health.
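
The rule-based classification described in the abstract (dictionary lookup, pattern matching, then a contextual check) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the term list, the cue words, the 40-character context window, and the rule that a positive mention outranks a negative one are hypothetical and are not taken from the study, whose expert-derived dictionary and contextual analyzer are not specified in the abstract.

```python
import re

# Hypothetical dictionary of COVID-19 terms (illustrative only; the study's
# expert-derived dictionary is not given in the abstract).
COVID_TERMS = re.compile(r"\b(covid(?:-?19)?|sars-cov-2|coronavirus)\b", re.IGNORECASE)

# Hypothetical context cues, checked in a window around each matched term.
NEGATIVE_CUES = re.compile(
    r"\b(negative|not detected|no evidence of|ruled out)\b", re.IGNORECASE
)
POSITIVE_CUES = re.compile(
    r"\b(positive|detected|confirmed|diagnosed with)\b", re.IGNORECASE
)


def classify_document(text: str, window: int = 40) -> str:
    """Label a document 'positive', 'negative', or 'unknown' for COVID-19."""
    labels = set()
    for match in COVID_TERMS.finditer(text):
        # Look at a fixed-size character window around the matched term.
        start = max(0, match.start() - window)
        context = text[start:match.end() + window]
        # Negative cues are tested first so that "not detected" is not
        # mistaken for the positive cue "detected".
        if NEGATIVE_CUES.search(context):
            labels.add("negative")
        elif POSITIVE_CUES.search(context):
            labels.add("positive")
    # Assumed precedence: any positive mention makes the document positive.
    if "positive" in labels:
        return "positive"
    if "negative" in labels:
        return "negative"
    return "unknown"


if __name__ == "__main__":
    for note in ["SARS-CoV-2 PCR: positive", "COVID-19 swab negative"]:
        print(note, "->", classify_document(note))
```

A document that mentions a COVID-19 term without any cue in the window, or that mentions no term at all, falls into the "unknown" class, which matches the three-way labelling described in the abstract.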

Full text: Available Collection: International databases Database: MEDLINE Type of study: Cohort study / Experimental Studies / Observational study / Prognostic study Language: English Journal: PLOS Digit Health Year: 2022 Document Type: Article Affiliation country: Journal.pdig.0000150
