Results 1 - 10 of 10
1.
Preprint in English | EuropePMC | ID: ppcovidwho-295329

ABSTRACT

Despite rapid advances in clinical natural language processing (NLP), there remains resistance in the clinical and translational research community to adopting NLP models because of their limited transparency, interpretability, and usability. Building on our previous work, in this study we proposed an open NLP development framework and evaluated it by implementing NLP algorithms for the National COVID Cohort Collaborative (N3C). Motivated by the need for information extraction from COVID-19-related clinical notes, our work includes 1) an open data annotation process using COVID-19 signs and symptoms as the use case, 2) a community-driven ruleset composing platform, and 3) a synthetic text data generation workflow that generates texts for information extraction tasks without involving human subjects. The corpora generated from the texts of multiple institutions and the gold-standard annotations were tested against a single institution's ruleset, achieving F1 scores of 0.876, 0.706, and 0.694, respectively. The study, a consortium effort of the N3C NLP subgroup, demonstrates the feasibility of creating a federated NLP algorithm development and benchmarking platform to enhance multi-institution clinical NLP studies.
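To make the ruleset idea above concrete, here is a minimal sketch of pattern-based sign/symptom extraction from note text (the terms and rules are illustrative assumptions, not the N3C community-composed ruleset):

```python
import re

# Hypothetical symptom rules for illustration only.
SYMPTOM_RULES = {
    "fever": r"\b(fever|febrile|pyrexia)\b",
    "cough": r"\b(cough|coughing)\b",
    "dyspnea": r"\b(dyspnea|shortness of breath|sob)\b",
}

def extract_symptoms(note: str):
    """Return (concept, matched_text, start, end) tuples found in a clinical note."""
    annotations = []
    for concept, pattern in SYMPTOM_RULES.items():
        for m in re.finditer(pattern, note, flags=re.IGNORECASE):
            annotations.append((concept, m.group(0), m.start(), m.end()))
    return annotations

print(extract_symptoms("Patient reports fever and shortness of breath, no cough."))
```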

2.
EBioMedicine ; 74: 103722, 2021 Nov 25.
Article in English | MEDLINE | ID: covidwho-1536517

ABSTRACT

BACKGROUND: Numerous publications describe the clinical manifestations of post-acute sequelae of SARS-CoV-2 (PASC or "long COVID"), but they are difficult to integrate because of heterogeneous methods and the lack of a standard for denoting the many phenotypic manifestations. Patient-led studies are of particular importance for understanding the natural history of COVID-19, but integration is hampered because they often use different terms to describe the same symptom or condition. This significant disparity in patient versus clinical characterization motivated the proposed ontological approach to specifying manifestations, which will improve capture and integration of future long COVID studies. METHODS: The Human Phenotype Ontology (HPO) is a widely used standard for exchange and analysis of phenotypic abnormalities in human disease but has not yet been applied to the analysis of COVID-19. FINDINGS: We identified 303 articles published before April 29, 2021, curated 59 relevant manuscripts that described clinical manifestations in 81 cohorts three weeks or more following acute COVID-19, and mapped 287 unique clinical findings to HPO terms. We present layperson synonyms and definitions that can be used to link patient self-report questionnaires to standard medical terminology. Long COVID clinical manifestations are not assessed consistently across studies, and most manifestations have been reported with a wide range of synonyms by different authors. Across at least 10 cohorts, authors reported 31 unique clinical features corresponding to HPO terms; the most commonly reported feature was Fatigue (median 45.1%) and the least commonly reported was Nausea (median 3.9%), but the reported percentages varied widely between studies. INTERPRETATION: Translating long COVID manifestations into computable HPO terms will improve analysis, data capture, and classification of long COVID patients. If researchers, clinicians, and patients share a common language, then studies can be compared and pooled more effectively. Furthermore, mapping lay terminology to HPO will help patients assist clinicians and researchers in creating phenotypic characterizations that are computationally accessible, thereby improving the stratification, diagnosis, and treatment of long COVID. FUNDING: U24TR002306; UL1TR001439; P30AG024832; GBMF4552; R01HG010067; UL1TR002535; K23HL128909; UL1TR002389; K99GM145411.
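As a rough illustration of the lay-synonym-to-HPO mapping described above, the fragment below maps a few hypothetical patient phrases to HPO labels and IDs (the synonym table is invented for illustration, not the study's curated mapping; term IDs should be verified against the current HPO release):

```python
# Hypothetical lay-synonym-to-HPO fragment; verify IDs against the current HPO release.
LAY_TO_HPO = {
    "tired all the time": ("Fatigue", "HP:0012378"),
    "worn out": ("Fatigue", "HP:0012378"),
    "feeling sick to my stomach": ("Nausea", "HP:0002018"),
    "can't catch my breath": ("Dyspnea", "HP:0002094"),
}

def map_patient_report(free_text_terms):
    """Map patient-reported phrases to (HPO label, HPO ID); None if unmapped."""
    return {t: LAY_TO_HPO.get(t.lower().strip()) for t in free_text_terms}

print(map_patient_report(["Tired all the time", "brain fog"]))
```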

3.
JAMIA Open ; 4(3): ooab070, 2021 Jul.
Article in English | MEDLINE | ID: covidwho-1369113

ABSTRACT

Objective: With COVID-19, there was a need for a rapidly scalable annotation system that facilitated real-time integration with clinical decision support systems (CDS). Current annotation systems suffer from high resource utilization and poor scalability, limiting real-world integration with CDS. A potential solution to mitigate these issues is to use the rule-based gazetteer developed at our institution. Materials and Methods: Performance, resource utilization, and runtime of the rule-based gazetteer were compared with five annotation systems: BioMedICUS, cTAKES, MetaMap, CLAMP, and MedTagger. Results: The rule-based gazetteer was the fastest, had a low resource footprint, and showed performance similar to the other annotation systems on weighted microaverage and macroaverage measures of precision, recall, and F1 score. Discussion: Opportunities to increase its performance include fine-tuning the lexical rules for symptom identification. Additionally, it could run on multiple compute nodes for faster runtime. Conclusion: The rule-based gazetteer overcame key technical limitations, facilitating real-time symptomatology identification for COVID-19 and integration of unstructured data elements into our CDS. It is well suited for large-scale deployment across a wide variety of healthcare settings for surveillance of acute COVID-19 symptoms and their integration into prognostic modeling. Such a system is currently being leveraged for monitoring of postacute sequelae of COVID-19 (PASC) progression in COVID-19 survivors. This study conducted the first in-depth analysis and developed a rule-based gazetteer for COVID-19 symptom extraction with the following key features: low processor and memory utilization, faster runtime, and weighted microaverage and macroaverage precision, recall, and F1 score comparable to industry-standard annotation systems.
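A minimal sketch of the dictionary-lookup idea behind a rule-based gazetteer follows (the entries and concept labels are hypothetical, not the institution's actual gazetteer):

```python
# Hypothetical gazetteer entries mapping surface phrases to concept labels.
GAZETTEER = {
    "shortness of breath": "dyspnea",
    "dry cough": "cough",
    "loss of taste": "ageusia",
    "fever": "fever",
}

def annotate(text: str):
    """Simple lookup of gazetteer phrases in lowercased text, longest phrases first."""
    text_lc = text.lower()
    hits = []
    for phrase in sorted(GAZETTEER, key=len, reverse=True):
        start = text_lc.find(phrase)
        while start != -1:
            hits.append((GAZETTEER[phrase], phrase, start, start + len(phrase)))
            start = text_lc.find(phrase, start + len(phrase))
    return hits

print(annotate("Reports fever, dry cough, and loss of taste since Monday."))
```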

4.
Mayo Clin Proc ; 96(7): 1890-1895, 2021 07.
Article in English | MEDLINE | ID: covidwho-1202099

ABSTRACT

Predictive models have played a critical role in the local, national, and international response to the COVID-19 pandemic. In the United States, health care systems and governmental agencies have relied on several models, such as those from the Institute for Health Metrics and Evaluation, Youyang Gu (YYG), the Massachusetts Institute of Technology, and the Centers for Disease Control and Prevention ensemble, to predict short- and long-term trends in disease activity. The Mayo Clinic Bayesian SIR model, recently made publicly available, has informed Mayo Clinic practice leadership at all sites across the United States and has been shared with Minnesota governmental leadership to help inform critical decisions during the past year. One key to the accuracy of the Mayo Clinic model is its ability to adapt to the constantly changing dynamics of the pandemic and the uncertainties of human behavior, such as changes in the rate of contact among the population over time and by geographic location, and now the emergence of new virus variants. The Mayo Clinic model can also be used to forecast COVID-19 trends in different hypothetical worlds: one in which no vaccine is available, one in which vaccinations are no longer accepted from this point forward, and one in which 75% of the population is already vaccinated. Surveys indicate that half of American adults are hesitant to receive a COVID-19 vaccine, and lack of understanding of the benefits of vaccination is an important barrier to uptake. The focus of this paper is to illustrate the stark contrast between these 3 scenarios and to demonstrate, mathematically, the benefit of high vaccine uptake on the future course of the pandemic.


Subject(s)
COVID-19 Vaccines , COVID-19/prevention & control , COVID-19/epidemiology , Forecasting , Hospitalization/statistics & numerical data , Hospitalization/trends , Humans , United States/epidemiology
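The scenario comparison described above can be illustrated with a generic discrete-time SIR simulation (illustrative parameters only; this is not the Mayo Clinic Bayesian SIR model):

```python
# Generic discrete-time SIR sketch. beta is the effective contact rate, gamma
# the recovery rate; vaccinated_frac removes people from the susceptible pool.
def simulate_sir(population, initial_infected, beta, gamma, vaccinated_frac, days):
    s = population * (1 - vaccinated_frac) - initial_infected
    i = initial_infected
    r = population * vaccinated_frac  # vaccinated people treated as immune here
    history = []
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

# Compare a no-vaccination world with 75% uptake, echoing the scenarios above.
for frac in (0.0, 0.75):
    peak_infected = max(i for _, i, _ in simulate_sir(1_000_000, 100, 0.3, 0.1, frac, 365))
    print(f"vaccinated {frac:.0%}: peak infected ~ {peak_infected:,.0f}")
```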
5.
Mayo Clin Proc ; 96(3): 690-698, 2021 03.
Article in English | MEDLINE | ID: covidwho-1002862

ABSTRACT

In March 2020, our institution developed an interdisciplinary predictive analytics task force to provide coronavirus disease 2019 (COVID-19) hospital census forecasting to help clinical leaders understand the potential impacts on hospital operations. As the situation unfolded into a pandemic, our task force provided predictive insights through a structured set of visualizations and key messages that have helped the practice anticipate and react to changing operational needs and opportunities. The framework shared here for the deployment of a COVID-19 predictive analytics task force could be adapted for effective implementation at other institutions to provide evidence-based messaging for operational decision-making. For hospitals without such a structure, immediate consideration may be warranted in light of the devastating COVID-19 third wave that arrived in winter 2020-2021.


Subject(s)
COVID-19/therapy , Decision Making , Disease Management , Hospitals/statistics & numerical data , Intensive Care Units/statistics & numerical data , Pandemics , SARS-CoV-2 , COVID-19/epidemiology , Forecasting , Humans
6.
J Biomed Inform ; 113: 103660, 2021 01.
Article in English | MEDLINE | ID: covidwho-972883

ABSTRACT

Coronavirus Disease 2019 has emerged as a significant global concern, triggering harsh public health restrictions in a successful bid to curb its exponential growth. As discussion shifts toward relaxation of these restrictions, there is significant concern about a second-wave resurgence. The key to managing these outbreaks is early detection and intervention, yet there is a significant lag associated with using laboratory-confirmed cases for surveillance purposes. To address this, syndromic surveillance can provide a timelier alternative for first-line screening. Existing syndromic surveillance solutions, however, are typically focused on a known disease and have limited capability to distinguish between outbreaks of individual diseases sharing similar syndromes. This poses a challenge for surveillance of COVID-19, as its active periods tend to overlap temporally with other influenza-like illnesses. In this study, we explore sentinel syndromic surveillance for COVID-19 and other influenza-like illnesses using a deep learning-based approach. Our method performs aberration detection with autoencoders that leverage symptom prevalence distributions to distinguish outbreaks of two ongoing diseases that share similar syndromes, even if they occur concurrently. We first demonstrate that this approach works for detection of influenza outbreaks, which have known temporal boundaries. We then demonstrate that the autoencoder can be trained not to alert on known and well-managed influenza-like illnesses such as the common cold and influenza. Finally, we applied our approach to 2019-2020 data in the context of a COVID-19 syndromic surveillance task to demonstrate how such a system could have provided early warning of an outbreak of a novel influenza-like illness that did not match the symptom prevalence profile of influenza and other known influenza-like illnesses.


Subject(s)
COVID-19/epidemiology , Influenza, Human/epidemiology , Sentinel Surveillance , COVID-19/virology , Deep Learning , Disease Outbreaks , Humans , SARS-CoV-2/isolation & purification
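A small sketch of the reconstruction-error idea behind autoencoder-based aberration detection, using synthetic symptom-prevalence vectors and scikit-learn's MLPRegressor as a stand-in for the paper's autoencoder:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic daily symptom-prevalence vectors: 300 "normal" days and 20 days
# with a shifted profile standing in for a novel influenza-like illness.
rng = np.random.default_rng(0)
baseline = rng.dirichlet(np.ones(8), size=300)
novel = rng.dirichlet(np.array([5, 1, 1, 1, 1, 1, 1, 1]), size=20)

# Train the network to reconstruct its own input (an autoencoder surrogate).
autoencoder = MLPRegressor(hidden_layer_sizes=(3,), max_iter=5000, random_state=0)
autoencoder.fit(baseline, baseline)

def reconstruction_error(x):
    return np.mean((autoencoder.predict(x) - x) ** 2, axis=1)

# Alert when reconstruction error exceeds the 99th percentile of normal days.
threshold = np.percentile(reconstruction_error(baseline), 99)
alerts = reconstruction_error(novel) > threshold
print(f"{alerts.sum()} of {len(novel)} novel-profile days flagged")
```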
7.
Curr Opin Electrochem ; 23: 174-184, 2020 Oct.
Article in English | MEDLINE | ID: covidwho-778680

ABSTRACT

Herein, we summarize and discuss biomarkers and indicators used for the detection of severe acute respiratory syndrome coronavirus 2. Antibody detection methods are not considered suitable for screening individuals at early stages or asymptomatic cases. Diagnosis of coronavirus disease 2019 using biomarkers and indicators at the point of care is crucial, so there is an urgent need to develop rapid and sensitive detection methods that target antigens. We critically elaborate on the key role of biosensors in coping with the outbreak. In this review, the importance of biosensors, including electrochemical, surface-enhanced Raman scattering, field-effect transistor, and surface plasmon resonance biosensors, in the detection of severe acute respiratory syndrome coronavirus 2 is underscored. Finally, we outline the pros and cons of diagnostic approaches and future directions.

8.
Mayo Clin Proc ; 95(11): 2370-2381, 2020 11.
Article in English | MEDLINE | ID: covidwho-722758

ABSTRACT

OBJECTIVE: To evaluate whether a digital surveillance model using Google Trends is feasible for obtaining accurate data on coronavirus disease 2019 and whether accurate predictions can be made regarding new cases. METHODS: Data on total and daily new cases in each US state were collected from January 22, 2020, to April 6, 2020. Information regarding 10 keywords was collected from Google Trends, and correlation analyses were performed for individual states as well as for the United States overall. RESULTS: Among the 10 keywords analyzed from Google Trends, face mask, Lysol, and COVID stimulus check had the strongest correlations when looking at the United States as a whole, with R values of 0.88, 0.82, and 0.79, respectively. Lag and lead Pearson correlations were assessed for every state and all 10 keywords from 16 days before the first case in each state to 16 days after the first case. Strong correlations were seen up to 16 days prior to the first reported cases in some states. CONCLUSION: This study documents the feasibility of syndromic surveillance of internet search terms to monitor new infectious diseases such as coronavirus disease 2019. This information could enable better preparation and planning of health care systems.


Subject(s)
Consumer Health Information , Coronavirus Infections/epidemiology , Information Seeking Behavior , Internet/trends , Pneumonia, Viral/epidemiology , Public Health Surveillance/methods , Search Engine/trends , Betacoronavirus , COVID-19 , Humans , Pandemics , SARS-CoV-2 , United States/epidemiology
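The lag/lead correlation analysis described above can be sketched as follows, using synthetic series in place of Google Trends and case-count data:

```python
import numpy as np

# Synthetic stand-ins: a smoothed daily case series and a search-interest
# series constructed to lead cases by about 10 days.
rng = np.random.default_rng(1)
days = 120
cases = np.convolve(rng.poisson(50, days), np.ones(7) / 7, mode="same")
search_interest = np.roll(cases, -10) + rng.normal(0, 2, days)

def best_lag(x, y, max_lag=16):
    """Return (lag, r) maximizing the Pearson correlation of x against y shifted by lag."""
    n = len(x)
    results = []
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[: n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[: n + lag]
        results.append((lag, np.corrcoef(a, b)[0, 1]))
    return max(results, key=lambda t: t[1])

print(best_lag(search_interest, cases))  # expect a peak near lag 10
```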
9.
J Am Med Inform Assoc ; 27(9): 1437-1442, 2020 07 01.
Article in English | MEDLINE | ID: covidwho-610367

ABSTRACT

Large observational data networks that leverage routine clinical practice data in electronic health records (EHRs) are critical resources for research on coronavirus disease 2019 (COVID-19). Data normalization is a key challenge for the secondary use of EHRs for COVID-19 research across institutions. In this study, we addressed the challenge of automating the normalization of COVID-19 diagnostic tests, which are critical data elements, but for which controlled terminology terms were published after clinical implementation. We developed a simple but effective rule-based tool called COVID-19 TestNorm to automatically normalize local COVID-19 testing names to standard LOINC (Logical Observation Identifiers Names and Codes) codes. COVID-19 TestNorm was developed and evaluated using 568 test names collected from 8 healthcare systems. Our results show that it could achieve an accuracy of 97.4% on an independent test set. COVID-19 TestNorm is available as an open-source package for developers and as an online Web application for end users (https://clamp.uth.edu/covid/loinc.php). We believe that it will be a useful tool to support secondary use of EHRs for research on COVID-19.


Subject(s)
Betacoronavirus , Clinical Laboratory Techniques/classification , Coronavirus Infections/diagnosis , Logical Observation Identifiers Names and Codes , Pneumonia, Viral/diagnosis , Terminology as Topic , COVID-19 , COVID-19 Testing , Coronavirus Infections/classification , Electronic Health Records , Humans , Pandemics , SARS-CoV-2
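A minimal sketch in the spirit of rule-based test-name normalization follows (the rules are hypothetical and are not the actual COVID-19 TestNorm rules; the LOINC code is shown for illustration only and should be verified against the current LOINC release):

```python
import re

# Hypothetical normalization rules: keyword patterns mapped to target codes.
RULES = [
    (re.compile(r"(pcr|naa|rna|molecular)", re.I), "94500-6"),   # SARS-CoV-2 RNA by NAA (illustrative)
    (re.compile(r"(igg|igm|antibod)", re.I), "ANTIBODY-PANEL"),  # placeholder, not a LOINC code
]

def normalize_test_name(local_name: str):
    """Return the first matching code for a local test name, or None."""
    for pattern, code in RULES:
        if pattern.search(local_name):
            return code
    return None

for name in ["SARS COV 2 RNA PCR NASOPHARYNX", "COVID-19 IgG Antibody, Serum"]:
    print(name, "->", normalize_test_name(name))
```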
10.
J Am Med Inform Assoc ; 27(8): 1259-1267, 2020 08 01.
Article in English | MEDLINE | ID: covidwho-381884

ABSTRACT

OBJECTIVE: As coronavirus disease 2019 (COVID-19) started its rapid emergence and gradually transformed into an unprecedented pandemic, the need for a knowledge repository for the disease became crucial. To address this issue, a new COVID-19 machine-readable dataset known as the COVID-19 Open Research Dataset (CORD-19) was released. Building on this, our objective was to construct computable co-occurrence network embeddings to assist association detection among COVID-19-related biomedical entities. MATERIALS AND METHODS: Leveraging a Linked Data version of CORD-19 (ie, CORD-19-on-FHIR), we first utilized SPARQL to extract co-occurrences among chemicals, diseases, genes, and mutations and build a co-occurrence network. We then trained the representation of the derived co-occurrence network using node2vec with 4 edge embedding operations (L1, L2, Average, and Hadamard). Six algorithms (decision tree, logistic regression, support vector machine, random forest, naïve Bayes, and multilayer perceptron) were applied to evaluate performance on link prediction. An unsupervised learning strategy was also developed incorporating the t-SNE (t-distributed stochastic neighbor embedding) and DBSCAN (density-based spatial clustering of applications with noise) algorithms for case studies. RESULTS: The random forest classifier showed the best performance on link prediction across the different network embeddings. For edge embeddings generated using the Average operation, random forest achieved the optimal average precision of 0.97 along with an F1 score of 0.90. For unsupervised learning, 63 clusters were formed with a silhouette score of 0.128. Significant associations were detected for 5 coronavirus infectious diseases in their corresponding subgroups. CONCLUSIONS: In this study, we constructed COVID-19-centered co-occurrence network embeddings. Results indicated that the generated embeddings were able to extract significant associations for COVID-19 and coronavirus infectious diseases.


Subject(s)
Algorithms , Coronavirus Infections , Neural Networks, Computer , Pandemics , Pneumonia, Viral , Bayes Theorem , COVID-19 , Datasets as Topic , Decision Trees , Humans , Logistic Models , ROC Curve , Software , Support Vector Machine
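The edge-embedding operations and link-prediction setup described above can be sketched as follows (random vectors and labels stand in for the node2vec embeddings and the real co-occurrence edges):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-ins for node2vec node vectors and candidate edges.
rng = np.random.default_rng(42)
node_vecs = rng.normal(size=(100, 16))
edges = rng.integers(0, 100, size=(500, 2))       # candidate (u, v) pairs
labels = rng.integers(0, 2, size=500)             # 1 = edge exists, 0 = negative sample

def edge_embedding(u, v, op="average"):
    """Combine two node vectors into an edge vector with Average or Hadamard."""
    if op == "average":
        return (node_vecs[u] + node_vecs[v]) / 2
    if op == "hadamard":
        return node_vecs[u] * node_vecs[v]
    raise ValueError(op)

X = np.array([edge_embedding(u, v, "average") for u, v in edges])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:400], labels[:400])
# Labels are random here, so held-out accuracy will be near chance; real labels
# come from observed co-occurrence edges versus negative-sampled non-edges.
print("held-out accuracy:", clf.score(X[400:], labels[400:]))
```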