Results 1 - 20 of 36
1.
Sci Rep ; 11(1): 24491, 2021 12 29.
Article in English | MEDLINE | ID: covidwho-1591547

ABSTRACT

There is an ongoing need for scientific analysis to help governments and public health authorities make decisions regarding the COVID-19 pandemic. This article presents a methodology based on data mining that can offer support for coping with epidemic diseases. The methodological approach was applied in São Paulo, Rio de Janeiro and Manaus, the cities in Brazil with the most COVID-19 deaths through the first half of 2021. We aimed to predict the evolution of COVID-19 in metropolises and identify air quality and meteorological variables correlated with confirmed cases and deaths. The statistical analyses indicated the most important explanatory environmental variables, while the cluster analyses showed the potential best input variables for the forecasting models. The forecast models were built with two different algorithms and their results were compared. The relationship between epidemiological and environmental variables was particular to each of the three cities studied. Predicted periods of low solar radiation in Manaus can alert managers to a likely increase in deaths due to COVID-19. In São Paulo, an increase in the mortality rate can be indicated by drought periods. The developed models can predict new cases and deaths from COVID-19 in the studied cities. Furthermore, the methodological approach can be applied in other cities and for other epidemic diseases.


Subject(s)
COVID-19/epidemiology , COVID-19/mortality , Data Mining/methods , Brazil/epidemiology , COVID-19/pathology , Cities/epidemiology , Humans , Models, Theoretical , Morbidity , Pandemics/prevention & control , SARS-CoV-2/pathogenicity
2.
J Biomed Semantics ; 12(1): 13, 2021 07 18.
Article in English | MEDLINE | ID: covidwho-1484319

ABSTRACT

BACKGROUND: Effective response to public health emergencies, such as we are now experiencing with COVID-19, requires data sharing across multiple disciplines and data systems. Ontologies offer a powerful data sharing tool, and this holds especially for those ontologies built on the design principles of the Open Biomedical Ontologies Foundry. These principles are exemplified by the Infectious Disease Ontology (IDO), a suite of interoperable ontology modules aiming to provide coverage of all aspects of the infectious disease domain. At its center is IDO Core, a disease- and pathogen-neutral ontology covering just those types of entities and relations that are relevant to infectious diseases generally. IDO Core is extended by disease and pathogen-specific ontology modules. RESULTS: To assist the integration and analysis of COVID-19 data, and viral infectious disease data more generally, we have recently developed three new IDO extensions: IDO Virus (VIDO); the Coronavirus Infectious Disease Ontology (CIDO); and an extension of CIDO focusing on COVID-19 (IDO-COVID-19). Reflecting the fact that viruses lack cellular parts, we have introduced into IDO Core the term acellular structure to cover viruses and other acellular entities studied by virologists. We now distinguish between infectious agents - organisms with an infectious disposition - and infectious structures - acellular structures with an infectious disposition. This in turn has led to various updates and refinements of IDO Core's content. We believe that our work on VIDO, CIDO, and IDO-COVID-19 can serve as a model for yielding greater conformance with ontology building best practices. CONCLUSIONS: IDO provides a simple recipe for building new pathogen-specific ontologies in a way that allows data about novel diseases to be easily compared, along multiple dimensions, with data represented by existing disease ontologies. 
The IDO strategy, moreover, supports ontology coordination, providing a powerful method of data integration and sharing that allows physicians, researchers, and public health organizations to respond rapidly and efficiently to current and future public health crises.


Subject(s)
Biological Ontologies/statistics & numerical data , COVID-19/prevention & control , Communicable Disease Control/statistics & numerical data , Communicable Diseases/therapy , Computational Biology/statistics & numerical data , SARS-CoV-2/isolation & purification , COVID-19/epidemiology , COVID-19/virology , Communicable Disease Control/methods , Communicable Diseases/epidemiology , Communicable Diseases/transmission , Computational Biology/methods , Data Mining/methods , Data Mining/statistics & numerical data , Epidemics , Humans , Information Dissemination/methods , Public Health/methods , Public Health/statistics & numerical data , SARS-CoV-2/physiology , Semantics
3.
Molecules ; 25(12)2020 Jun 26.
Article in English | MEDLINE | ID: covidwho-1389454

ABSTRACT

Viruses can be spread from one person to another; therefore, they may cause disorders in many people, sometimes leading to epidemics and even pandemics. New, previously unstudied viruses and some specific mutant or recombinant variants of known viruses constantly appear. An example is a variant of coronaviruses (CoV) causing severe acute respiratory syndrome (SARS), named SARS-CoV-2. Some antiviral drugs, such as remdesivir, as well as antiretroviral drugs including darunavir, lopinavir, and ritonavir, are suggested to be effective in treating disorders caused by SARS-CoV-2. There are data on the utilization of antiretroviral drugs against SARS-CoV-2. Since there are many studies aimed at the identification of the molecular mechanisms of human immunodeficiency virus type 1 (HIV-1) infection and the development of novel therapeutic approaches against HIV-1, we used HIV-1 for our case study to identify possible molecular pathways shared by SARS-CoV-2 and HIV-1. We applied a text and data mining workflow and identified a list of 46 targets, which can be essential for the development of infections caused by SARS-CoV-2 and HIV-1. We show that SARS-CoV-2 and HIV-1 share some molecular pathways involved in inflammation, immune response, and cell cycle regulation.


Subject(s)
Coronavirus Infections/epidemiology , Coronavirus Infections/metabolism , Data Mining/methods , HIV Infections/epidemiology , HIV Infections/metabolism , Host-Pathogen Interactions/immunology , Pandemics , Pneumonia, Viral/epidemiology , Pneumonia, Viral/metabolism , Anti-Inflammatory Agents/therapeutic use , Antigens, Differentiation/genetics , Antigens, Differentiation/immunology , Antiviral Agents/therapeutic use , Betacoronavirus/drug effects , Betacoronavirus/immunology , Betacoronavirus/pathogenicity , COVID-19 , Complement System Proteins/genetics , Complement System Proteins/immunology , Coronavirus Infections/drug therapy , Coronavirus Infections/immunology , Databases, Genetic , Gene Expression Regulation , HIV Infections/drug therapy , HIV Infections/immunology , HIV-1/drug effects , HIV-1/immunology , HIV-1/pathogenicity , Host-Pathogen Interactions/drug effects , Host-Pathogen Interactions/genetics , Humans , Immunity, Innate/drug effects , Immunologic Factors/therapeutic use , Inflammation , Interferons/genetics , Interferons/immunology , Interleukins/genetics , Interleukins/immunology , Metabolic Networks and Pathways/drug effects , Metabolic Networks and Pathways/genetics , Metabolic Networks and Pathways/immunology , Pneumonia, Viral/drug therapy , Pneumonia, Viral/immunology , Repressor Proteins/genetics , Repressor Proteins/immunology , SARS-CoV-2 , Signal Transduction , Toll-Like Receptors/genetics , Toll-Like Receptors/immunology , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/immunology
4.
Medicine (Baltimore) ; 100(32): e26713, 2021 Aug 13.
Article in English | MEDLINE | ID: covidwho-1358516

ABSTRACT

OBJECTIVE: The aim of this study is to investigate the impact of Coronavirus disease 2019 (COVID-19) on toothache patients through posts on Sina Weibo. METHODS: Using Gooseeker, we searched and screened 24,108 posts about toothache on Weibo during the dental clinical closure period of China (February 1, 2020-February 29, 2020), and then divided them into 4 categories (causes of toothache, treatments of toothache, impacts of COVID-19 on toothache treatment, popular science articles of toothache), including 10 subcategories, to analyze the proportion of posts in each category. RESULTS: There were 12,603 postings closely related to toothache. Among them, 87.6% of posts did not indicate a specific cause of pain, and 92.8% of posts did not clearly indicate a specific method of treatment. There were 38.9% of the posts that clearly showed that their dental treatment of toothache was affected by COVID-19, including 10.5% of the posts in which patients were afraid to see the dentists because of COVID-19, and 28.4% of the posts in which patients were unable to see the dentists because the dental clinic was closed. Only 3.5% of all posts were about popular science of toothache. CONCLUSIONS: We have studied and analyzed social media data about toothache during the COVID-19 epidemic, so as to provide some insights for government organizations, the media and dentists to better guide the public to pay attention to oral health through social media. Research on social media data can help formulate public health policies.


Subject(s)
COVID-19/complications , Social Media/statistics & numerical data , Toothache/complications , COVID-19/epidemiology , COVID-19/psychology , China/epidemiology , Data Mining/methods , Data Mining/statistics & numerical data , Humans , Oral Health/standards , Oral Health/trends , Toothache/epidemiology , Toothache/psychology
5.
Brief Bioinform ; 22(2): 781-799, 2021 03 22.
Article in English | MEDLINE | ID: covidwho-1352111

ABSTRACT

More than 50 000 papers have been published about COVID-19 since the beginning of 2020 and several hundred new papers continue to be published every day. This incredible rate of scientific productivity leads to information overload, making it difficult for researchers, clinicians and public health officials to keep up with the latest findings. Automated text mining techniques for searching, reading and summarizing papers are helpful for addressing information overload. In this review, we describe the many resources that have been introduced to support text mining applications over the COVID-19 literature; specifically, we discuss the corpora, modeling resources, systems and shared tasks that have been introduced for COVID-19. We compile a list of 39 systems that provide functionality such as search, discovery, visualization and summarization over the COVID-19 literature. For each system, we provide a qualitative description and assessment of the system's performance, unique data or user interface features and modeling decisions. Many systems focus on search and discovery, though several systems provide novel features, such as the ability to summarize findings over multiple documents or linking between scientific articles and clinical trials. We also describe the public corpora, models and shared tasks that have been introduced to help reduce repeated effort among community members; some of these resources (especially shared tasks) can provide a basis for comparing the performance of different systems. Finally, we summarize promising results and open challenges for text mining the COVID-19 literature.


Subject(s)
COVID-19/epidemiology , Data Mining/methods , COVID-19/virology , Humans , SARS-CoV-2/isolation & purification
6.
Annu Rev Biomed Data Sci ; 4: 313-339, 2021 07 20.
Article in English | MEDLINE | ID: covidwho-1346098

ABSTRACT

The COVID-19 (coronavirus disease 2019) pandemic has had a significant impact on society, both because of the serious health effects of COVID-19 and because of public health measures implemented to slow its spread. Many of these difficulties are fundamentally information needs; attempts to address these needs have caused an information overload for both researchers and the public. Natural language processing (NLP)-the branch of artificial intelligence that interprets human language-can be applied to address many of the information needs made urgent by the COVID-19 pandemic. This review surveys approximately 150 NLP studies and more than 50 systems and datasets addressing the COVID-19 pandemic. We detail work on four core NLP tasks: information retrieval, named entity recognition, literature-based discovery, and question answering. We also describe work that directly addresses aspects of the pandemic through four additional tasks: topic modeling, sentiment and emotion analysis, caseload forecasting, and misinformation detection. We conclude by discussing observable trends and remaining challenges.


Subject(s)
COVID-19/epidemiology , Information Storage and Retrieval/methods , Natural Language Processing , Communication , Data Mining/methods , Datasets as Topic , Emotions , Humans , Knowledge Discovery , Pandemics , Periodicals as Topic , Software
7.
Comput Math Methods Med ; 2021: 4602465, 2021.
Article in English | MEDLINE | ID: covidwho-1309865

ABSTRACT

Dementia interferes with an individual's motor, behavioural, and intellectual functions, leaving them unable to perform instrumental activities of daily living. This study aimed to identify the best-performing algorithm and the most relevant characteristics for categorising individuals with HIV/AIDS at high risk of dementia through the application of data mining. The principal component analysis (PCA) algorithm was used and tested comparatively with the following machine learning algorithms: logistic regression, decision tree, neural network, KNN, and random forest. The database used for this study was built from data collected on 270 individuals infected with HIV/AIDS and followed up at the outpatient clinic of a reference hospital for infectious and parasitic diseases in the State of Ceará, Brazil, from January to April 2019. The performance of the algorithms was first analysed for the 104 characteristics available in the database; dimensionality reduction then improved the quality of the machine learning algorithms in the tests, even though about 30% of the variation was lost. When considering only 23 characteristics, the precision of the algorithms was 86% in random forest, 56% logistic regression, 68% decision tree, 60% KNN, and 59% neural network. The random forest algorithm proved to be more effective than the others, obtaining 84% precision and 86% accuracy.
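
The abstract above reports precision and accuracy as separate figures, and the two can differ on the same predictions. The following is a minimal sketch of how each is computed from binary labels; the function names and the toy labels are illustrative assumptions and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = high risk."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def precision(y_true, y_pred):
    """Share of predicted positives that are truly positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    tp, _, tn, _ = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


# Hypothetical labels, for illustration only (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
print(precision(y_true, y_pred), accuracy(y_true, y_pred))
```

Precision ignores true negatives entirely, so a classifier can score differently on the two metrics whenever the classes or the error types are unbalanced.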


Subject(s)
AIDS Dementia Complex/diagnosis , Acquired Immunodeficiency Syndrome/complications , Algorithms , Dementia/etiology , AIDS Dementia Complex/epidemiology , AIDS Dementia Complex/etiology , Aged , Brazil/epidemiology , Computational Biology , Data Mining/methods , Data Mining/statistics & numerical data , Databases, Factual , Decision Trees , Female , Follow-Up Studies , Humans , Logistic Models , Machine Learning , Male , Middle Aged , Neural Networks, Computer , Risk Factors
8.
Front Immunol ; 12: 678570, 2021.
Article in English | MEDLINE | ID: covidwho-1295637

ABSTRACT

Passive immunization using monoclonal antibodies will play a vital role in the fight against COVID-19. The recent emergence of viral variants with reduced sensitivity to some current antibodies and vaccines highlights the importance of broad cross-reactivity. This study describes deep-mining of the antibody repertoires of hospitalized COVID-19 patients using phage display technology and B cell receptor (BCR) repertoire sequencing to isolate neutralizing antibodies and gain insights into the early antibody response. This comprehensive discovery approach has yielded a panel of potent neutralizing antibodies which bind distinct viral epitopes including epitopes conserved in SARS-CoV-1. Structural determination of a non-ACE2 receptor blocking antibody reveals a previously undescribed binding epitope, which is unlikely to be affected by the mutations in any of the recently reported major viral variants including B.1.1.7 (from the UK), B.1.351 (from South Africa) and B.1.1.28 (from Brazil). Finally, by combining sequences of the RBD binding and neutralizing antibodies with the B cell receptor repertoire sequencing, we also describe a highly convergent early antibody response. Similar IgM-derived sequences occur within this study group and also within patient responses described by multiple independent studies published previously.


Subject(s)
Antibodies, Monoclonal/therapeutic use , Antibodies, Neutralizing/therapeutic use , COVID-19/prevention & control , COVID-19/therapy , SARS-CoV-2/immunology , Spike Glycoprotein, Coronavirus/immunology , Antibodies, Monoclonal/immunology , Antibodies, Neutralizing/immunology , Antibodies, Viral/immunology , COVID-19/immunology , Cell Surface Display Techniques/methods , Data Mining/methods , Epitopes/immunology , Humans , Immunization, Passive/methods
9.
Eur J Immunol ; 51(8): 1992-2005, 2021 08.
Article in English | MEDLINE | ID: covidwho-1251932

ABSTRACT

The phenotype of infused cells is a major determinant of adoptive T-cell therapy (ACT) efficacy. Yet, the difficulty in deciphering multiparametric cytometry data has limited the fine characterization of cellular products. To allow the analysis of dynamic and complex flow cytometry samples, we developed cytoChain, a novel dataset mining tool and a new analytical workflow. CytoChain was challenged to compare state-of-the-art and innovative culture conditions to generate stem-like memory cells (TSCM) suitable for ACT. Notably, the combination of IL-7/15 and superoxide scavenging sustained the emergence of a previously unidentified nonexhausted Fit-TSCM signature, overlooked by manual gating and endowed with superior expansion potential. CytoChain proficiently traced back this population in independent datasets, and in T-cell receptor engineered lymphocytes. CytoChain flexibility and function were then further validated on a published dataset from circulating T cells in COVID-19 patients. Collectively, our results support the use of cytoChain to identify novel, functionally critical immunophenotypes for ACT and patient immunomonitoring.


Subject(s)
Data Mining/methods , Flow Cytometry/methods , Receptors, Antigen, T-Cell/immunology , Receptors, Chimeric Antigen/immunology , T-Lymphocytes/immunology , T-Lymphocytes/metabolism , COVID-19/blood , COVID-19/immunology , Cytokines/metabolism , Genetic Engineering , Humans , Immunologic Memory , Immunophenotyping , Immunotherapy, Adoptive , Receptors, Antigen, T-Cell/genetics , Receptors, Antigen, T-Cell/metabolism , Receptors, Chimeric Antigen/genetics , SARS-CoV-2/immunology
10.
Brief Funct Genomics ; 20(3): 181-195, 2021 06 09.
Article in English | MEDLINE | ID: covidwho-1246686

ABSTRACT

With the development of high-throughput sequencing technology, biological sequence data reflecting life information have become increasingly accessible. Particularly against the background of the COVID-19 pandemic, biological sequence data play an important role in detecting diseases, analyzing mechanisms and discovering specific drugs. In recent years, pretraining models that have emerged in natural language processing have attracted widespread attention in many research fields, not only for decreasing training cost but also for improving performance on downstream tasks. Pretraining models are used for embedding biological sequences and extracting features from large biological sequence corpora to comprehensively understand the biological sequence data. In this survey, we provide a broad review of pretraining models for biological sequence data. We first introduce biological sequences and corresponding datasets, including a brief description and an accessible link. Subsequently, we systematically summarize popular pretraining models for biological sequences based on four categories: CNN, word2vec, LSTM and Transformer. Then, we present some applications of the proposed pretraining models on downstream tasks to explain the role of pretraining models. Next, we provide a novel pretraining scheme for protein sequences and a multitask benchmark for protein pretraining models. Finally, we discuss the challenges and future directions in pretraining models for biological sequences.
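
Before a word2vec-style or Transformer model can be pretrained on biological sequences, the sequences have to be turned into tokens. One common choice, assumed here for illustration and not necessarily the scheme used in the survey, is to split each sequence into overlapping k-mers and treat them as "words". The function names and the example sequences below are hypothetical.

```python
from collections import Counter


def kmer_tokens(sequence, k=3, stride=1):
    """Split a sequence into k-mers taken every `stride` positions."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(sequences, k=3):
    """Map each k-mer to an integer id, most frequent first."""
    counts = Counter()
    for seq in sequences:
        counts.update(kmer_tokens(seq, k))
    # Ties in frequency are broken alphabetically so ids are reproducible.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {tok: idx for idx, tok in enumerate(ordered)}


# Toy protein fragments, for illustration only.
vocab = build_vocab(["MKTMKT", "MKTAY"])
encoded = [vocab[tok] for tok in kmer_tokens("MKTAY")]
print(encoded)
```

The resulting integer ids are what an embedding layer would consume; the choice of k and stride trades vocabulary size against sequence length.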


Subject(s)
Algorithms , Computational Biology/methods , Data Mining/methods , High-Throughput Nucleotide Sequencing/methods , Natural Language Processing , Software , Datasets as Topic , Deep Learning , Humans , Models, Theoretical
11.
J Med Internet Res ; 23(6): e28253, 2021 06 02.
Article in English | MEDLINE | ID: covidwho-1202100

ABSTRACT

BACKGROUND: Before the advent of an effective vaccine, nonpharmaceutical interventions, such as mask-wearing, social distancing, and lockdowns, have been the primary measures to combat the COVID-19 pandemic. Such measures are highly effective when there is high population-wide adherence, which requires information on current risks posed by the pandemic alongside a clear exposition of the rules and guidelines in place. OBJECTIVE: Here we analyzed online news media coverage of COVID-19. We quantified the total volume of COVID-19 articles, their sentiment polarization, and leading subtopics to act as a reference to inform future communication strategies. METHODS: We collected 26 million news articles from the front pages of 172 major online news sources in 11 countries (available online at SciRide). Using topic detection, we identified COVID-19-related content to quantify the proportion of total coverage the pandemic received in 2020. The sentiment analysis tool Vader was employed to stratify the emotional polarity of COVID-19 reporting. Further topic detection and sentiment analysis were performed on COVID-19 coverage to reveal the leading themes in pandemic reporting and their respective emotional polarizations. RESULTS: We found that COVID-19 coverage accounted for approximately 25.3% of all front-page online news articles between January and October 2020. Sentiment analysis of English-language sources revealed that overall COVID-19 coverage was not exclusively negatively polarized, suggesting widely heterogeneous reporting of the pandemic. Within this heterogeneous coverage, 16% of COVID-19 news articles (or 4% of all English-language articles) can be classified as highly negatively polarized, citing issues such as death, fear, or crisis. CONCLUSIONS: The goal of COVID-19 public health communication is to increase understanding of distancing rules and to maximize the impact of governmental policy. 
The extent to which the quantity and quality of information from different communication channels (eg, social media, government pages, and news) influence public understanding of public health measures remains to be established. Here we conclude that a quarter of all reporting in 2020 covered COVID-19, which is indicative of information overload. In this capacity, our data and analysis form a quantitative basis for informing health communication strategies along traditional news media channels to minimize the risks of COVID-19 while vaccination is rolled out.
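
The two percentages quoted for highly negative articles (16% of COVID-19 articles, 4% of all English-language articles) imply a COVID-19 share of English-language coverage, which can be compared with the reported overall figure of about 25.3%. The snippet below is only an arithmetic consistency check on the rounded numbers in the abstract; the function name is a hypothetical helper.

```python
def implied_subset_share(rate_within_subset, rate_within_total):
    """If a fraction `rate_within_subset` of a subset equals a fraction
    `rate_within_total` of the whole, return the subset's share of the whole."""
    return rate_within_total / rate_within_subset


# Rounded figures taken from the abstract.
negative_among_covid = 0.16   # highly negative share of COVID-19 articles
negative_among_all = 0.04     # the same articles as a share of all English-language articles

covid_share = implied_subset_share(negative_among_covid, negative_among_all)
print(f"Implied COVID-19 share of English-language coverage: {covid_share:.0%}")
```

The implied share of roughly one quarter agrees with the abstract's conclusion that about a quarter of reporting covered COVID-19, bearing in mind that both inputs are rounded and that the 25.3% figure covers all 11 countries rather than English-language sources alone.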


Subject(s)
COVID-19/epidemiology , Data Mining/methods , Mass Media/statistics & numerical data , Public Health/methods , Social Media/statistics & numerical data , Health Resources , Humans , Pandemics , SARS-CoV-2/isolation & purification
12.
J Gerontol B Psychol Sci Soc Sci ; 76(9): 1808-1816, 2021 10 30.
Article in English | MEDLINE | ID: covidwho-1160335

ABSTRACT

OBJECTIVES: Older adults experience higher risks of getting severely ill from coronavirus disease 2019 (COVID-19), resulting in widespread narratives of frailty and vulnerability. We (a) test whether global aging narratives have become more negative from before to during the pandemic (October 2019 to May 2020) across 20 countries; and (b) model pandemic (incidence and mortality) and cultural factors associated with the trajectory of aging narratives. METHODS: We leveraged a 10-billion-word online-media corpus, consisting of 28 million newspaper and magazine articles across 20 countries, to identify nine common synonyms of "older adults" and compiled their most frequently used descriptors (collocates) from October 2019 to May 2020, culminating in 11,504 collocates that were rated to create a Cumulative Aging Narrative Score per month. Widely used cultural dimension scores were taken from Hofstede, and pandemic variables from the Oxford COVID-19 Government Response Tracker. RESULTS: Aging narratives became more negative as the pandemic worsened across 20 countries. Globally, scores were trending neutral from October 2019 to February 2020, and plummeted in March 2020, reflecting COVID-19's severity. Prepandemic (October 2019), the United Kingdom evidenced the most negative aging narratives; peak pandemic (May 2020), South Africa took on the dubious honor. Across the 8-month period, the Philippines experienced the steepest trend toward negativity in aging narratives. Ageism, during the pandemic, was, ironically, not predicted by COVID-19's incidence and mortality rates, but by cultural variables: Individualism, Masculinity, Uncertainty Avoidance, and Long-term Orientation. DISCUSSION: The strategy to reverse this trajectory lies in the same phenomenon that promoted it: a sustained global campaign, though it should be culturally nuanced and customized to a country's context.
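
The METHODS above rest on collocates, that is, the words that most often appear near a target term. The sketch below shows one simple way to count them with a fixed word window; the window size, the single-word target terms and the sample text are assumptions made for illustration, and the study's own corpus tooling and rating procedure are not reproduced here.

```python
import re
from collections import Counter


def collocates(text, targets, window=2):
    """Count words occurring within `window` tokens of any target term."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok not in targets:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            # Skip the target itself and any other target term nearby.
            if j != i and tokens[j] not in targets:
                counts[tokens[j]] += 1
    return counts


# Hypothetical sample sentence and target terms, not from the corpus.
sample = ("The frail elderly were isolated. "
          "Vulnerable seniors and frail elderly stayed home.")
counts = collocates(sample, targets={"elderly", "seniors"}, window=1)
print(counts.most_common(3))
```

In a real pipeline the counts would be aggregated per month and per country, and function words such as "and" or "were" would typically be filtered out before any rating step.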


Subject(s)
Ageism , Aging , COVID-19 , Cultural Deprivation , Narrative Medicine , Social Perception , Aged , Ageism/ethnology , Ageism/prevention & control , Ageism/psychology , Ageism/trends , Aging/ethics , Aging/psychology , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/psychology , Data Mining/methods , Data Mining/statistics & numerical data , Global Health , Health Status Disparities , Humans , Incidence , Narrative Medicine/ethics , Narrative Medicine/methods , Narrative Medicine/trends , Psychology , SARS-CoV-2
13.
Sci Rep ; 11(1): 6725, 2021 03 24.
Article in English | MEDLINE | ID: covidwho-1149749

ABSTRACT

The recent global pandemic of the Coronavirus disease 2019 (COVID-19) caused by the new coronavirus SARS-CoV-2 presents an urgent need for the development of new therapeutic candidates. Many efforts have been devoted to screening existing drug libraries with the hope to repurpose approved drugs as potential treatments for COVID-19. However, the antiviral mechanisms of action of the drugs found active in these phenotypic screens remain largely unknown. In an effort to deconvolute the viral targets in pursuit of more effective anti-COVID-19 drug development, we mined our in-house database of approved drug screens against 994 assays and compared their activity profiles with the drug activity profile in a cytopathic effect (CPE) assay of SARS-CoV-2. We found that the autophagy and AP-1 signaling pathway activity profiles are significantly correlated with the anti-SARS-CoV-2 activity profile. In addition, a class of neurology/psychiatry drugs was found to be significantly enriched with anti-SARS-CoV-2 activity. Taken together, these results provide new insights into SARS-CoV-2 infection and potential targets for COVID-19 therapeutics, which can be further validated by in vivo animal studies and human clinical trials.
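
Comparing activity profiles, as described above, amounts to asking how closely the per-drug activities in one assay track those in a reference assay. The abstract does not state which similarity measure was used; the sketch below assumes a plain Pearson correlation, and the assay names and values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def rank_assays(reference, assay_profiles):
    """Sort assays by correlation with the reference profile, highest first."""
    scores = {name: pearson(reference, profile)
              for name, profile in assay_profiles.items()}
    return sorted(scores.items(), key=lambda item: -item[1])


# Hypothetical activities of the same five drugs in three assays.
cpe_profile = [0.9, 0.1, 0.7, 0.2, 0.8]
assays = {
    "assay_A": [0.8, 0.2, 0.6, 0.1, 0.9],
    "assay_B": [0.1, 0.9, 0.2, 0.8, 0.3],
}
print(rank_assays(cpe_profile, assays))
```

With hundreds of assays, a significance threshold corrected for multiple comparisons would be needed before calling any correlation meaningful.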


Subject(s)
COVID-19/drug therapy , COVID-19/metabolism , Data Mining/methods , Transcription Factor AP-1/metabolism , Animals , Antiviral Agents/pharmacology , Autophagy/drug effects , Autophagy/physiology , COVID-19/epidemiology , COVID-19/genetics , Chlorocebus aethiops , Databases, Genetic , Drug Approval , Drug Evaluation, Preclinical/methods , Drug Repositioning/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Molecular Targeted Therapy , Pandemics , SARS-CoV-2/isolation & purification , Vero Cells
14.
Sci Rep ; 11(1): 6811, 2021 03 24.
Article in English | MEDLINE | ID: covidwho-1149746

ABSTRACT

A high rate of cardiovascular disease (CVD) has been reported among patients with coronavirus disease 2019 (COVID-19). Importantly, CVD, as one of the comorbidities, could also increase the risk of severe COVID-19. Here we identified phospholipase A2 group VII (PLA2G7), a well-studied CVD biomarker, as a hub gene in COVID-19 through an integrated hypothesis-free genomic analysis on nasal swabs (n = 486) from patients with COVID-19. PLA2G7 was further found to be predominantly expressed by proinflammatory macrophages in lungs emerging with progression of COVID-19. In the validation stage, PLA2G7 RNA was detected in nasal swabs from both COVID-19 and pneumonia patients, but not from healthy individuals. The positive rate of PLA2G7 was correlated with not only viral loads but also severity of pneumonia in non-COVID-19 patients. Serum protein levels of PLA2G7 were found to be elevated and beyond the normal limit in COVID-19 patients, especially among re-positive patients. We identified and validated that PLA2G7, a biomarker for CVD, was abnormally elevated in COVID-19 at both the nucleotide and protein levels. These findings provide insight into the prevalence of cardiovascular involvement seen in patients with COVID-19. PLA2G7 could be a potential prognostic and therapeutic target in COVID-19.


Subject(s)
1-Alkyl-2-acetylglycerophosphocholine Esterase/metabolism , COVID-19/metabolism , Cardiovascular Diseases/metabolism , Macrophages/metabolism , 1-Alkyl-2-acetylglycerophosphocholine Esterase/blood , 1-Alkyl-2-acetylglycerophosphocholine Esterase/genetics , Biomarkers/metabolism , COVID-19/epidemiology , COVID-19/immunology , COVID-19/pathology , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/virology , China/epidemiology , Data Mining/methods , Humans , Macrophages/immunology , Macrophages/pathology , Polymorphism, Single Nucleotide , SARS-CoV-2/isolation & purification , Transcriptional Activation , Up-Regulation
15.
Nucleic Acids Res ; 49(D1): D1113-D1121, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-1139997

ABSTRACT

The recent outbreak of COVID-19 has generated an enormous amount of Big Data. To date, the COVID-19 Open Research Dataset (CORD-19) lists ∼130,000 articles from the WHO COVID-19 database, PubMed Central, medRxiv, and bioRxiv, as collected by Semantic Scholar. According to LitCovid (11 August 2020), ∼40,300 COVID19-related articles are currently listed in PubMed. It has been shown in clinical settings that the analysis of past research results and the mining of available data can provide novel opportunities for the successful application of currently approved therapeutics and their combinations for the treatment of conditions caused by a novel SARS-CoV-2 infection. As such, effective responses to the pandemic require the development of efficient applications, methods and algorithms for data navigation, text-mining, clustering, classification, analysis, and reasoning. Thus, our COVID19 Drug Repository represents a modular platform for drug data navigation and analysis, with an emphasis on COVID-19-related information currently being reported. The COVID19 Drug Repository enables users to focus on different levels of complexity, starting from general information about (FDA-) approved drugs, PubMed references, clinical trials, and recipes, as well as descriptions of the molecular mechanisms of drug action. Our COVID19 Drug Repository provides an up-to-date worldwide collection of drugs that have been repurposed for COVID19 treatments around the world.


Subject(s)
Antiviral Agents/therapeutic use , COVID-19/drug therapy , Databases, Pharmaceutical/statistics & numerical data , Drug Repositioning/statistics & numerical data , SARS-CoV-2/drug effects , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/virology , Clinical Trials as Topic/methods , Clinical Trials as Topic/statistics & numerical data , Data Mining/methods , Data Mining/statistics & numerical data , Drug Approval/statistics & numerical data , Drug Repositioning/methods , Epidemics , Humans , Machine Learning , SARS-CoV-2/physiology
16.
Sci Rep ; 11(1): 5322, 2021 03 05.
Article in English | MEDLINE | ID: covidwho-1118817

ABSTRACT

The COVID-19 pandemic has devastated the world with health and economic wreckage. Precise estimates of adverse outcomes from COVID-19 could have led to better allocation of healthcare resources and more efficient targeted preventive measures, including insight into how best to prioritize the distribution of a vaccine. We developed MLHO (pronounced as melo), an end-to-end Machine Learning framework that leverages iterative feature and algorithm selection to predict Health Outcomes. MLHO implements iterative sequential representation mining, and feature and model selection, for predicting patient-level risk of hospitalization, ICU admission, need for mechanical ventilation, and death. It bases this prediction on data from patients' past medical records (before their COVID-19 infection). MLHO's architecture enables a parallel and outcome-oriented model calibration, in which different statistical learning algorithms and vectors of features are simultaneously tested to improve prediction of health outcomes. Using clinical and demographic data from a large cohort of over 13,000 COVID-19-positive patients, we modeled the four adverse outcomes utilizing about 600 features representing patients' pre-COVID health records and demographics. The mean AUC ROC for mortality prediction was 0.91, while the prediction performance ranged between 0.80 and 0.81 for ICU admission, hospitalization, and ventilation. We broadly describe the clusters of features that were utilized in modeling and their relative influence for predicting each outcome. Our results demonstrated that while demographic variables (namely age) are important predictors of adverse outcomes after a COVID-19 infection, the incorporation of past clinical records is vital for a reliable prediction model. 
As the COVID-19 pandemic unfolds around the world, adaptable and interpretable machine learning frameworks (like MLHO) are crucial to improve our readiness for confronting the potential future waves of COVID-19, as well as other novel infectious diseases that may emerge.
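
The AUC ROC figures reported above can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The following is a minimal sketch of that pairwise definition in plain Python; the function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not taken from MLHO or from the study data.

```python
def roc_auc(labels, scores):
    """Illustrative AUC: fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = outcome occurred, 0 = it did not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for illustration; for a cohort of thousands of patients a rank-based computation would be the more practical choice.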


Subject(s)
COVID-19/mortality , Data Mining/methods , Machine Learning , Models, Statistical , Adult , Age Factors , Aged , Aged, 80 and over , COVID-19/diagnosis , COVID-19/therapy , COVID-19/virology , Electronic Health Records/statistics & numerical data , Female , Hospitalization/statistics & numerical data , Humans , Intensive Care Units/statistics & numerical data , Male , Middle Aged , Pandemics/statistics & numerical data , Prognosis , ROC Curve , Reproducibility of Results , Respiration, Artificial/statistics & numerical data , Retrospective Studies , Risk Assessment/methods , Risk Factors , SARS-CoV-2/isolation & purification , SARS-CoV-2/pathogenicity
17.
Nucleic Acids Res ; 49(D1): D1152-D1159, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-1117392

ABSTRACT

The current state of the COVID-19 pandemic is a global health crisis. One of the best-known ways to fight the novel coronavirus is to block enzymes essential for virus replication. Currently, we know that the SARS-CoV-2 virus encodes about 29 proteins, including the spike protein, 3C-like protease (3CLpro), RNA-dependent RNA polymerase (RdRp), papain-like protease (PLpro), and nucleocapsid (N) protein. SARS-CoV-2 uses human angiotensin-converting enzyme 2 (ACE2) for viral entry and transmembrane serine protease family member II (TMPRSS2) for spike protein priming. Thus, in order to speed up the discovery of potential drugs, we developed DockCoV2, a drug database for SARS-CoV-2. DockCoV2 focuses on predicting the binding affinity of FDA-approved and Taiwan National Health Insurance (NHI) drugs with the seven proteins mentioned above. This database contains a total of 3,109 drugs. DockCoV2 is easy to use and search against, is well cross-linked to external databases, and provides state-of-the-art prediction results in one site. Users can download their drug-protein docking data of interest and examine additional drug-related information on DockCoV2. Furthermore, DockCoV2 provides experimental information to help users understand which drugs have already been reported to be effective against MERS or SARS-CoV. DockCoV2 is available at https://covirus.cc/drugs/.


Subject(s)
Antiviral Agents/therapeutic use , COVID-19/drug therapy , Databases, Pharmaceutical/statistics & numerical data , SARS-CoV-2/drug effects , Antiviral Agents/metabolism , COVID-19/epidemiology , COVID-19/virology , Data Curation/methods , Data Mining/methods , Humans , Internet , Models, Molecular , Pandemics , Protein Binding/drug effects , Protein Domains , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Viral Proteins/chemistry , Viral Proteins/metabolism , Virus Replication/drug effects
18.
Nucleic Acids Res ; 49(D1): D1534-D1540, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-1117391

ABSTRACT

Since the outbreak of the current pandemic in 2020, there has been a rapid growth of published articles on COVID-19 and SARS-CoV-2, with about 10,000 new articles added each month. This is causing an increasingly serious information overload, making it difficult for scientists, healthcare professionals and the general public to remain up to date on the latest SARS-CoV-2 and COVID-19 research. Hence, we developed LitCovid (https://www.ncbi.nlm.nih.gov/research/coronavirus/), a curated literature hub, to track up-to-date scientific information in PubMed. LitCovid is updated daily with newly identified relevant articles organized into curated categories. To support manual curation, advanced machine-learning and deep-learning algorithms have been developed, evaluated and integrated into the curation workflow. To the best of our knowledge, LitCovid is the first-of-its-kind COVID-19-specific literature resource, with all of its collected articles and curated data freely available. Since its release, LitCovid has been widely used, with millions of accesses by users worldwide for various information needs, such as evidence synthesis, drug discovery and text and data mining, among others.


Subject(s)
COVID-19/prevention & control , Data Curation/statistics & numerical data , Data Mining/statistics & numerical data , Databases, Factual , PubMed/statistics & numerical data , SARS-CoV-2/isolation & purification , COVID-19/epidemiology , COVID-19/virology , Data Curation/methods , Data Mining/methods , Humans , Internet , Machine Learning , Pandemics , Publications/statistics & numerical data , SARS-CoV-2/physiology
19.
PLoS One ; 16(3): e0247995, 2021.
Article in English | MEDLINE | ID: covidwho-1115307

ABSTRACT

BACKGROUND: Primary care is the major point of access in most health systems in developed countries and therefore for the detection of coronavirus disease 2019 (COVID-19) cases. The quality of its IT systems, together with access to the results of mass screening with polymerase chain reaction (PCR) tests, makes it possible to analyse the impact of various concurrent factors on the likelihood of contracting the disease. METHODS AND FINDINGS: Through data mining techniques with the sociodemographic and clinical variables recorded in patients' medical histories, a decision tree-based logistic regression model has been proposed which analyses the significance of demographic and clinical variables in the probability of having a positive PCR in a sample of 7,314 individuals treated in the Primary Care service of the public health system of Catalonia. The statistical approach to decision tree modelling allows 66.2% of diagnoses of infection by COVID-19 to be classified with a sensitivity of 64.3% and a specificity of 62.5%, with prior contact with a positive case being the primary predictor variable. CONCLUSIONS: The use of a classification tree model may be useful in screening for COVID-19 infection. Contact detection is the most reliable variable for detecting severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) cases. The model supports the view that, beyond a symptomatic diagnosis, the best way to detect cases would be to engage in contact tracing.
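
For readers less familiar with the metrics quoted above, sensitivity and specificity can be computed directly from the four cells of a confusion matrix. The sketch below is illustrative only: the function name `screening_metrics` and the counts are hypothetical and are not the study's data.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp: positive PCR, predicted positive   fn: positive PCR, predicted negative
    tn: negative PCR, predicted negative   fp: negative PCR, predicted positive
    """
    sensitivity = tp / (tp + fn)   # share of true cases the model detects
    specificity = tn / (tn + fp)   # share of non-cases correctly ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen for easy arithmetic, not taken from the article.
sens, spec, acc = screening_metrics(tp=40, fn=10, tn=45, fp=5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```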


Subject(s)
COVID-19/diagnosis , COVID-19/transmission , Disease Transmission, Infectious/statistics & numerical data , Adult , Aged , COVID-19/epidemiology , Cohort Studies , Contact Tracing , Data Mining/methods , Decision Trees , Female , Humans , Male , Mass Screening/methods , Middle Aged , Probability , Retrospective Studies , SARS-CoV-2/pathogenicity , Sensitivity and Specificity
20.
J Med Internet Res ; 23(3): e26482, 2021 03 05.
Article in English | MEDLINE | ID: covidwho-1094119

ABSTRACT

BACKGROUND: Since the beginning of the COVID-19 pandemic in late 2019, its far-reaching impacts have been witnessed globally across all aspects of human life, such as health, economy, politics, and education. Such widely penetrating impacts cast significant and profound burdens on all population groups, incurring varied concerns and sentiments among them. OBJECTIVE: This study aims to identify the concerns, sentiments, and disparities of various population groups during the COVID-19 pandemic through a cross-sectional study conducted via large-scale Twitter data mining infoveillance. METHODS: This study consisted of three steps: first, tweets posted during the pandemic were collected and preprocessed on a large scale; second, the key population attributes, concerns, sentiments, and emotions were extracted via a collection of natural language processing procedures; third, multiple analyses were conducted to reveal concerns, sentiments, and disparities among population groups during the pandemic. Overall, this study implemented a quick, effective, and economical approach for analyzing population-level disparities during a public health event. The source code developed in this study was released for free public use at GitHub. RESULTS: A total of 1,015,655 original English tweets posted from August 7 to 12, 2020, were acquired and analyzed to obtain the following results. Organizations were significantly more concerned about COVID-19 (odds ratio [OR] 3.48, 95% CI 3.39-3.58) and expressed more fear and depression emotions than individuals. Females were less concerned about COVID-19 (OR 0.73, 95% CI 0.71-0.75) and expressed less fear and depression emotions than males. Among all age groups (ie, ≤18, 19-29, 30-39, and ≥40 years of age), the attention ORs of COVID-19 fear and depression increased significantly with age. It is worth noting that not all females paid less attention to COVID-19 than males. 
In the age group of 40 years or older, females were more concerned than males, especially regarding the economic and education topics. In addition, males 40 years or older and 18 years or younger were the least positive. Lastly, in all sentiment analyses, the sentiment polarities regarding political topics were always the lowest among the five topics of concern across all population groups. CONCLUSIONS: Through large-scale Twitter data mining, this study revealed that meaningful differences regarding concerns and sentiments about COVID-19-related topics existed among population groups during the study period. Therefore, specialized and varied attention and support are needed for different population groups. In addition, the efficient analysis method implemented by our publicly released code can be utilized to dynamically track the evolution of each population group during the pandemic or any other major event for better informed public health research and interventions.
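
The odds ratios and 95% confidence intervals reported above compare how often two groups mention a topic. The abstract does not state how the intervals were derived; the sketch below assumes the common Wald (log-normal) approximation on a 2x2 table, and the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval (assumed method).

    a: group 1, concerned       b: group 1, not concerned
    c: group 2, concerned       d: group 2, not concerned
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical tweet counts, not taken from the study.
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1 indicates a difference between the groups at roughly the 5% level; with the very large tweet counts described in the abstract, such intervals become narrow, which is consistent with the tight ranges it reports.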


Subject(s)
COVID-19/epidemiology , Data Mining/methods , Social Media/supply & distribution , Adolescent , Adult , COVID-19/psychology , Cross-Sectional Studies , Female , Humans , Male , Pandemics , Population Groups , SARS-CoV-2/isolation & purification , Sex Factors , Young Adult