Results 1 - 20 of 218
1.
PLoS One ; 16(12): e0262056, 2021.
Article in English | MEDLINE | ID: covidwho-1596737

ABSTRACT

Characterization of protein complexes, i.e. sets of proteins assembling into a single larger physical entity, is important, as such assemblies play many essential roles in cells such as gene regulation. From networks of protein-protein interactions, potential protein complexes can be identified computationally through the application of community detection methods, which flag groups of entities interacting with each other in certain patterns. Most community detection algorithms tend to be unsupervised and assume that communities are dense network subgraphs, which is not always true, as protein complexes can exhibit diverse network topologies. The few existing supervised machine learning methods are serial and can potentially be improved in terms of accuracy and scalability by using better-suited machine learning models and parallel algorithms. Here, we present Super.Complex, a distributed, supervised AutoML-based pipeline for overlapping community detection in weighted networks. We also propose three new evaluation measures for the outstanding issue of comparing sets of learned and known communities satisfactorily. Super.Complex learns a community fitness function from known communities using an AutoML method and applies this fitness function to detect new communities. A heuristic local search algorithm finds maximally scoring communities, and a parallel implementation can be run on a computer cluster for scaling to large networks. On a yeast protein-interaction network, Super.Complex outperforms 6 other supervised and 4 unsupervised methods. Application of Super.Complex to a human protein-interaction network with ~8k nodes and ~60k edges yields 1,028 protein complexes, with 234 complexes linked to SARS-CoV-2, the COVID-19 virus, with 111 uncharacterized proteins present in 103 learned complexes. Super.Complex is generalizable with the ability to improve results by incorporating domain-specific features. 
Learned community characteristics can also be transferred from existing applications to detect communities in a new application with no known communities. Code and interactive visualizations of learned human protein complexes are freely available at: https://sites.google.com/view/supercomplex/super-complex-v3-0.
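
The local search step described in this abstract can be illustrated with a minimal sketch. It assumes a hand-written fitness function (mean edge weight over all node pairs) in place of the learned AutoML model, and a plain greedy growth heuristic. The names `density_fitness` and `grow_community` and the toy network are hypothetical and are not taken from Super.Complex.

```python
from itertools import combinations

def density_fitness(nodes, weights):
    """Stand-in fitness: mean edge weight over all node pairs (missing edges count as 0)."""
    if len(nodes) < 2:
        return 0.0
    pairs = list(combinations(sorted(nodes), 2))
    total = sum(weights.get(frozenset(p), 0.0) for p in pairs)
    return total / len(pairs)

def grow_community(seed_edge, weights, fitness=density_fitness):
    """Greedily add the neighbor that most improves fitness; stop when none does."""
    community = set(seed_edge)
    score = fitness(community, weights)
    while True:
        # Candidate nodes: any node sharing an edge with the current community.
        neighbors = {n for e in weights for n in e if e & community} - community
        best_node, best_score = None, score
        for n in sorted(neighbors):
            s = fitness(community | {n}, weights)
            if s > best_score:
                best_node, best_score = n, s
        if best_node is None:
            return community, score
        community.add(best_node)
        score = best_score

# Hypothetical weighted network: edges are keyed by frozenset of their endpoints.
weights = {
    frozenset(("A", "B")): 0.6,
    frozenset(("A", "C")): 0.9,
    frozenset(("B", "C")): 0.9,
    frozenset(("C", "D")): 0.1,
}
community, score = grow_community(("A", "B"), weights)
print(sorted(community), round(score, 3))
```

Because each seed edge is grown independently, different seeds could be processed in parallel, which is one plausible reading of how such a search scales to large networks.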


Subject(s)
Computational Biology/methods , Protein Interaction Maps , Proteins/immunology , Supervised Machine Learning , Viral Proteins/immunology , COVID-19/immunology , Humans , Protein Binding , Protein Interaction Mapping , SARS-CoV-2/immunology
2.
Mol Med ; 27(1): 161, 2021 12 20.
Article in English | MEDLINE | ID: covidwho-1582119

ABSTRACT

BACKGROUND: Similarities in the hijacking mechanisms used by SARS-CoV-2 and several types of cancer suggest the repurposing of cancer drugs to treat COVID-19. CK2 kinase antagonists have been proposed for cancer treatment. A recent study in cells infected with SARS-CoV-2 found a significant CK2 kinase activity, and the use of a CK2 inhibitor showed antiviral responses. CIGB-300, originally designed as an anticancer peptide, is an antagonist of CK2 kinase activity that binds to the CK2 phospho-acceptor sites. Recent preliminary results show the antiviral activity of CIGB-300 using a surrogate model of coronavirus. Here we present a computational biology study that provides evidence, at the molecular level, of how CIGB-300 may interfere with the SARS-CoV-2 life cycle within infected human cells. METHODS: Sequence analyses and data from phosphorylation studies were combined to predict infection-induced molecular mechanisms that CIGB-300 can interfere with. Next, we integrated data from multi-omics studies and data focusing on the antagonistic effect on the CK2 kinase activity of CIGB-300. A combination of network and functional enrichment analyses was used. RESULTS: Firstly, from the SARS-CoV studies, we inferred the potential incidence of CIGB-300 in SARS-CoV-2 interference on the immune response. Afterwards, from the analysis of multiple omics data, we proposed the action of CIGB-300 from the early stages of viral infections perturbing the virus hijacking of RNA splicing machinery. We also predicted the interference of CIGB-300 in virus-host interactions that are responsible for the high infectivity and the particular immune response to SARS-CoV-2 infection. Furthermore, we provided evidence of how CIGB-300 may participate in the attenuation of phenotypes related to muscle, bleeding, coagulation and respiratory disorders. CONCLUSIONS: Our computational analysis proposes putative molecular mechanisms that support the antiviral activity of CIGB-300.


Subject(s)
COVID-19/metabolism , Computational Biology/methods , Animals , COVID-19/drug therapy , Caco-2 Cells , Chlorocebus aethiops , Humans , Nuclear Pore Complex Proteins/therapeutic use , Peptides, Cyclic/therapeutic use , SARS-CoV-2/drug effects , SARS-CoV-2/pathogenicity , Vero Cells
3.
Eur J Med Res ; 26(1): 146, 2021 Dec 17.
Article in English | MEDLINE | ID: covidwho-1582003

ABSTRACT

BACKGROUND: At the end of 2019, the world witnessed the emergence and ravages of a viral infection induced by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Also known as the coronavirus disease 2019 (COVID-19), it has been identified as a public health emergency of international concern (PHEIC) by the World Health Organization (WHO) because of its severity. METHODS: The gene data of 51 samples were extracted from the GSE150316 and GSE147507 data set and then processed by means of the programming language R, through which the differentially expressed genes (DEGs) that meet the standards were screened. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed on the selected DEGs to understand the functions and approaches of DEGs. The online tool STRING was employed to construct a protein-protein interaction (PPI) network of DEGs and, in turn, to identify hub genes. RESULTS: A total of 52 intersection genes were obtained through DEG identification. Through the GO analysis, we realized that the biological processes (BPs) that have the deepest impact on the human body after SARS-CoV-2 infection are various immune responses. By using STRING to construct a PPI network, 10 hub genes were identified, including IFIH1, DDX58, ISG15, EGR1, OASL, SAMD9, SAMD9L, XAF1, IFITM1, and TNFSF10. CONCLUSION: The results of this study will hopefully provide guidance for future studies on the pathophysiological mechanism of SARS-CoV-2 infection.
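
The hub-gene step can be sketched as a degree ranking over a PPI edge list. This is an assumption: the abstract does not state which hub criterion was applied to the STRING network, and degree is only one common choice. The function name `hub_genes` is hypothetical, and the edges below are toy values for illustration, not STRING data.

```python
def hub_genes(edges, top_n=10):
    """Rank genes by degree (number of distinct interaction partners)."""
    partners = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    # Highest degree first; ties are broken alphabetically for reproducibility.
    ranked = sorted(partners, key=lambda g: (-len(partners[g]), g))
    return ranked[:top_n]

# Toy edge list (hypothetical interactions among genes named in the abstract).
toy_edges = [
    ("ISG15", "IFIH1"),
    ("ISG15", "DDX58"),
    ("ISG15", "OASL"),
    ("IFIH1", "DDX58"),
    ("XAF1", "ISG15"),
]
print(hub_genes(toy_edges, top_n=3))
```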


Subject(s)
COVID-19/genetics , Computational Biology/methods , Gene Expression Regulation/genetics , Lung/pathology , Protein Interaction Maps/genetics , COVID-19/pathology , Databases, Genetic , Gene Expression Profiling , Gene Ontology , Humans , Immunity, Humoral/genetics , Immunity, Humoral/immunology , Lung/virology , Neutrophil Activation/genetics , Neutrophil Activation/immunology , Neutrophils/immunology , SARS-CoV-2 , Transcriptome/genetics
4.
Front Immunol ; 12: 774776, 2021.
Article in English | MEDLINE | ID: covidwho-1581334

ABSTRACT

Both RNA N6-methyladenosine (m6A) modification of SARS-CoV-2 and immune characteristics of the human body have been reported to play an important role in COVID-19, but how the m6A methylation modification of leukocytes responds to the virus infection remains unknown. Based on the RNA-seq of 126 samples from the GEO database, we found a remarkably higher m6A modification level in the blood leukocytes of patients with COVID-19 than in those of patients without COVID-19, and this difference was related to CD4+ T cells. Two clusters were identified by unsupervised clustering; m6A cluster A, characterized by T cell activation, had a better prognosis than m6A cluster B. Elevated metabolism level, blockage of the immune checkpoint, and lower level of m6A score were observed in m6A cluster B. A protective model was constructed based on nine selected genes and it exhibited an excellent predictive value in COVID-19. Further analysis revealed that the protective score was positively correlated with HFD45 and ventilator-free days, and negatively correlated with SOFA score, APACHE-II score, and CRP. Our work systematically depicted a complicated correlation between m6A methylation modification and host lymphocytes in patients infected with SARS-CoV-2 and provided a well-performing model to predict the patients' outcomes.


Subject(s)
Adenosine/analogs & derivatives , COVID-19/immunology , COVID-19/virology , Host-Pathogen Interactions/immunology , Leukocytes/immunology , RNA, Viral/genetics , SARS-CoV-2/physiology , Adenosine/metabolism , Cluster Analysis , Computational Biology/methods , Disease Susceptibility/immunology , Gene Expression Profiling , Humans , Leukocytes/metabolism , RNA, Viral/metabolism , ROC Curve
5.
Molecules ; 27(1)2021 Dec 30.
Article in English | MEDLINE | ID: covidwho-1580564

ABSTRACT

The COVID-19 pandemic has caused millions of fatalities since 2019. Despite the availability of vaccines for this disease, new strains are causing rapid ailment and are a continuous threat to vaccine efficacy. Here, molecular docking and simulations identify strong inhibitors of the allosteric site of the SARS-CoV-2 virus RNA dependent RNA polymerase (RdRp). More than one hundred different flavonoids were docked with the SARS-CoV-2 RdRp allosteric site through computational screening. The three top hits were Naringoside, Myricetin and Aureusidin 4,6-diglucoside. Simulation analyses confirmed that they are in constant contact during the simulation time course and have strong association with the enzyme's allosteric site. Absorption, distribution, metabolism, excretion and toxicity (ADMET) data provided medicinal information of these top three hits. They had good human intestinal absorption (HIA) concentrations and were non-toxic. Due to high mutation rates in the active sites of the viral enzyme, these new allosteric site inhibitors offer opportunities to drug SARS-CoV-2 RdRp. These results provide new information for the design of novel allosteric inhibitors against SARS-CoV-2 RdRp.


Subject(s)
Antiviral Agents/pharmacology , COVID-19/drug therapy , Computational Biology/methods , Coronavirus RNA-Dependent RNA Polymerase/antagonists & inhibitors , Drug Evaluation, Preclinical , Flavonoids/pharmacology , SARS-CoV-2/enzymology , Allosteric Site , COVID-19/virology , Catalytic Domain , Drug Design , Humans , Intestinal Absorption , Molecular Docking Simulation
6.
BMC Med Genomics ; 14(Suppl 6): 289, 2021 12 14.
Article in English | MEDLINE | ID: covidwho-1571758

ABSTRACT

BACKGROUND: Virus screening and viral genome reconstruction are urgent and crucial for the rapid identification of viral pathogens, i.e., tracing the source and understanding the pathogenesis when a viral outbreak occurs. Next-generation sequencing (NGS) provides an efficient and unbiased way to identify viral pathogens in host-associated and environmental samples without prior knowledge. Despite the availability of software, data analysis still requires human operations. A mature pipeline is urgently needed when thousands of samples need to be rapidly screened for viral pathogens and used for viral genome reconstruction. RESULTS: In this paper, we present a rapid and accurate workflow to screen metagenomics sequencing data for viral pathogens and other compositions, as well as enable a reference-based assembler to reconstruct viral genomes. Moreover, we tested our workflow on several metagenomics datasets, including a SARS-CoV-2 patient sample with NGS data, pangolin tissues with NGS data, Middle East Respiratory Syndrome (MERS)-infected cells with NGS data, etc. Our workflow demonstrated high accuracy and efficiency when identifying target viruses from large scale NGS metagenomics data. Our workflow was flexible when working with a broad range of NGS datasets from small (kb) to large (100 Gb). Each task took from a few minutes to a few hours to complete. At the same time, our workflow automatically generates reports that incorporate visualized feedback (e.g., metagenomics data quality statistics, host and viral sequence compositions, details about each of the identified viral pathogens and their coverages, and reassembled viral pathogen sequences based on their closest references). CONCLUSIONS: Overall, our system enabled the rapid screening and identification of viral pathogens from metagenomics data, providing an important piece to support viral pathogen research during a pandemic. The visualized report contains information from raw sequence quality to a reconstructed viral sequence, which allows non-professional people to screen their samples for viruses by themselves (Additional file 1).


Subject(s)
COVID-19 Testing/methods , COVID-19/diagnosis , Computational Biology/methods , Genome, Viral , Genomics , Metagenomics , SARS-CoV-2/genetics , Algorithms , Animals , Automation , Coronavirus Infections/genetics , High-Throughput Nucleotide Sequencing , Humans , Mass Screening/methods , Pandemics , Pangolins , Reference Values , Software , Transcriptome , Workflow
7.
Int J Mol Sci ; 22(24)2021 Dec 07.
Article in English | MEDLINE | ID: covidwho-1554804

ABSTRACT

In the last few years, microRNA-mediated regulation has been shown to be important in viral infections. In fact, viral microRNAs can alter cell physiology and act on the immune system; moreover, cellular microRNAs can regulate the virus cycle, influencing positively or negatively viral replication. Accordingly, microRNAs can represent diagnostic and prognostic biomarkers of infectious processes and a promising approach for designing targeted therapies. In the past 18 months, the COVID-19 infection from SARS-CoV-2 has engaged many researchers in the search for diagnostic and prognostic markers and the development of therapies. Although some research suggests that the SARS-CoV-2 genome can produce microRNAs and that host microRNAs may be involved in the cellular response to the virus, to date, not enough evidence has been provided. In this paper, using a focused bioinformatic approach exploring the SARS-CoV-2 genome, we propose that SARS-CoV-2 is able to produce microRNAs sharing a strong sequence homology with the human ones and also that human microRNAs may target viral RNA regulating the virus life cycle inside human cells. Interestingly, all viral miRNA sequences and some human miRNA target sites are conserved in more recent SARS-CoV-2 variants of concern (VOCs). Even if experimental evidence will be needed, in silico analysis represents a valuable source of information useful to understand the sophisticated molecular mechanisms of disease and to sustain biomedical applications.


Subject(s)
MicroRNAs/genetics , SARS-CoV-2/genetics , Virus Replication/genetics , COVID-19/genetics , Computational Biology/methods , DNA Viruses/genetics , Gene Expression/genetics , Gene Expression Regulation, Viral/genetics , Genome, Viral/genetics , Host-Pathogen Interactions/genetics , RNA, Viral/genetics , Sequence Homology
8.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: covidwho-1528156

ABSTRACT

The low capture rate of expressed RNAs from single-cell sequencing technology is one of the major obstacles to downstream functional genomics analyses. Recently, a number of imputation methods have emerged for single-cell transcriptome data; however, recovering missing values in very sparse expression matrices remains a substantial challenge. Here, we propose a new algorithm, WEDGE (WEighted Decomposition of Gene Expression), to impute gene expression matrices by using a biased low-rank matrix decomposition method. WEDGE successfully recovered expression matrices, reproduced the cell-wise and gene-wise correlations and improved the clustering of cells, performing impressively for applications with sparse datasets. Overall, this study shows a potent approach for imputing sparse expression matrix data, and our WEDGE algorithm should help many researchers to more profitably explore the biological meanings embedded in their single-cell RNA sequencing datasets. The source code of WEDGE has been released at https://github.com/QuKunLab/WEDGE.


Subject(s)
Algorithms , Computational Biology/methods , Gene Expression Profiling/methods , RNA-Seq/methods , Single-Cell Analysis/methods , COVID-19/blood , COVID-19/genetics , COVID-19/virology , Cluster Analysis , Computer Simulation , Genomics/methods , Humans , Leukocytes, Mononuclear/classification , Leukocytes, Mononuclear/metabolism , Reproducibility of Results , SARS-CoV-2/physiology , Severity of Illness Index
9.
Biosci Rep ; 41(10)2021 10 29.
Article in English | MEDLINE | ID: covidwho-1510636

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus has become a global health emergency. Although new vaccines have been generated and are being implemented, discovery and application of novel preventive and control measures are warranted. We aimed to identify compounds that may possess the potential to either block the entry of virus to host cells or attenuate its replication upon infection. Using host cell surface receptor expression (angiotensin-converting enzyme 2 (ACE2) and Transmembrane protease serine 2 (TMPRSS2)) analysis as an assay, we earlier screened several synthetic and natural compounds and identified candidates that showed ability to down-regulate their expression. Here, we report experimental and computational analyses of two small molecules, Mortaparib and MortaparibPlus, that were initially identified as dual novel inhibitors of mortalin and PARP-1, for their activity against SARS-CoV-2. In silico analyses showed that MortaparibPlus, but not Mortaparib, stably binds into the catalytic pocket of TMPRSS2. In vitro analysis of control and treated cells revealed that MortaparibPlus caused down-regulation of ACE2 and TMPRSS2; Mortaparib did not show any effect. Furthermore, computational analysis of the SARS-CoV-2 main protease (Mpro) also predicted the inhibitory activity of MortaparibPlus. However, a cell-based antiviral drug screening assay showed 30-60% viral inhibition in cells treated with non-toxic doses of either MortaparibPlus or Mortaparib. The data suggest that these two closely related compounds possess multimodal anti-COVID-19 activities. Whereas MortaparibPlus works through direct interactions/effects on the host cell surface receptors (ACE2 and TMPRSS2) and the virus protein (Mpro), Mortaparib involves independent mechanisms, elucidation of which warrants further studies.


Subject(s)
Antiviral Agents/pharmacology , COVID-19/drug therapy , Computational Biology/methods , Angiotensin-Converting Enzyme 2/immunology , Angiotensin-Converting Enzyme 2/metabolism , Antiviral Agents/immunology , COVID-19/immunology , Cell Line, Tumor , Drug Evaluation, Preclinical/methods , HSP70 Heat-Shock Proteins/antagonists & inhibitors , Humans , Mitochondrial Proteins/antagonists & inhibitors , Poly (ADP-Ribose) Polymerase-1/antagonists & inhibitors , SARS-CoV-2/immunology , Serine Endopeptidases/immunology , Serine Endopeptidases/metabolism , Spike Glycoprotein, Coronavirus/metabolism , Virus Internalization/drug effects
10.
Cell Rep ; 37(7): 110020, 2021 11 16.
Article in English | MEDLINE | ID: covidwho-1509641

ABSTRACT

Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component. Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci (eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1-43 tissues/gene), including lung, brain, heart, muscle, and skin as well as the digestive system and immune system. We perform genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer SNPs in the credible set, including three loci with one single likely causal SNP. Our study suggests that the diverse symptoms and disease severity of COVID-19 observed between individuals are associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types.
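
A 99% credible SNP set is commonly built by ranking the SNPs at a locus by posterior probability of causality and accumulating them until the total reaches 0.99. The sketch below assumes those posteriors are already available; the abstract does not say which fine-mapping method produced them. The function name `credible_set` and the SNP identifiers and probabilities are hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Return the smallest top-ranked set of SNPs whose posteriors sum to >= coverage."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for snp, p in ranked:
        chosen.append(snp)
        total += p
        if total >= coverage:
            break
    return chosen

# Hypothetical posterior probabilities for one locus (they sum to 1).
locus = {"snp_a": 0.95, "snp_b": 0.03, "snp_c": 0.015, "snp_d": 0.005}
print(credible_set(locus))
```

A locus where a single SNP carries nearly all of the posterior mass yields a one-SNP credible set, which corresponds to the "one single likely causal SNP" case mentioned in the abstract.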


Subject(s)
COVID-19/genetics , SARS-CoV-2/genetics , Chromosome Mapping/methods , Computational Biology/methods , Databases, Genetic , Gene Expression/genetics , Gene Expression Profiling/methods , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genome-Wide Association Study/methods , Humans , Organ Specificity/genetics , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , SARS-CoV-2/pathogenicity , Severity of Illness Index , Transcriptome/genetics
11.
Sci Rep ; 11(1): 21084, 2021 10 26.
Article in English | MEDLINE | ID: covidwho-1493213

ABSTRACT

In contrast to the conventional approach of directly comparing genomic sequences using sequence alignment tools, we propose a computational approach that performs comparisons between sequence generators. These sequence generators are learned via a data-driven approach that empirically computes the state machine generating the genomic sequence of interest. As the state machine based generator of the sequence is independent of the sequence length, it provides us with an efficient method to compute the statistical distance between large sets of genomic sequences. Moreover, our technique provides a fast and efficient method to cluster large datasets of genomic sequences, characterize their temporal and spatial evolution in a continuous manner, and gain insight into locality-sensitive information about the sequences, all without any need for alignment. Furthermore, we show that the technique can be used to detect local regions with mutation activity, which can then be applied to aid alignment techniques for the fast discovery of mutations. To demonstrate the efficacy of our technique on real genomic data, we cluster different strains of SARS-CoV-2 viral sequences, characterize their evolution and identify regions of the viral sequence with mutations.
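
The idea of comparing generators rather than sequences can be sketched with a deliberately simple stand-in. The code assumes a first-order Markov chain over the nucleotide alphabet as the "state machine" and a mean absolute difference between transition matrices as the distance; neither choice is stated in the abstract, and the names `transition_matrix` and `generator_distance` are hypothetical.

```python
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"

def transition_matrix(seq, pseudocount=1.0):
    """Estimate first-order Markov transition probabilities P(next | current)."""
    counts = defaultdict(float)
    for cur, nxt in zip(seq, seq[1:]):
        if cur in ALPHABET and nxt in ALPHABET:  # skip ambiguous bases such as N
            counts[(cur, nxt)] += 1.0
    probs = {}
    for cur in ALPHABET:
        row_total = sum(counts[(cur, n)] for n in ALPHABET) + pseudocount * len(ALPHABET)
        for nxt in ALPHABET:
            probs[(cur, nxt)] = (counts[(cur, nxt)] + pseudocount) / row_total
    return probs

def generator_distance(seq_a, seq_b):
    """Mean absolute difference between the two estimated transition matrices."""
    pa, pb = transition_matrix(seq_a), transition_matrix(seq_b)
    keys = list(product(ALPHABET, repeat=2))
    return sum(abs(pa[k] - pb[k]) for k in keys) / len(keys)

print(generator_distance("ACGTACGTACGT", "AAAAAAAAAAAA"))
```

Each model here has 16 parameters regardless of how long the input sequence is, which mirrors the abstract's point that the generator is independent of sequence length and needs no alignment.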


Subject(s)
COVID-19/virology , Computational Biology/methods , Genomics , Mutation , SARS-CoV-2/genetics , Algorithms , Cluster Analysis , DNA Mutational Analysis , Genome, Viral , Humans , Machine Learning , Models, Theoretical , Probability , Stochastic Processes
12.
Sci Rep ; 11(1): 21068, 2021 10 26.
Article in English | MEDLINE | ID: covidwho-1493208

ABSTRACT

Since its emergence in late 2019, the diffusion of SARS-CoV-2 has been associated with the evolution of its viral genome. The co-occurrence of specific amino acid changes, collectively named 'virus variant', requires scrutiny (as variants may hugely impact the agent's transmission, pathogenesis, or antigenicity); variant evolution is studied using phylogenetics. Yet, never has this problem been tackled by digging into data with ad hoc analysis techniques. Here we show that the emergence of variants can in fact be traced through data-driven methods, further capitalizing on the value of large collections of SARS-CoV-2 sequences. For all countries with sufficient data, we compute weekly counts of amino acid changes, unveil time-varying clusters of changes with similar (rapidly growing) dynamics, and then follow their evolution. Our method succeeds in associating clusters with variants of interest/concern in a timely manner, provided their change composition is well characterized. This allows us to detect variants' emergence, rise, peak, and eventual decline under competitive pressure of another variant. Our early warning system, exclusively relying on deposited sequences, shows the power of big data in this context, and supports calls for widespread public SARS-CoV-2 genome sequencing for improved surveillance and control of the COVID-19 pandemic.


Subject(s)
COVID-19/prevention & control , COVID-19/therapy , COVID-19/virology , SARS-CoV-2/genetics , Amino Acids/metabolism , Cluster Analysis , Computational Biology/methods , Data Mining , Europe/epidemiology , Genome, Viral , Humans , Japan/epidemiology , Phylogeny , Time Factors , United States/epidemiology
13.
Sci Rep ; 11(1): 21108, 2021 10 26.
Article in English | MEDLINE | ID: covidwho-1493205

ABSTRACT

SARS-CoV-2, the virus causing the COVID-19 pandemic, emerged in December 2019 in China and raised fears it could overwhelm healthcare systems worldwide. Mutations of the virus are monitored by the GISAID database, from which we downloaded sequences from four West African countries (Ghana, Gambia, Senegal and Nigeria) from February 2020 to April 2020. We subjected the sequences to phylogenetic analysis employing the nextstrain pipeline. We found country-specific patterns of viral variants and supplemented that with data on novel variants from June 2021. Until April 2020, variants carrying the crucial Europe-associated D614G amino acid change were predominantly found in Senegal and Gambia, and combinations of late variants with and early variants without D614G in Ghana and Nigeria. In June 2021 all variants carried the D614G amino acid substitution. Senegal and Gambia again exhibited variants transmitted from Europe (Alpha or Delta), Ghana a combination of several variants, and Nigeria the original Eta variant. Detailed analysis of distinct samples revealed that some might have circulated latently and some reflect migration routes. The distinct patterns of variants within the West African countries point to their global transmission via air traffic predominantly from Europe and only limited transmission between the West African countries.


Subject(s)
COVID-19/transmission , COVID-19/virology , Computational Biology/methods , Mutation , SARS-CoV-2 , Africa, Western , Biodiversity , China , Europe , Gambia , Genetic Variation , Genome, Viral , Geography , Ghana , Humans , Nigeria , Phylogeny , Senegal , Time Factors
14.
J Biomed Semantics ; 12(1): 13, 2021 07 18.
Article in English | MEDLINE | ID: covidwho-1484319

ABSTRACT

BACKGROUND: Effective response to public health emergencies, such as we are now experiencing with COVID-19, requires data sharing across multiple disciplines and data systems. Ontologies offer a powerful data sharing tool, and this holds especially for those ontologies built on the design principles of the Open Biomedical Ontologies Foundry. These principles are exemplified by the Infectious Disease Ontology (IDO), a suite of interoperable ontology modules aiming to provide coverage of all aspects of the infectious disease domain. At its center is IDO Core, a disease- and pathogen-neutral ontology covering just those types of entities and relations that are relevant to infectious diseases generally. IDO Core is extended by disease and pathogen-specific ontology modules. RESULTS: To assist the integration and analysis of COVID-19 data, and viral infectious disease data more generally, we have recently developed three new IDO extensions: IDO Virus (VIDO); the Coronavirus Infectious Disease Ontology (CIDO); and an extension of CIDO focusing on COVID-19 (IDO-COVID-19). Reflecting the fact that viruses lack cellular parts, we have introduced into IDO Core the term acellular structure to cover viruses and other acellular entities studied by virologists. We now distinguish between infectious agents - organisms with an infectious disposition - and infectious structures - acellular structures with an infectious disposition. This in turn has led to various updates and refinements of IDO Core's content. We believe that our work on VIDO, CIDO, and IDO-COVID-19 can serve as a model for yielding greater conformance with ontology building best practices. CONCLUSIONS: IDO provides a simple recipe for building new pathogen-specific ontologies in a way that allows data about novel diseases to be easily compared, along multiple dimensions, with data represented by existing disease ontologies. 
The IDO strategy, moreover, supports ontology coordination, providing a powerful method of data integration and sharing that allows physicians, researchers, and public health organizations to respond rapidly and efficiently to current and future public health crises.


Subject(s)
Biological Ontologies/statistics & numerical data , COVID-19/prevention & control , Communicable Disease Control/statistics & numerical data , Communicable Diseases/therapy , Computational Biology/statistics & numerical data , SARS-CoV-2/isolation & purification , COVID-19/epidemiology , COVID-19/virology , Communicable Disease Control/methods , Communicable Diseases/epidemiology , Communicable Diseases/transmission , Computational Biology/methods , Data Mining/methods , Data Mining/statistics & numerical data , Epidemics , Humans , Information Dissemination/methods , Public Health/methods , Public Health/statistics & numerical data , SARS-CoV-2/physiology , Semantics
15.
J Comput Biol ; 28(11): 1113-1129, 2021 11.
Article in English | MEDLINE | ID: covidwho-1483349

ABSTRACT

The availability of millions of SARS-CoV-2 (Severe Acute Respiratory Syndrome-Coronavirus-2) sequences in public databases such as GISAID (Global Initiative on Sharing All Influenza Data) and EMBL-EBI (European Molecular Biology Laboratory-European Bioinformatics Institute) (the United Kingdom) allows a detailed study of the evolution, genomic diversity, and dynamics of a virus as never before. Here, we identify novel variants and subtypes of SARS-CoV-2 by clustering sequences, adapting methods originally designed for haplotyping intrahost viral populations. We assess our results using clustering entropy, the first time it has been used in this context. Our clustering approach reaches lower entropies compared with other methods, and we are able to boost this even further through gap filling and Monte Carlo-based entropy minimization. Moreover, our method clearly identifies the well-known Alpha variant in the U.K. and GISAID data sets, and is also able to detect the much less represented (<1% of the sequences) Beta (South Africa), Epsilon (California), and Gamma and Zeta (Brazil) variants in the GISAID data set. Finally, we show that each variant identified has high selective fitness, based on the growth rate of its cluster over time. This demonstrates that our clustering approach is a viable alternative for detecting even rare subtypes in very large data sets.
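
One way to score a clustering of aligned sequences by entropy is sketched below. The definition is an assumption, since the abstract does not give the exact formula: each cluster's entropy is taken as the sum of per-column Shannon entropies, and the clustering score is the size-weighted average over clusters. The function names and toy sequences are hypothetical.

```python
from collections import Counter
from math import log2

def column_entropy(column):
    """Shannon entropy (in bits) of the characters in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def cluster_entropy(seqs):
    """Sum of column entropies for equal-length aligned sequences in one cluster."""
    return sum(column_entropy(col) for col in zip(*seqs))

def clustering_entropy(clusters):
    """Size-weighted average of cluster entropies; lower means more homogeneous clusters."""
    total = sum(len(c) for c in clusters)
    return sum(len(c) / total * cluster_entropy(c) for c in clusters)

# Toy aligned sequences: the first cluster is uniform, the second differs in one column.
clusters = [["ACGT", "ACGT"], ["ACGA", "ACGT"]]
print(clustering_entropy(clusters))
```

Under this definition, a clustering that groups identical sequences together scores 0, so a lower value indicates tighter clusters, consistent with the abstract's use of lower entropy as the better result.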


Subject(s)
Cluster Analysis , Computational Biology/methods , Brazil , Databases, Genetic , Entropy , Humans , Monte Carlo Method , South Africa , United Kingdom , United States
16.
Sci Rep ; 11(1): 20987, 2021 10 25.
Article in English | MEDLINE | ID: covidwho-1483149

ABSTRACT

Acid suppressants are widely-used classes of medications linked to increased risks of aerodigestive infections. Prior studies of these medications as potentially reversible risk factors for COVID-19 have been conflicting. We aimed to determine the impact of chronic acid suppression use on COVID-19 infection risk while simultaneously evaluating the influence of social determinants of health to validate known and discover novel risk factors. We assessed the association of chronic acid suppression with incident COVID-19 in a 1:1 case-control study of 900 patients tested across three academic medical centers in California, USA. Medical comorbidities and history of chronic acid suppression use were manually extracted from health records by physicians following a pre-specified protocol. Socio-behavioral factors by geomapping publicly-available data to patient zip codes were incorporated. We identified no evidence to support an association between chronic acid suppression and COVID-19 (adjusted odds ratio 1.04, 95% CI 0.92-1.17, P = 0.515). However, several medical and social features were positive (Latinx ethnicity, BMI ≥ 30, dementia, public transportation use, month of the pandemic) and negative (female sex, concurrent solid tumor, alcohol use disorder) predictors of new infection. These findings demonstrate the value of integrating publicly-available databases with medical data to identify critical features of communicable diseases.
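
For readers unfamiliar with the reported statistic, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 case-control table. It is only an illustration: the study reports an adjusted odds ratio, which would come from a multivariable model rather than this formula, and the counts used here are hypothetical, not the study's data.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for a balanced case-control sample.
print(odds_ratio_ci(100, 100, 350, 350))
```

An interval that spans 1, as in the result quoted in the abstract (0.92-1.17), indicates no evidence of an association at the chosen confidence level.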


Subject(s)
COVID-19/epidemiology , COVID-19/therapy , Gastroesophageal Reflux/complications , Social Determinants of Health , Aged , Behavior , COVID-19/psychology , California , Case-Control Studies , Computational Biology/methods , Databases, Factual , Female , Gastroenterology , Gastroesophageal Reflux/drug therapy , Geography , Histamine H2 Antagonists/pharmacology , Humans , Incidence , Male , Middle Aged , Odds Ratio , Proton Pump Inhibitors/pharmacology , Risk Factors , Social Class
17.
Sci Rep ; 11(1): 20864, 2021 10 21.
Article in English | MEDLINE | ID: covidwho-1479817

ABSTRACT

Following SARS-CoV-2 infection, some COVID-19 patients experience severe host-driven adverse events. To treat these complications, their underlying etiology and drug treatments must be identified. Thus, a novel AI methodology, MOATAI-VIR, which predicts disease-protein-pathway relationships and repurposed FDA-approved drugs to treat COVID-19's clinical manifestations, was developed. SARS-CoV-2-interacting human proteins and GWAS-identified respiratory failure genes provide the input from which the mode-of-action (MOA) proteins/pathways of the resulting disease comorbidities are predicted. These comorbidities are then mapped to their clinical manifestations. To assess each manifestation's molecular basis, their prioritized shared proteins were subject to global pathway analysis. Next, the molecular features associated with hallmark COVID-19 phenotypes, e.g. unusual neurological symptoms, cytokine storms, and blood clots, were explored. In practice, 24/26 of the major clinical manifestations are successfully predicted. Three major uncharacterized manifestation categories including neoplasms are also found. The prevalence of neoplasms suggests that SARS-CoV-2 might be an oncovirus due to shared molecular mechanisms between oncogenesis and viral replication. Then, repurposed FDA-approved drugs that might treat COVID-19's clinical manifestations are predicted by virtual ligand screening of the most frequent comorbid protein targets. These drugs might help treat both COVID-19's severe adverse events and lesser ones such as loss of taste/smell.


Subject(s)
COVID-19/complications , COVID-19/diagnosis , COVID-19/drug therapy , Computational Biology/methods , Neoplasms/complications , Nervous System Diseases/complications , Thrombosis/complications , Virus Replication , Benchmarking , Comorbidity , Computer Simulation , Cytokine Release Syndrome , Drug Discovery , Humans , Machine Learning , Molecular Medicine , Phenotype , SARS-CoV-2 , Treatment Outcome
18.
Mol Syst Biol ; 17(10): e10387, 2021 10.
Article in English | MEDLINE | ID: covidwho-1478718

ABSTRACT

We need to effectively combine the knowledge from surging literature with complex datasets to propose mechanistic models of SARS-CoV-2 infection, improving data interpretation and predicting key targets of intervention. Here, we describe a large-scale community effort to build an open access, interoperable and computable repository of COVID-19 molecular mechanisms. The COVID-19 Disease Map (C19DMap) is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. Notably, it is a computational resource for graph-based analyses and disease modelling. To this end, we established a framework of tools, platforms and guidelines necessary for a multifaceted community of biocurators, domain experts, bioinformaticians and computational biologists. The diagrams of the C19DMap, curated from the literature, are integrated with relevant interaction and text mining databases. We demonstrate the application of network analysis and modelling approaches by concrete examples to highlight new testable hypotheses. This framework helps to find signatures of SARS-CoV-2 predisposition, treatment response or prioritisation of drug candidates. Such an approach may help deal with new waves of COVID-19 or similar pandemics in the long-term perspective.


Subject(s)
COVID-19/immunology , Computational Biology/methods , Databases, Factual , SARS-CoV-2/immunology , Software , Antiviral Agents/therapeutic use , COVID-19/drug therapy , COVID-19/genetics , COVID-19/virology , Computer Graphics , Cytokines/genetics , Cytokines/immunology , Data Mining/statistics & numerical data , Gene Expression Regulation , Host Microbial Interactions/genetics , Host Microbial Interactions/immunology , Humans , Immunity, Cellular/drug effects , Immunity, Humoral/drug effects , Immunity, Innate/drug effects , Lymphocytes/drug effects , Lymphocytes/immunology , Lymphocytes/virology , Metabolic Networks and Pathways/genetics , Metabolic Networks and Pathways/immunology , Myeloid Cells/drug effects , Myeloid Cells/immunology , Myeloid Cells/virology , Protein Interaction Mapping , SARS-CoV-2/drug effects , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , Signal Transduction , Transcription Factors/genetics , Transcription Factors/immunology , Viral Proteins/genetics , Viral Proteins/immunology
19.
Biomed Res Int ; 2021: 9982729, 2021.
Article in English | MEDLINE | ID: covidwho-1476892

ABSTRACT

The human transmembrane protease serine 2 (TMPRSS2) protein plays an important role in prostate cancer progression. It also facilitates viral entry into target cells by proteolytically cleaving and activating the S protein of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In the current study, we used different available tools like SIFT, PolyPhen2.0, PROVEAN, SNAP2, PMut, MutPred2, I-Mutant Suite, MUpro, iStable, ConSurf, ModPred, SwissModel, PROCHECK, Verify3D, and TM-align to identify the most deleterious variants and to explore possible effects on the TMPRSS2 stability, structure, and function. The six missense variants tested were evaluated to have deleterious effects on the protein by SIFT, PolyPhen2.0, PROVEAN, SNAP2, and PMut. Additionally, the V160M, G181R, R240C, P335L, G432A, and D435Y variants showed a decrease in stability on at least 2 servers; G181R, G432A, and D435Y are highly conserved and were identified as posttranslational modification (PTM) sites for proteolytic cleavage and ADP-ribosylation by the ConSurf and ModPred servers. The 3D structure of TMPRSS2 native and mutants was generated using 7MEQ as a template from the SwissModeller group, refined by ModRefiner, and validated using the Ramachandran plot. Hence, this paper can be advantageous to understand the association between these missense variants rs12329760, rs781089181, rs762108701, rs1185182900, rs570454392, and rs867186402 and susceptibility to SARS-CoV-2.


Subject(s)
COVID-19/genetics , Mutation, Missense , Serine Endopeptidases/chemistry , Serine Endopeptidases/genetics , Binding Sites , Computational Biology/methods , Evolution, Molecular , Genetic Predisposition to Disease , Humans , Models, Molecular , Phylogeny , Polymorphism, Single Nucleotide , Protein Conformation , Protein Stability , Serine Endopeptidases/metabolism
20.
Comput Math Methods Med ; 2021: 7196492, 2021.
Article in English | MEDLINE | ID: covidwho-1476882

ABSTRACT

COVID-19 has swept through the world since December 2019 and caused a large number of cases and deaths. Spatial prediction of the spread of the epidemic is greatly important for disease control and management. In this study, we predicted the cumulative confirmed cases (CCCs) from Jan 17 to Mar 1, 2020, in mainland China at the city level, using machine learning algorithms, geographically weighted regression (GWR), and partial least squares regression (PLSR) based on population flow, geolocation, meteorological, and socioeconomic variables. The validation results showed that machine learning algorithms and GWR achieved good performances. These models could not effectively predict CCCs in Wuhan, the first city that reported COVID-19 cases in China, but performed well in other cities. Random Forest (RF) outperformed other methods with a CV-R² of 0.84. In this model, the population flow from Wuhan to other cities (WP) was the most important feature and the other features also made considerable contributions to the prediction accuracy. Compared with RF, GWR showed a slightly worse performance (CV-R² = 0.81) but required fewer spatial independent variables. This study explored the spatial prediction of the epidemic based on multisource spatial independent variables, providing references for the estimation of CCCs in the regions lacking accurate and timely.
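As an illustration of what a cross-validated R² measures, the sketch below pools out-of-fold predictions from k-fold cross-validation and computes R² on them. It swaps in a one-variable least-squares line for the Random Forest and GWR models named in the abstract, uses synthetic data, and assumes the pooled-prediction definition of CV-R²; the study's exact procedure is not stated here.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cv_r2(xs, ys, k=5):
    """R^2 computed on pooled out-of-fold predictions from k-fold CV."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point is held out
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        for i in test_idx:
            preds[i] = intercept + slope * xs[i]
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic, noise-free data (hypothetical): a perfect linear relation.
xs = [float(i) for i in range(20)]
ys = [2 * x + 1 for x in xs]
print(cv_r2(xs, ys))
```

Because every prediction is made by a model that never saw that point, CV-R² estimates out-of-sample fit and is typically lower than R² computed on the training data.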


Subject(s)
COVID-19/diagnosis , COVID-19/epidemiology , Computational Biology/methods , Geography , Machine Learning , Algorithms , China/epidemiology , Cities , Climate , Communicable Diseases , Environmental Monitoring , Epidemics , Humans , Least-Squares Analysis , Models, Statistical , Reproducibility of Results , SARS-CoV-2 , Social Class , Spatial Regression