Results 1 - 10 of 10
1.
Bioinformatics ; 2022 Jan 18.
Article in English | MEDLINE | ID: covidwho-1631198

ABSTRACT

MOTIVATION: The ongoing evolution of SARS-CoV-2 and the rapid emergence of variants of concern (VOCs) at distinct geographic locations have relevant implications for the implementation of strategies for controlling the COVID-19 pandemic. Combining the growing body of data and the evidence on potential functional implications of SARS-CoV-2 mutations can suggest highly effective methods for the prioritization of novel variants of potential concern, e.g., those increasing in frequency locally and/or globally. However, these analyses may be complex, requiring the integration of different data and resources. We argue that streamlined access to up-to-date, high-quality genome sequencing data from different geographic regions/countries is needed, and that a robust and consistent framework for the evaluation/comparison of the results is currently lacking. RESULTS: To overcome these limitations, we developed ViruClust, a novel tool for the comparison of SARS-CoV-2 genomic sequences and lineages in space and time. ViruClust is made available through a powerful and intuitive web-based user interface. Sophisticated large-scale analyses can be executed with a few clicks, even by users without any computational background. To demonstrate potential applications of our method, we applied ViruClust to conduct a thorough study of the evolution of the most prevalent lineage of the Delta SARS-CoV-2 variant, and derived relevant observations. CONCLUSIONS: By allowing the seamless integration of different types of functional annotations and the direct comparison of viral genomes and genetic variants in space and time, ViruClust represents a highly valuable resource for monitoring the evolution of SARS-CoV-2, facilitating the identification of variants and/or mutations of potential concern. AVAILABILITY: ViruClust is openly available at http://gmql.eu/viruclust/.

2.
BioTech ; 10(4):27, 2021.
Article in English | MDPI | ID: covidwho-1502365

ABSTRACT

Since the beginning of 2020, the COVID-19 pandemic has posed unprecedented challenges to viral data analysis and connected host disease diagnostic methods. We propose VirusLab, a flexible system for analysing SARS-CoV-2 viral sequences and relating them to metadata or clinical information about the host. VirusLab capitalizes on two existing resources: ViruSurf, a database of public SARS-CoV-2 sequences supporting metadata-driven search, and VirusViz, a tool for visual analysis of search results. VirusLab is designed to take advantage of these resources within a server-side architecture that: (i) covers pipelines based on approaches already in use (ARTIC, Galaxy) but entirely customizable upon user request; (ii) predigests analysis of raw sequencing data from different platforms (Oxford Nanopore and Illumina); (iii) gives access to public archive datasets; (iv) supplies user-friendly reporting, making it a tool that can also be integrated into a business environment. VirusLab can be installed and hosted within the premises of any organization where information about SARS-CoV-2 sequences can be safely integrated with information about hosts (e.g., clinical metadata). A system such as VirusLab is not currently available in the landscape of similar providers: our results show that VirusLab is a powerful tool to generate tabular/graphical and machine-readable reports that can be integrated in more complex pipelines. We foresee that the proposed system can support many research-oriented and therapeutic scenarios within hospitals or the tracing of viral sequences and their mutational processes within organizations for viral surveillance.

3.
Sci Rep ; 11(1): 21174, 2021 10 27.
Article in English | MEDLINE | ID: covidwho-1493227

ABSTRACT

Lockdowns implemented to address the COVID-19 pandemic have disrupted human mobility flows around the globe to an unprecedented extent and with economic consequences which are unevenly distributed across territories, firms and individuals. Here we study socioeconomic determinants of mobility disruption during both the lockdown and the recovery phases in Italy. For this purpose, we analyze a massive data set on Italian mobility from February to October 2020 and we combine it with detailed data on pre-existing local socioeconomic features of Italian administrative units. Using a set of unsupervised and supervised learning techniques, we reliably show that the least and the most affected areas persistently belong to two different clusters. Notably, the former cluster features significantly higher income per capita and lower income inequality than the latter. This distinction persists once the lockdown is lifted. The least affected areas display a swift (V-shaped) recovery in mobility patterns, while poorer, most affected areas experience a much slower (U-shaped) recovery: as of October 2020, their mobility was still significantly lower than pre-lockdown levels. These results are then detailed and confirmed with a quantile regression analysis. Our findings show that economic segregation has, thus, strengthened during the pandemic.


Subject(s)
COVID-19/epidemiology , Pandemics , SARS-CoV-2 , COVID-19/economics , Communicable Disease Control/economics , Communicable Disease Control/methods , Humans , Income , Italy/epidemiology , Machine Learning , Pandemics/economics , Poverty , Quarantine/economics , Regression Analysis , Socioeconomic Factors , Travel
4.
Sci Rep ; 11(1): 21068, 2021 10 26.
Article in English | MEDLINE | ID: covidwho-1493208

ABSTRACT

Since its emergence in late 2019, the diffusion of SARS-CoV-2 has been associated with the evolution of its viral genome. The co-occurrence of specific amino acid changes, collectively named 'virus variant', requires scrutiny (as variants may hugely impact the agent's transmission, pathogenesis, or antigenicity); variant evolution is studied using phylogenetics. Yet, never has this problem been tackled by digging into data with ad hoc analysis techniques. Here we show that the emergence of variants can in fact be traced through data-driven methods, further capitalizing on the value of large collections of SARS-CoV-2 sequences. For all countries with sufficient data, we compute weekly counts of amino acid changes, unveil time-varying clusters of changes with similar (rapidly growing) dynamics, and then follow their evolution. Our method succeeds in associating clusters with variants of interest/concern in a timely manner, provided their change composition is well characterized. This allows us to detect variants' emergence, rise, peak, and eventual decline under competitive pressure of another variant. Our early warning system, exclusively relying on deposited sequences, shows the power of big data in this context, and adds to the calls for widespread public SARS-CoV-2 genome sequencing for improved surveillance and control of the COVID-19 pandemic.
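
The counting step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout (a collection date plus a list of amino acid changes), the function names `weekly_counts` and `fast_growing`, and the growth threshold are all assumptions made for the example. The clustering of changes with similar dynamics is not reproduced here.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_counts(records):
    """Count, per ISO week, how many sequences carry each amino acid change.

    `records` is an iterable of (collection_date, changes) pairs; this layout
    is an assumption for the example, not the format used in the paper.
    """
    counts = defaultdict(Counter)
    for collected, changes in records:
        iso = collected.isocalendar()
        week = (iso[0], iso[1])  # (ISO year, ISO week number)
        counts[week].update(set(changes))  # count each change once per sequence
    return counts


def fast_growing(counts, min_ratio=2.0):
    """Return changes whose count grew by at least `min_ratio` between the
    two most recent weeks (a hypothetical, deliberately simple criterion)."""
    weeks = sorted(counts)
    if len(weeks) < 2:
        return []
    previous, latest = counts[weeks[-2]], counts[weeks[-1]]
    return sorted(
        change
        for change, n in latest.items()
        if previous[change] > 0 and n / previous[change] >= min_ratio
    )


# Toy data: changes written as "protein:change" strings.
records = [
    (date(2021, 3, 1), ["S:D614G"]),
    (date(2021, 3, 2), ["S:D614G", "S:L452R"]),
    (date(2021, 3, 8), ["S:D614G", "S:L452R"]),
    (date(2021, 3, 9), ["S:L452R"]),
    (date(2021, 3, 10), ["S:L452R", "S:D614G"]),
]
counts = weekly_counts(records)
growing = fast_growing(counts)
```

In this toy data set only `S:L452R` is flagged, since its weekly count triples while `S:D614G` stays flat. A real analysis would also need per-country grouping and normalization by the number of sequences deposited each week.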


Subject(s)
COVID-19/prevention & control , COVID-19/therapy , COVID-19/virology , SARS-CoV-2/genetics , Amino Acids/metabolism , Cluster Analysis , Computational Biology/methods , Data Mining , Europe/epidemiology , Genome, Viral , Humans , Japan/epidemiology , Phylogeny , Time Factors , United States/epidemiology
5.
Database (Oxford) ; 2021, 2021 09 29.
Article in English | MEDLINE | ID: covidwho-1443040

ABSTRACT

EpiSurf is a Web application for selecting viral populations of interest and then analyzing how their amino acid changes are distributed along epitopes. Viral sequences are searched within ViruSurf, which stores curated metadata and amino acid changes imported from the most widely used deposition sources for viral databases (GenBank, COVID-19 Genomics UK (COG-UK) and Global initiative on sharing all influenza data (GISAID)). Epitopes are searched within the open source Immune Epitope Database or directly proposed by users by indicating their start and stop positions in the context of a given viral protein. Amino acid changes of selected populations are joined with epitopes of interest; a result table summarizes, for each epitope, statistics about the overlapping amino acid changes and about the sequences carrying such alterations. The results may also be inspected by the VirusViz Web application; epitope regions are highlighted within the given viral protein, and changes can be comparatively inspected. For sequences mutated within the epitope, we also offer a complete view of the distribution of amino acid changes, optionally grouped by the location, collection date or lineage. Thanks to these functionalities, EpiSurf supports the user-friendly testing of epitope conservancy within selected populations of interest, which can be of utmost relevance for designing vaccines, drugs or serological assays. EpiSurf is available at two endpoints. Database URL: http://gmql.eu/episurf/ (for searching GenBank and COG-UK sequences) and http://gmql.eu/episurf_gisaid/ (for GISAID sequences).
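
The join between amino acid changes and epitopes described above can be illustrated with a short sketch. It is not EpiSurf's implementation: the `Epitope` and `Change` record types, the function `summarize_epitopes`, and the assumption that start and stop positions are inclusive are all hypothetical choices made for the example.

```python
from collections import namedtuple

# Hypothetical record types; field names are assumptions for this example.
Epitope = namedtuple("Epitope", "name protein start stop")
Change = namedtuple("Change", "sequence_id protein position ref alt")


def summarize_epitopes(epitopes, changes):
    """For each epitope, count overlapping changes and distinct mutated sequences."""
    summary = {}
    for ep in epitopes:
        # A change overlaps an epitope when it lies on the same protein and
        # its position falls within the (assumed inclusive) start-stop range.
        hits = [
            c for c in changes
            if c.protein == ep.protein and ep.start <= c.position <= ep.stop
        ]
        summary[ep.name] = {
            "changes": len(hits),
            "sequences": len({c.sequence_id for c in hits}),
        }
    return summary


epitopes = [Epitope("ep1", "S", 440, 460), Epitope("ep2", "S", 600, 620)]
changes = [
    Change("seq1", "S", 452, "L", "R"),
    Change("seq1", "S", 614, "D", "G"),
    Change("seq2", "S", 614, "D", "G"),
    Change("seq2", "N", 452, "A", "T"),  # same position, different protein
]
summary = summarize_epitopes(epitopes, changes)
```

Here `ep1` overlaps one change in one sequence and `ep2` overlaps two changes across two sequences. The change on protein N is ignored even though its position matches, which is why the protein is part of the join condition.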


Subject(s)
Amino Acid Substitution , Antigens, Viral/chemistry , Epitopes/chemistry , Internet , Metadata , SARS-CoV-2/chemistry , Search Engine , Software , Amino Acids/chemistry , Amino Acids/immunology , Antigens, Viral/immunology , COVID-19/virology , Epitopes/immunology , Humans , SARS-CoV-2/immunology
6.
Brief Bioinform ; 22(2): 664-675, 2021 03 22.
Article in English | MEDLINE | ID: covidwho-1352113

ABSTRACT

With the outbreak of the COVID-19 disease, the research community is making unprecedented efforts to better understand and mitigate the effects of the pandemic. In this context, we review the data integration efforts required for accessing and searching genome sequences and metadata of SARS-CoV-2, the virus responsible for the COVID-19 disease, which have been deposited into the most important repositories of viral sequences. Organizations that were already present in the virus domain are now dedicating special interest to the emergence of the COVID-19 pandemic, by emphasizing specific SARS-CoV-2 data and services. At the same time, novel organizations and resources were born in this critical period to serve specifically the purposes of COVID-19 mitigation while setting the research ground for countering possible future pandemics. Accessibility and integration of viral sequence data, possibly in conjunction with the human host genotype and clinical data, are paramount to better understand the COVID-19 disease and mitigate its effects. Few examples of host-pathogen integrated datasets exist so far, but we expect them to grow together with the knowledge of COVID-19 disease; once such datasets are available, useful integrative surveillance mechanisms can be put in place by observing how common variants distribute in time and space, relating them to the phenotypic impact evidenced in the literature.


Subject(s)
COVID-19/therapy , COVID-19/epidemiology , COVID-19/virology , Genes, Viral , Humans , Information Storage and Retrieval , Pandemics , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification
7.
Nucleic Acids Res ; 49(15): e90, 2021 09 07.
Article in English | MEDLINE | ID: covidwho-1262154

ABSTRACT

Variant visualization plays an important role in supporting the analysis of viral evolution, which has been extremely valuable during the COVID-19 pandemic. VirusViz is a web-based application for comparing variants of selected viral populations and their sub-populations; it is primarily focused on SARS-CoV-2 variants, although the tool also supports other viral species (SARS-CoV, MERS-CoV, Dengue, Ebola). As input, VirusViz imports results of queries extracting variants and metadata from the large database ViruSurf, which integrates information about most SARS-CoV-2 sequences publicly deposited worldwide. Moreover, VirusViz accepts sequences of new viral populations as multi-FASTA files plus corresponding metadata in CSV format; a bioinformatic pipeline builds a suitable input for VirusViz by extracting the nucleotide and amino acid variants. Pages of VirusViz provide metadata summarization, variant descriptions, and variant visualization with rich options for zooming, highlighting variants or regions of interest, and switching from nucleotides to amino acids; sequences can be grouped, and groups can be comparatively analyzed. For SARS-CoV-2, we manually collect mutations with known or predicted levels of severity/virulence, as indicated in linked research articles; such critical mutations are reported when observed in sequences. The system includes light-weight project management for downloading, resuming, and merging data analysis sessions. VirusViz is freely available at http://gmql.eu/virusviz/.


Subject(s)
COVID-19/virology , Data Visualization , SARS-CoV-2/chemistry , SARS-CoV-2/genetics , Amino Acid Sequence , Base Sequence , Databases, Factual , Humans , Knowledge Bases , SARS-CoV-2/classification , South Africa/epidemiology , United States/epidemiology
8.
Eur J Hum Genet ; 29(5): 745-759, 2021 05.
Article in English | MEDLINE | ID: covidwho-1033853

ABSTRACT

Within the GEN-COVID Multicenter Study, biospecimens from more than 1000 SARS-CoV-2 positive individuals have thus far been collected in the GEN-COVID Biobank (GCB). Sample types include whole blood, plasma, serum, leukocytes, and DNA. The GCB links samples to detailed clinical data available in the GEN-COVID Patient Registry (GCPR). It includes hospitalized patients (74.25%), broken down into intubated, treated by CPAP-biPAP, treated with O2 supplementation, and without respiratory support (9.5%, 18.4%, 31.55% and 14.8%, respectively); and non-hospitalized subjects (25.75%), either pauci- or asymptomatic. More than 150 clinical patient-level data fields have been collected and binarized for further statistics according to the organs/systems primarily affected by COVID-19: heart, liver, pancreas, kidney, chemosensors, innate or adaptive immunity, and clotting system. Hierarchical clustering analysis identified five main clinical categories: (1) severe multisystemic failure with either thromboembolic or pancreatic variant; (2) cytokine storm type, either severe with liver involvement or moderate; (3) moderate heart type, either with or without liver damage; (4) moderate multisystemic involvement, either with or without liver damage; (5) mild, either with or without hyposmia. GCB and GCPR are further linked to the GCGDR, which includes data from whole-exome sequencing and high-density SNP genotyping. The data are available for sharing through the Network for Italian Genomes, found within the COVID-19 dedicated section. The study objective is to systematize this comprehensive data collection and begin identifying multi-organ involvement in COVID-19, defining genetic parameters for infection susceptibility within the population, and genetically mapping COVID-19 severity and clinical complexity among patients.
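
As a quick consistency check on the cohort breakdown reported above, the four hospitalized subgroups should add up to the hospitalized share, and the hospitalized and non-hospitalized shares should add up to the whole cohort. The sketch below only redoes that arithmetic; the variable names are illustrative, and `Decimal` is used so the sums are exact rather than subject to floating-point rounding.

```python
from decimal import Decimal

# Percentages as reported in the abstract, all relative to the whole cohort.
hospitalized_subgroups = {
    "intubated": Decimal("9.5"),
    "CPAP-biPAP": Decimal("18.4"),
    "O2 supplementation": Decimal("31.55"),
    "no respiratory support": Decimal("14.8"),
}
non_hospitalized = Decimal("25.75")

hospitalized_total = sum(hospitalized_subgroups.values())
cohort_total = hospitalized_total + non_hospitalized
```

The subgroup figures sum to 74.25, matching the reported hospitalized share, and the two top-level groups sum to 100.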


Subject(s)
Biological Specimen Banks , COVID-19/genetics , Genetic Predisposition to Disease , Registries , SARS-CoV-2 , Specimen Handling , Adolescent , Adult , COVID-19/epidemiology , Female , Humans , Italy , Male
9.
Conceptual Modeling, ER 2020 ; 12400:388-402, 2020.
Article in English | Web of Science | ID: covidwho-938548

ABSTRACT

The pandemic outbreak of the coronavirus disease has attracted attention towards the genetic mechanisms of viruses. We hereby present the Viral Conceptual Model (VCM), centered on the virus sequence and described from four perspectives: biological (virus type and hosts/sample), analytical (annotations, nucleotide and amino acid variants), organizational (sequencing project) and technical (experimental technology). VCM is inspired by GCM, our previously developed Genomic Conceptual Model, but it introduces many novel concepts, as viral sequences significantly differ from human genomes. When applied to SARS-CoV-2 virus, complex conceptual queries upon VCM are able to replicate the search results of recent articles, hence demonstrating huge potential in supporting virology research. Our effort is part of a broad vision: availability of conceptual models for both human genomics and viruses will provide important opportunities for research, especially if interconnected by the same human being, playing the role of virus host as well as provider of genomic and phenotype information.

10.
Nucleic Acids Res ; 49(D1): D817-D824, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-851820

ABSTRACT

ViruSurf, available at http://gmql.eu/virusurf/, is a large public database of viral sequences and integrated and curated metadata from heterogeneous sources (RefSeq, GenBank, COG-UK and NMDC); it also exposes computed nucleotide and amino acid variants, called from original sequences. A GISAID-specific ViruSurf database, available at http://gmql.eu/virusurf_gisaid/, offers a subset of these functionalities. Given the current pandemic outbreak, SARS-CoV-2 data are collected from the four sources; but ViruSurf contains other virus species harmful to humans, including SARS-CoV, MERS-CoV, Ebola and Dengue. The database is centered on sequences, described from their biological, technological and organizational dimensions. In addition, the analytical dimension characterizes the sequence in terms of its annotations and variants. The web interface enables expressing complex search queries in a simple way; arbitrary search queries can freely combine conditions on attributes from the four dimensions, extracting the resulting sequences. Several example queries on the database confirm and possibly improve results from recent research papers; results can be recomputed over time and upon selected populations. Effective search over large and curated sequence data may enable faster responses to future threats that could arise from new viruses.


Subject(s)
COVID-19/prevention & control , Computational Biology/methods , Data Curation/methods , Databases, Genetic , Genome, Viral/genetics , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/virology , Genetic Variation , Humans , Information Storage and Retrieval/methods , Internet , Pandemics , SARS-CoV-2/physiology , User-Computer Interface