BMC Bioinformatics ; 24(1): 159, 2023 Apr 20.
BACKGROUND: Biomedical researchers are strongly encouraged to make their research outputs more Findable, Accessible, Interoperable, and Reusable (FAIR). While many biomedical research outputs are more readily accessible through open data efforts, finding relevant outputs remains a significant challenge. is a metadata vocabulary standardization project that enables web content creators to make their content more FAIR. Leveraging could benefit biomedical research resource providers, but it can be challenging to apply standards to biomedical research outputs. We created an online browser-based tool that empowers researchers and repository developers to utilize or other biomedical schema projects. RESULTS: Our browser-based tool includes features which can help address many of the barriers towards such as: the ability to easily browse for relevant classes, the ability to extend and customize a class to be more suitable for biomedical research outputs, the ability to create data validation to ensure adherence of a research output to a customized class, and the ability to register a custom class to our schema registry enabling others to search and re-use it. We demonstrate the use of our tool with the creation of the schema, a large multi-class schema for harmonizing various COVID-19 related resources. CONCLUSIONS: We have created a browser-based tool to empower biomedical research resource providers to leverage classes to make their research outputs more FAIR.

Biomedical Research , COVID-19 , Humans , Metadata
Nat Methods ; 20(4): 536-540, 2023 04.
Research Library is a standardized, searchable interface of coronavirus disease 2019 (COVID-19) and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) publications, clinical trials, datasets, protocols and other resources, built with a reusable framework. We developed a rigorous schema to enforce consistency across different sources and resource types and linked related resources. Researchers can quickly search the latest research across data repositories, regardless of resource type or repository location, via a search interface, public application programming interface (API) and R package.

COVID-19 , Humans , SARS-CoV-2 , Disease Outbreaks
Nat Methods ; 20(4): 512-522, 2023 04.
In response to the emergence of SARS-CoV-2 variants of concern, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present , a platform that currently tracks over 40 million combinations of Pango lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials and the general public. We describe the interpretable visualizations available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data and the server infrastructure that enables widespread data dissemination via a high-performance API that can be accessed using an R package. We show how can be used for genomic surveillance and as a hypothesis-generation tool to understand the ongoing pandemic at varying geographic and temporal scales.

COVID-19 , SARS-CoV-2 , Humans , Genomics , Disease Outbreaks , Mutation
Nat Commun ; 13(1): 4784, 2022 08 15.
Regional connectivity and land travel have been identified as important drivers of SARS-CoV-2 transmission. However, the generalizability of this finding is understudied outside of well-sampled, highly connected regions. In this study, we investigated the relative contributions of regional and intercontinental connectivity to the source-sink dynamics of SARS-CoV-2 for Jordan and the Middle East. By integrating genomic, epidemiological and travel data we show that the source of introductions into Jordan was dynamic across 2020, shifting from intercontinental seeding in the early pandemic to more regional seeding for the travel restrictions period. We show that land travel, particularly freight transport, drove introduction risk during the travel restrictions period. High regional connectivity and land travel also drove Jordan's export risk. Our findings emphasize regional connectedness and land travel as drivers of transmission in the Middle East.

COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , Humans , Middle East/epidemiology , Pandemics/prevention & control , Travel
Cell ; 184(19): 4939-4952.e15, 2021 09 16.
The emergence of the COVID-19 epidemic in the United States (U.S.) went largely undetected due to inadequate testing. New Orleans experienced one of the earliest and fastest accelerating outbreaks, coinciding with Mardi Gras. To gain insight into the emergence of SARS-CoV-2 in the U.S. and how large-scale events accelerate transmission, we sequenced SARS-CoV-2 genomes during the first wave of the COVID-19 epidemic in Louisiana. We show that SARS-CoV-2 in Louisiana had limited diversity compared to other U.S. states and that one introduction of SARS-CoV-2 led to almost all of the early transmission in Louisiana. By analyzing mobility and genomic data, we show that SARS-CoV-2 was already present in New Orleans before Mardi Gras, and the festival dramatically accelerated transmission. Our study provides an understanding of how superspreading during large-scale events played a key role during the early outbreak in the U.S. and can greatly accelerate epidemics.

COVID-19/epidemiology , Epidemics , SARS-CoV-2/physiology , COVID-19/transmission , Databases as Topic , Disease Outbreaks , Humans , Louisiana/epidemiology , Phylogeny , Risk Factors , SARS-CoV-2/classification , Texas , Travel , United States/epidemiology
Cell ; 184(10): 2587-2594.e7, 2021 05 13.
The highly transmissible B.1.1.7 variant of SARS-CoV-2, first identified in the United Kingdom, has gained a foothold across the world. Using S gene target failure (SGTF) and SARS-CoV-2 genomic sequencing, we investigated the prevalence and dynamics of this variant in the United States (US), tracking it back to its early emergence. We found that, while the fraction of B.1.1.7 varied by state, the variant increased at a logistic rate with a roughly weekly doubling rate and an increased transmission of 40%-50%. We revealed several independent introductions of B.1.1.7 into the US as early as late November 2020, with community transmission spreading it to most states within months. We show that the US is on a similar trajectory as other countries where B.1.1.7 became dominant, requiring immediate and decisive action to minimize COVID-19 morbidity and mortality.

COVID-19 , Models, Biological , SARS-CoV-2 , COVID-19/genetics , COVID-19/mortality , COVID-19/transmission , Female , Humans , Male , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/pathogenicity , United States/epidemiology