Results 1 - 20 of 159
1.
Article in English | MEDLINE | ID: mdl-38780898

ABSTRACT

BACKGROUND: High-grade serous carcinoma (HGSC) gene expression subtypes are associated with differential survival. We characterized HGSC gene expression in Black individuals and considered whether gene expression differences by self-identified race may contribute to poorer HGSC survival among Black versus White individuals. METHODS: We included newly generated RNA-Seq data from Black and White individuals, and array-based genotyping data from four existing studies of White and Japanese individuals. We used K-means clustering, a method with no predefined number of clusters or dataset-specific features, to assign subtypes. Cluster- and dataset-specific gene expression patterns were summarized by moderated t-scores. We compared cluster-specific gene expression patterns across datasets by calculating the correlation between the summarized vectors of moderated t-scores. Following mapping to The Cancer Genome Atlas (TCGA)-derived HGSC subtypes, we used Cox proportional hazards models to estimate subtype-specific survival by dataset. RESULTS: Cluster-specific gene expression was similar across gene expression platforms and racial groups. Comparing the Black population to the White and Japanese populations, the immunoreactive subtype was more common (39% versus 23%-28%) and the differentiated subtype less common (7% versus 22%-31%). Patterns of subtype-specific survival were similar between the Black and White populations with RNA-Seq data; compared to mesenchymal cases, the risk of death was similar for proliferative and differentiated cases and suggestively lower for immunoreactive cases (Black population HR=0.79 [0.55, 1.13], White population HR=0.86 [0.62, 1.19]). CONCLUSIONS: While the prevalence of HGSC subtypes varied by race, subtype-specific survival was similar. IMPACT: HGSC subtypes can be consistently assigned across platforms and self-identified racial groups.
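The cross-dataset comparison described in the methods can be illustrated with a minimal sketch. It assumes each cluster in each dataset has already been summarized as a vector of per-gene scores (such as moderated t-scores) over a shared gene list. The function names, cluster labels, and toy numbers are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_matches(scores_a, scores_b):
    """For each cluster in dataset A, find the most correlated cluster in B.

    scores_a and scores_b map a cluster label to its vector of per-gene
    summary scores (same gene order in both datasets).
    """
    matches = {}
    for label_a, vec_a in scores_a.items():
        correlations = {
            label_b: pearson(vec_a, vec_b) for label_b, vec_b in scores_b.items()
        }
        best = max(correlations, key=correlations.get)
        matches[label_a] = (best, correlations[best])
    return matches


# Toy score vectors over five genes (illustrative values only).
dataset_a = {"A1": [2.1, -1.0, 0.3, 1.8, -2.2], "A2": [-1.9, 1.2, 0.1, -1.5, 2.4]}
dataset_b = {"B1": [-2.0, 1.0, 0.0, -1.7, 2.0], "B2": [1.9, -0.8, 0.2, 2.0, -2.1]}

matches = best_matches(dataset_a, dataset_b)
```

A high correlation between matched clusters is the kind of evidence the abstract summarizes as cluster-specific expression being similar across platforms; the sketch does not reproduce the moderated t-score computation itself.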

2.
Elife ; 12, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38804191

ABSTRACT

Science journalism is a critical way for the public to learn about and benefit from scientific findings. Such journalism shapes the public's view of the current state of science and legitimizes experts. Journalists can only cite and quote a limited number of sources, whom they may discover in their research, including through recommendations by other scientists. Biases in either process may influence who is identified and ultimately included as a source. To examine potential biases in science journalism, we analyzed 22,001 non-research articles published by Nature and compared these with Nature-published research articles with respect to predicted gender and name origin. We extracted cited authors' names and those of quoted speakers. While citations and quotations within a piece do not reflect the entire information-gathering process, they can provide insight into the demographics of visible sources. We then predicted gender and name origin of the cited authors and speakers. We compared articles with a comparator set made up of first and last authors within primary research articles in Nature and a subset of Springer Nature articles in the same time period. In our analysis, we found a skew toward quoting men in Nature science journalism. However, quotation is trending toward equal representation at a faster rate than authorship rates in academic publishing. Gender disparity in Nature quotes was dependent on the article type. We found a significant over-representation of names with predicted Celtic/English origin and under-representation of names with a predicted East Asian origin in both extracted quotes and journal citations, although the effect was dampened in citations.


Subject(s)
Journalism , Humans , Male , Female , Science , Authorship , Sex Factors , Periodicals as Topic/statistics & numerical data , Bibliometrics , Sexism/statistics & numerical data
3.
eNeuro ; 11(6), 2024 Jun.
Article in English | MEDLINE | ID: mdl-38789274

ABSTRACT

High-throughput gene expression profiling measures individual gene expression across conditions. However, genes are regulated in complex networks, not as individual entities, limiting the interpretability of gene expression data. Machine learning models that incorporate prior biological knowledge are a powerful tool to extract meaningful biology from gene expression data. Pathway-level information extractor (PLIER) is an unsupervised machine learning method that defines biological pathways by leveraging the vast amount of published transcriptomic data. PLIER converts gene expression data into known pathway gene sets, termed latent variables (LVs), to substantially reduce data dimensionality and improve interpretability. In the current study, we trained the first mouse PLIER model on 190,111 mouse brain RNA-sequencing samples, the greatest amount of training data ever used by PLIER. We then validated the mousiPLIER approach in a study of microglia and astrocyte gene expression across mouse brain aging. mousiPLIER identified biological pathways that are significantly associated with aging, including one latent variable (LV41) corresponding to striatal signal. To gain further insight into the genes contained in LV41, we performed k-means clustering on the training data to identify studies that respond strongly to LV41. We found that the variable was relevant to striatum and aging across the scientific literature. Finally, we built a Web server (http://mousiplier.greenelab.com/) for users to easily explore the learned latent variables. Taken together, this study defines mousiPLIER as a method to uncover meaningful biological processes in mouse brain transcriptomic studies.


Subject(s)
Brain , Animals , Mice , Brain/metabolism , Gene Expression Profiling , Aging/physiology , Unsupervised Machine Learning , Transcriptome , Astrocytes/metabolism , Microglia/metabolism , Machine Learning , Male , Mice, Inbred C57BL
4.
J Am Vet Med Assoc ; 262(5): 1-8, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38417257

ABSTRACT

OBJECTIVE: To compare pedigree documentation and genetic test results to evaluate whether user-provided photographs influence the breed ancestry predictions of direct-to-consumer (DTC) genetic tests for dogs. ANIMALS: 12 registered purebred pet dogs representing 12 different breeds. METHODS: Each dog owner submitted 6 buccal swabs, 1 to each of 6 DTC genetic testing companies. Experimenters registered each sample per manufacturer instructions. For half of the dogs, the registration included a photograph of the DNA donor. For the other half of the dogs, photographs were swapped between dogs. DNA analysis and breed ancestry prediction were conducted by each company. The effect of condition (ie, matching vs shuffled photograph) was evaluated for each company's breed predictions. As a positive control, a convolutional neural network was also used to predict breed based solely on the photograph. RESULTS: Results from 5 of the 6 tests always included the dog's registered breed. One test and the convolutional neural network were unlikely to identify the registered breed and frequently returned results that were more similar to the photograph than the DNA. Additionally, differences in the predictions made across all tests underscored the challenge of identifying breed ancestry, even in purebred dogs. CLINICAL RELEVANCE: Veterinarians are likely to encounter patients who have conducted DTC genetic testing and may be asked to explain the results of genetic tests they did not order. This systematic comparison of commercially available tests provides context for interpreting results from consumer-grade DTC genetic testing kits.

5.
Microbiol Spectr ; 12(4): e0315723, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38385740

ABSTRACT

Chronic Pseudomonas aeruginosa lung infections are a feature of cystic fibrosis (CF) that many patients experience even with the advent of highly effective modulator therapies. Identifying factors that impact P. aeruginosa in the CF lung could yield novel strategies to eradicate infection or otherwise improve outcomes. To complement published P. aeruginosa studies using laboratory models or RNA isolated from sputum, we analyzed transcripts of strain PAO1 after incubation in sputum from different CF donors prior to RNA extraction. We compared PAO1 gene expression in this "spike-in" sputum model to that for P. aeruginosa grown in synthetic cystic fibrosis sputum medium to determine key genes, which are among the most differentially expressed or most highly expressed. Using the key genes, gene sets with correlated expression were determined using the gene expression analysis tool eADAGE. Gene sets were used to analyze the activity of specific pathways in P. aeruginosa grown in sputum from different individuals. Gene sets that we found to be more active in sputum showed similar activation in published data that included P. aeruginosa RNA isolated from sputum relative to corresponding in vitro reference cultures. In the ex vivo samples, P. aeruginosa had increased levels of genes related to zinc and iron acquisition which were suppressed by metal amendment of sputum. We also found a significant correlation between expression of the H1-type VI secretion system and CFTR corrector use by the sputum donor. An ex vivo sputum model or synthetic sputum medium formulation that imposes metal restriction may enhance future CF-related studies. IMPORTANCE: Identifying the gene expression programs used by Pseudomonas aeruginosa to colonize the lungs of people with cystic fibrosis (CF) will illuminate new therapeutic strategies. To capture these transcriptional programs, we cultured the common P. aeruginosa laboratory strain PAO1 in expectorated sputum from CF patient donors. Through bioinformatic analysis, we defined sets of genes that are more transcriptionally active in real CF sputum compared to a synthetic cystic fibrosis sputum medium. Many of the most differentially active gene sets contained genes related to metal acquisition, suggesting that these gene sets play an active role in scavenging for metals in the CF lung environment which may be inadequately represented in some models. Future studies of P. aeruginosa transcript abundance in CF may benefit from the use of an expectorated sputum model or media supplemented with factors that induce metal restriction.


Subject(s)
Cystic Fibrosis , Pseudomonas Infections , Humans , Pseudomonas aeruginosa/metabolism , Sputum , Gene Expression Profiling , Metals , Culture Media/metabolism , RNA/metabolism
6.
Gigascience ; 13, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38323677

ABSTRACT

Important tasks in biomedical discovery such as predicting gene functions, gene-disease associations, and drug repurposing opportunities are often framed as network edge prediction. The number of edges connecting to a node, termed degree, can vary greatly across nodes in real biomedical networks, and the distribution of degrees varies between networks. If degree strongly influences edge prediction, then imbalance or bias in the distribution of degrees could lead to nonspecific or misleading predictions. We introduce a network permutation framework to quantify the effects of node degree on edge prediction. Our framework decomposes performance into the proportions attributable to degree and the network's specific connections using network permutation to generate features that depend only on degree. We discover that performance attributable to factors other than degree is often only a small portion of overall performance. Researchers seeking to predict new or missing edges in biological networks should use our permutation approach to obtain a baseline for performance that may be nonspecific because of degree. We released our methods as an open-source Python package (https://github.com/hetio/xswap/).
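The idea of permuting a network while holding node degree fixed can be sketched as follows. This is a minimal illustration of degree-preserving edge swapping on a simple undirected graph, written under the assumption that the input has no self-loops or duplicate edges. It is not the API of the package linked above, and all function names here are hypothetical.

```python
import random
from collections import Counter


def degrees(edges):
    """Count how many edges touch each node."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


def degree_preserving_permutation(edges, n_swaps, seed=0):
    """Shuffle an undirected network while keeping every node's degree fixed.

    Repeatedly picks two edges (a, b) and (c, d) and rewires them to
    (a, d) and (c, b), skipping any swap that would create a self-loop
    or duplicate an existing edge.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(edge)) for edge in edges]
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # consider both possible rewirings
            c, d = d, c
        if a == d or c == b:  # would create a self-loop
            continue
        new_1 = tuple(sorted((a, d)))
        new_2 = tuple(sorted((c, b)))
        if new_1 in edge_set or new_2 in edge_set or new_1 == new_2:
            continue  # would duplicate an edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new_1, new_2}
        edge_list[i], edge_list[j] = new_1, new_2
    return edge_list


# Toy network: node 0 is a hub, nodes 4 and 5 are peripheral.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5)]
permuted = degree_preserving_permutation(edges, n_swaps=100, seed=1)
```

Because the permuted network keeps each node's degree but scrambles which nodes are connected, any predictive performance that survives on features derived from it can only come from degree, which is the baseline the abstract recommends reporting.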


Subject(s)
Algorithms , Probability
7.
Bioinform Adv ; 4(1): vbae004, 2024.
Article in English | MEDLINE | ID: mdl-38282973

ABSTRACT

Motivation: Most models can be fit to data using various optimization approaches. While model choice is frequently reported in machine-learning-based research, optimizers are not often noted. We applied two implementations of LASSO logistic regression from Python's scikit-learn package, using two different optimization approaches (coordinate descent, implemented in the liblinear library, and stochastic gradient descent, or SGD), to predict mutation status and gene essentiality from gene expression across a variety of pan-cancer driver genes. For varying levels of regularization, we compared performance and model sparsity between optimizers. Results: After model selection and tuning, we found that liblinear and SGD tended to perform comparably. liblinear models required more extensive tuning of regularization strength, performing best for high model sparsities (fewer nonzero coefficients), but did not require selection of a learning rate parameter. SGD models required tuning of the learning rate to perform well, but generally performed more robustly across different model sparsities as regularization strength decreased. Given these tradeoffs, we believe that the choice of optimizers should be clearly reported as a part of the model selection and validation process, to allow readers and reviewers to better understand the context in which results have been generated. Availability and implementation: The code used to carry out the analyses in this study is available at https://github.com/greenelab/pancancer-evaluation/tree/master/01_stratified_classification. Performance/regularization strength curves for all genes in the Vogelstein et al. (2013) dataset are available at https://doi.org/10.6084/m9.figshare.22728644.

8.
Am J Hum Genet ; 111(1): 11-23, 2024 Jan 04.
Article in English | MEDLINE | ID: mdl-38181729

ABSTRACT

Precision medicine initiatives across the globe have led to a revolution of repositories linking large-scale genomic data with electronic health records, enabling genomic analyses across the entire phenome. Many of these initiatives focus solely on research insights, leading to limited direct benefit to patients. We describe the biobank at the Colorado Center for Personalized Medicine (CCPM Biobank) that was jointly developed by the University of Colorado Anschutz Medical Campus and UCHealth to serve as a unique, dual-purpose research and clinical resource accelerating personalized medicine. This living resource currently has more than 200,000 participants with ongoing recruitment. We highlight the clinical, laboratory, regulatory, and HIPAA-compliant informatics infrastructure along with our stakeholder engagement, consent, recontact, and participant engagement strategies. We characterize aspects of genetic and geographic diversity unique to the Rocky Mountain region, the primary catchment area for CCPM Biobank participants. We leverage linked health and demographic information of the CCPM Biobank participant population to demonstrate the utility of the CCPM Biobank to replicate complex trait associations in the first 33,674 genotyped individuals across multiple disease domains. Finally, we describe our current efforts toward return of clinical genetic test results, including high-impact pathogenic variants and pharmacogenetic information, and our broader goals as the CCPM Biobank continues to grow. Bringing clinical and research interests together fosters unique clinical and translational questions that can be addressed from the large EHR-linked CCPM Biobank resource within a HIPAA- and CLIA-certified environment.


Subject(s)
Learning Health System , Precision Medicine , Humans , Biological Specimen Banks , Colorado , Genomics
9.
bioRxiv ; 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-37503097

ABSTRACT

While single-cell experiments provide deep cellular resolution within a single sample, some single-cell experiments are inherently more challenging than bulk experiments due to dissociation difficulties, cost, or limited tissue availability. This creates a situation where we have deep cellular profiles of one sample or condition, and bulk profiles across multiple samples and conditions. To bridge this gap, we propose BuDDI (BUlk Deconvolution with Domain Invariance). BuDDI utilizes domain adaptation techniques to effectively integrate available corpora of case-control bulk and reference scRNA-seq observations to infer cell-type-specific perturbation effects. BuDDI achieves this by learning independent latent spaces within a single variational autoencoder (VAE) encompassing at least four sources of variability: 1) cell type proportion, 2) perturbation effect, 3) structured experimental variability, and 4) remaining variability. Since each latent space is encouraged to be independent, we simulate perturbation responses by independently composing each latent space to simulate cell-type-specific perturbation responses. We evaluated BuDDI's performance on simulated and real data with experimental designs of increasing complexity. We first validated that BuDDI could learn domain invariant latent spaces on data with matched samples across each source of variability. Then we validated that BuDDI could accurately predict cell-type-specific perturbation response when no single-cell perturbed profiles were used during training; instead, only bulk samples had both perturbed and non-perturbed observations. Finally, we validated BuDDI on predicting sex-specific differences, an experimental design where it is not possible to have matched samples. In each experiment, BuDDI outperformed all other comparative methods and baselines. 
As more reference atlases are completed, BuDDI provides a path to combine these resources with bulk-profiled treatment or disease signatures to study perturbations, sex differences, or other factors at single-cell resolution.

10.
bioRxiv ; 2023 Dec 02.
Article in English | MEDLINE | ID: mdl-37961178

ABSTRACT

Introduction: High-grade serous carcinoma (HGSC) gene expression subtypes are associated with differential survival. We characterized HGSC gene expression in Black individuals and considered whether gene expression differences by race may contribute to poorer HGSC survival among Black versus non-Hispanic White individuals. Methods: We included newly generated RNA-Seq data from Black and White individuals, and array-based genotyping data from four existing studies of White and Japanese individuals. We assigned subtypes using K-means clustering. Cluster- and dataset-specific gene expression patterns were summarized by moderated t-scores. We compared cluster-specific gene expression patterns across datasets by calculating the correlation between the summarized vectors of moderated t-scores. Following mapping to The Cancer Genome Atlas (TCGA)-derived HGSC subtypes, we used Cox proportional hazards models to estimate subtype-specific survival by dataset. Results: Cluster-specific gene expression was similar across gene expression platforms. Comparing the Black study population to the White and Japanese study populations, the immunoreactive subtype was more common (39% versus 23%-28%) and the differentiated subtype less common (7% versus 22%-31%). Patterns of subtype-specific survival were similar between the Black and White populations with RNA-Seq data; compared to mesenchymal cases, the risk of death was similar for proliferative and differentiated cases and suggestively lower for immunoreactive cases (Black population HR=0.79 [0.55, 1.13], White population HR=0.86 [0.62, 1.19]). Conclusions: A single, platform-agnostic pipeline can be used to assign HGSC gene expression subtypes. While the observed prevalence of HGSC subtypes varied by race, subtype-specific survival was similar.

11.
bioRxiv ; 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37873416

ABSTRACT

Understanding the factors that shape variation in the human microbiome is a major goal of research in biology. While other genomics fields have used large, pre-compiled compendia to extract systematic insights requiring otherwise impractical sample sizes, there has been no comparable resource for the 16S rRNA sequencing data commonly used to quantify microbiome composition. To help close this gap, we have assembled a set of 168,484 publicly available human gut microbiome samples, processed with a single pipeline and combined into the largest unified microbiome dataset to date. We use this resource, which is freely available at microbiomap.org, to shed light on global variation in the human gut microbiome. We find that Firmicutes, particularly Bacilli and Clostridia, are almost universally present in the human gut. At the same time, the relative abundances of the 65 most common microbial genera differ between at least two world regions. We also show that gut microbiomes in undersampled world regions, such as Central and Southern Asia, differ significantly from the more thoroughly characterized microbiomes of Europe and Northern America. Moreover, humans in these overlooked regions likely harbor hundreds of taxa that have not yet been discovered due to this undersampling, highlighting the need for diversity in microbiome studies. We anticipate that this new compendium can serve the community and enable advanced applied and methodological research.

12.
Genome Biol ; 24(1): 239, 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37864274

ABSTRACT

BACKGROUND: Single-cell gene expression profiling provides unique opportunities to understand tumor heterogeneity and the tumor microenvironment. Because of cost and feasibility, profiling bulk tumors remains the primary population-scale analytical strategy. Many algorithms can deconvolve these tumors using single-cell profiles to infer their composition. While experimental choices do not change the true underlying composition of the tumor, they can affect the measurements produced by the assay. RESULTS: We generated a dataset of high-grade serous ovarian tumors with paired expression profiles produced using multiple strategies to examine the extent to which experimental factors impact the results of downstream tumor deconvolution methods. We find that pooling samples for single-cell sequencing and subsequent demultiplexing has a minimal effect. We identify dissociation-induced differences that affect cell composition, leading to changes that may compromise the assumptions underlying some deconvolution algorithms. We also observe differences across mRNA enrichment methods that introduce additional discrepancies between the two data types. Finally, we find that experimental factors change cell composition estimates and that the impact differs by method. CONCLUSIONS: Previous benchmarks of deconvolution methods have largely ignored experimental factors. We find that methods vary in their robustness to experimental factors. We provide recommendations for methods developers seeking to produce the next generation of deconvolution approaches and for scientists designing experiments using deconvolution to study tumor heterogeneity.
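The basic idea behind deconvolution can be illustrated with a deliberately tiny sketch. It assumes only two cell types whose reference expression signatures are known, and models a bulk profile as a weighted mixture of the two. The function name, signatures, and numbers are hypothetical; the methods benchmarked in the study handle many cell types and are considerably more elaborate.

```python
def estimate_fraction(bulk, sig_1, sig_2):
    """Least-squares estimate of the fraction of cell type 1 in a bulk sample.

    Models bulk ~ p * sig_1 + (1 - p) * sig_2 over per-gene expression
    values and solves for p in closed form, clipped to [0, 1].
    Assumes the two signatures are not identical.
    """
    diff = [a - b for a, b in zip(sig_1, sig_2)]
    numerator = sum((x - b) * d for x, b, d in zip(bulk, sig_2, diff))
    denominator = sum(d * d for d in diff)
    p = numerator / denominator
    return min(1.0, max(0.0, p))


# Hypothetical reference signatures over four genes.
tumor_signature = [10.0, 2.0, 5.0, 1.0]
immune_signature = [1.0, 8.0, 5.0, 6.0]

# Simulated bulk sample: 70% tumor cells, 30% immune cells.
bulk_sample = [
    0.7 * t + 0.3 * i for t, i in zip(tumor_signature, immune_signature)
]
estimated = estimate_fraction(bulk_sample, tumor_signature, immune_signature)
```

In this idealized setting the estimate recovers the simulated fraction. The abstract's point is that real measurements deviate from such a model: dissociation and mRNA enrichment shift the single-cell signatures away from what the bulk assay sees, so the estimate can be biased even when the true composition is unchanged.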


Subject(s)
Gene Expression Profiling , Ovarian Neoplasms , Humans , Female , Gene Expression Profiling/methods , Algorithms , Sequence Analysis, RNA/methods , Ovarian Neoplasms/genetics , Transcriptome , Tumor Microenvironment
13.
Nat Commun ; 14(1): 5562, 2023 Sep 09.
Article in English | MEDLINE | ID: mdl-37689782

ABSTRACT

Genes act in concert with each other in specific contexts to perform their functions. Determining how these genes influence complex traits requires a mechanistic understanding of expression regulation across different conditions. It has been shown that this insight is critical for developing new therapies. Transcriptome-wide association studies have helped uncover the role of individual genes in disease-relevant mechanisms. However, modern models of the architecture of complex traits predict that gene-gene interactions play a crucial role in disease origin and progression. Here we introduce PhenoPLIER, a computational approach that maps gene-trait associations and pharmacological perturbation data into a common latent representation for a joint analysis. This representation is based on modules of genes with similar expression patterns across the same conditions. We observe that diseases are significantly associated with gene modules expressed in relevant cell types, and our approach is accurate in predicting known drug-disease pairs and inferring mechanisms of action. Furthermore, using a CRISPR screen to analyze lipid regulation, we find that functionally important players lack associations but are prioritized in trait-associated modules by PhenoPLIER. By incorporating groups of co-expressed genes, PhenoPLIER can contextualize genetic associations and reveal potential targets missed by single-gene strategies.


Subject(s)
Clustered Regularly Interspaced Short Palindromic Repeats , Epistasis, Genetic , Causality , Gene Regulatory Networks , Transcriptome
14.
bioRxiv ; 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-37662412

ABSTRACT

Chronic Pseudomonas aeruginosa lung infections are a distinctive feature of cystic fibrosis (CF) pathology and challenge adults with CF even with the advent of highly effective modulator therapies. Characterizing P. aeruginosa transcription in the CF lung and identifying factors that drive gene expression could yield novel strategies to eradicate infection or otherwise improve outcomes. To complement published P. aeruginosa gene expression studies in laboratory culture models designed to mimic the CF lung environment, we employed an ex vivo sputum model in which laboratory strain PAO1 was incubated in sputum from different CF donors. As part of the analysis, we compared PAO1 gene expression in this "spike-in" sputum model to that for P. aeruginosa grown in artificial sputum medium (ASM). Analyses focused on genes that were differentially expressed between sputum and ASM and genes that were most highly expressed in sputum. We present a new approach that used sets of genes with correlated expression, identified by the gene expression analysis tool eADAGE, to analyze the differential activity of pathways in P. aeruginosa grown in CF sputum from different individuals. A key characteristic of P. aeruginosa grown in expectorated CF sputum was related to zinc and iron acquisition, but this signal varied by donor sputum. In addition, a significant correlation between P. aeruginosa expression of the H1-type VI secretion system and corrector use by the sputum donor was observed. These methods may be broadly useful in looking for variable signals across clinical samples.

15.
bioRxiv ; 2023 Aug 15.
Article in English | MEDLINE | ID: mdl-37577575

ABSTRACT

High throughput gene expression profiling is a powerful approach to generate hypotheses on the underlying causes of biological function and disease. Yet this approach is limited in its ability to infer underlying biological pathways and by the burden of testing tens of thousands of individual genes. Machine learning models that incorporate prior biological knowledge are necessary to extract meaningful pathways and generate rational hypotheses from the vast amount of gene expression data generated to date. We adopted an unsupervised machine learning method, Pathway-level information extractor (PLIER), to train the first mouse PLIER model on 190,111 mouse brain RNA-sequencing samples, the greatest amount of training data ever used by PLIER. mousiPLIER converted gene expression data into latent variables that align to known pathway or cell marker gene sets, substantially reducing data dimensionality and improving interpretability. To determine the utility of mousiPLIER, we applied it to a mouse brain aging study of microglia and astrocyte transcriptomic profiling. We found a specific set of latent variables that are significantly associated with aging, including one latent variable (LV41) corresponding to striatal signal. We next performed k-means clustering on the training data to identify studies that respond strongly to LV41, finding that the variable is relevant to striatum and aging across the scientific literature. Finally, we built a web server (http://mousiplier.greenelab.com/) for users to easily explore the learned latent variables. Taken together, this study provides proof of concept that mousiPLIER can uncover meaningful biological processes in mouse transcriptomic studies.

17.
BioData Min ; 16(1): 16, 2023 May 05.
Article in English | MEDLINE | ID: mdl-37147665

ABSTRACT

While we often think of words as having a fixed meaning that we use to describe a changing world, words are also dynamic and changing. Scientific research can also be remarkably fast-moving, with new concepts or approaches rapidly gaining mind share. We examined scientific writing, both preprint and pre-publication peer-reviewed text, to identify terms that have changed and examine their use. One particular challenge that we faced was that the shift from closed to open access publishing meant that the size of available corpora changed by over an order of magnitude in the last two decades. We developed an approach to evaluate semantic shift by accounting for both intra- and inter-year variability using multiple integrated models. This analysis revealed thousands of change points in both corpora, including for terms such as 'cas9', 'pandemic', and 'sars'. We found that the consistent change-points between pre-publication peer-reviewed and preprinted text are largely related to the COVID-19 pandemic. We also created a web app for exploration that allows users to investigate individual terms (https://greenelab.github.io/word-lapse/). To our knowledge, our research is the first to examine semantic shift in biomedical preprints and pre-publication peer-reviewed text, and provides a foundation for future work to understand how terms acquire new meanings and how peer review affects this process.

18.
mSystems ; 8(2): e0092822, 2023 Apr 27.
Article in English | MEDLINE | ID: mdl-36861992

ABSTRACT

In the 21st century, several emergent viruses have posed a global threat. Each pathogen has emphasized the value of rapid and scalable vaccine development programs. The ongoing severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has made the importance of such efforts especially clear. Recent biotechnological advances in vaccinology have produced vaccines that provide only the nucleic acid building blocks of an antigen, eliminating many safety concerns. During the COVID-19 pandemic, these DNA and RNA vaccines have facilitated the development and deployment of vaccines at an unprecedented pace. This success was attributable at least in part to broader shifts in scientific research relative to prior epidemics: the genome of SARS-CoV-2 was available as early as January 2020, facilitating global efforts in the development of DNA and RNA vaccines within 2 weeks of the international community becoming aware of the new viral threat. Additionally, these technologies, which were previously only theoretical, have proven not only safe but also highly efficacious. Although vaccine development has historically been a slow process, the rapid development of vaccines during the COVID-19 crisis reveals a major shift in vaccine technologies. Here, we provide historical context for the emergence of these paradigm-shifting vaccines. We describe several DNA and RNA vaccines in terms of their efficacy, safety, and approval status. We also discuss patterns in worldwide distribution. The advances made since early 2020 provide an exceptional illustration of how rapidly vaccine development technology has advanced in the last 2 decades in particular and suggest a new era in vaccines against emerging pathogens. IMPORTANCE: The SARS-CoV-2 pandemic has caused untold damage globally, presenting unusual demands on but also unique opportunities for vaccine development. The development, production, and distribution of vaccines are imperative to saving lives, preventing severe illness, and reducing the economic and social burdens caused by the COVID-19 pandemic. Although vaccine technologies that provide the DNA or RNA sequence of an antigen had never previously been approved for use in humans, they have played a major role in the management of SARS-CoV-2. In this review, we discuss the history of these vaccines and how they have been applied to SARS-CoV-2. Additionally, given that the evolution of new SARS-CoV-2 variants continues to present a significant challenge in 2022, these vaccines remain an important and evolving tool in the biomedical response to the pandemic.


Subject(s)
COVID-19 , Viral Vaccines , Humans , COVID-19/epidemiology , SARS-CoV-2/genetics , COVID-19 Vaccines , Nucleic Acid-Based Vaccines , Pandemics/prevention & control , mRNA Vaccines
19.
mSystems ; 8(2): e0092722, 2023 04 27.
Article in English | MEDLINE | ID: mdl-36861991

ABSTRACT

Over the past 150 years, vaccines have revolutionized the relationship between people and disease. During the COVID-19 pandemic, technologies such as mRNA vaccines have received attention due to their novelty and successes. However, more traditional vaccine development platforms have also yielded important tools in the worldwide fight against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). A variety of approaches have been used to develop COVID-19 vaccines that are now authorized for use in countries around the world. In this review, we highlight strategies that focus on the viral capsid and outwards, rather than on the nucleic acids inside. These approaches fall into two broad categories: whole-virus vaccines and subunit vaccines. Whole-virus vaccines use the virus itself, in either an inactivated or an attenuated state. Subunit vaccines contain instead an isolated, immunogenic component of the virus. Here, we highlight vaccine candidates that apply these approaches against SARS-CoV-2 in different ways. In a companion article (H. M. Rando, R. Lordan, L. Kolla, E. Sell, et al., mSystems 8:e00928-22, 2023, https://doi.org/10.1128/mSystems.00928-22), we review the more recent and novel development of nucleic acid-based vaccine technologies. We further consider the role that these COVID-19 vaccine development programs have played in prophylaxis at the global scale. Well-established vaccine technologies have proved especially important to making vaccines accessible in low- and middle-income countries. Vaccine development programs that use established platforms have been undertaken in a much wider range of countries than those using nucleic acid-based technologies, which have been led by wealthy Western countries. Therefore, these vaccine platforms, though less novel from a biotechnological standpoint, have proven to be extremely important to the management of SARS-CoV-2. 
IMPORTANCE The development, production, and distribution of vaccines are imperative to saving lives, preventing illness, and reducing the economic and social burdens caused by the COVID-19 pandemic. Vaccines that use cutting-edge biotechnology have played an important role in mitigating the effects of SARS-CoV-2. However, more traditional methods of vaccine development that were refined throughout the 20th century have been especially critical to increasing vaccine access worldwide. Effective deployment is necessary to reduce the susceptibility of the world's population, which is especially important in light of emerging variants. In this review, we discuss the safety, immunogenicity, and distribution of vaccines developed using established technologies. In a separate review, we describe the vaccines developed using nucleic acid-based vaccine platforms. From the current literature, it is clear that the well-established vaccine technologies are also highly effective against SARS-CoV-2 and are being used to address the challenges of COVID-19 globally, including in low- and middle-income countries. This worldwide approach is critical for reducing the devastating impact of SARS-CoV-2.


Subject(s)
COVID-19 , Viral Vaccines , Humans , SARS-CoV-2 , COVID-19/prevention & control , COVID-19 Vaccines , Pandemics/prevention & control , Vaccine Development , Vaccines, Subunit , Nucleic Acid-Based Vaccines
20.
PLoS Comput Biol ; 19(3): e1010984, 2023 03.
Article in English | MEDLINE | ID: mdl-36972227

ABSTRACT

Those building predictive models from transcriptomic data are faced with two conflicting perspectives. The first, based on the inherent high dimensionality of biological systems, supposes that complex non-linear models such as neural networks will better match complex biological systems. The second, imagining that complex systems will still be well predicted by simple dividing lines, prefers linear models that are easier to interpret. We compare multi-layer neural networks and logistic regression across multiple prediction tasks on GTEx and Recount3 datasets and find evidence in favor of both possibilities. We verified the presence of non-linear signal when predicting tissue and metadata sex labels from expression data by removing the predictive linear signal with Limma, and showed the removal ablated the performance of linear methods but not non-linear ones. However, we also found that the presence of non-linear signal was not necessarily sufficient for neural networks to outperform logistic regression. Our results demonstrate that while multi-layer neural networks may be useful for making predictions from gene expression data, including a linear baseline model is critical because while biological systems are high-dimensional, effective dividing lines for predictive models may not be.


Subject(s)
Gene Expression , Nonlinear Dynamics , Gene Expression Profiling , Neural Networks, Computer , Linear Models
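
Illustrative sketch (not from the article): the abstract's argument for always including a linear baseline can be seen on toy data. The example below is a minimal sketch under stated assumptions: the two-feature synthetic datasets, the function names, and the hyperparameters are all hypothetical, and a k-nearest-neighbours classifier stands in for the non-linear model in place of the multi-layer neural networks used in the study. It does not use GTEx, Recount3, or Limma.

```python
import math
import random


def make_data(n, kind, rng):
    """Toy 2-feature samples (assumed setup, not real expression data).

    'linear': label is x0 + x1 > 0 (a single dividing line suffices).
    'xor':    label is x0 * x1 > 0 (no single dividing line works well).
    """
    data = []
    for _ in range(n):
        x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        label = int(x0 + x1 > 0) if kind == "linear" else int(x0 * x1 > 0)
        data.append(((x0, x1), label))
    return data


def fit_logistic(train, epochs=200, lr=0.5):
    """Linear baseline: logistic regression fit by full-batch gradient descent."""
    w0 = w1 = b = 0.0
    for _ in range(epochs):
        g0 = g1 = gb = 0.0
        for (x0, x1), y in train:
            # Prediction error (sigmoid output minus label) for this sample.
            err = 1.0 / (1.0 + math.exp(-(w0 * x0 + w1 * x1 + b))) - y
            g0 += err * x0
            g1 += err * x1
            gb += err
        w0 -= lr * g0 / len(train)
        w1 -= lr * g1 / len(train)
        b -= lr * gb / len(train)
    return lambda x: int(w0 * x[0] + w1 * x[1] + b > 0)


def fit_knn(train, k=5):
    """Non-linear stand-in: majority vote among the k nearest training points."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda row: (row[0][0] - x[0]) ** 2 + (row[0][1] - x[1]) ** 2,
        )[:k]
        return int(sum(label for _, label in nearest) * 2 > k)
    return predict


def accuracy(predict, test):
    """Fraction of held-out samples classified correctly."""
    return sum(predict(x) == y for x, y in test) / len(test)


def compare(seed=0):
    """Score both models on both toy tasks with a fixed random seed."""
    rng = random.Random(seed)
    results = {}
    for kind in ("linear", "xor"):
        train, test = make_data(400, kind, rng), make_data(200, kind, rng)
        results[kind] = {
            "logistic": accuracy(fit_logistic(train), test),
            "knn": accuracy(fit_knn(train), test),
        }
    return results


results = compare()
for kind, scores in results.items():
    print(kind, scores)
```

On the 'linear' task both models should score well, so the simpler model is adequate; on the 'xor' task only the non-linear model should do much better than chance. Reporting both scores side by side is what shows whether the extra model complexity is buying anything.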