Results 1 - 14 of 14
1.
bioRxiv ; 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38464212

ABSTRACT

Every protein progresses through a natural lifecycle from birth to maturation to death; this process is coordinated by the protein homeostasis system. Environmental or physiological conditions trigger pathways that maintain the homeostasis of the proteome. An open question is how these pathways are modulated to respond to the many stresses that an organism encounters during its lifetime. To address this question, we tested how the fitness landscape changes in response to environmental and genetic perturbations using directed and massively parallel transposon mutagenesis in Caulobacter crescentus. We developed a general computational pipeline for the analysis of gene-by-environment interactions in transposon mutagenesis experiments. This pipeline uses a combination of general linear models (GLMs), statistical knockoffs, and a nonparametric Bayesian statistical model to identify essential genetic network components that are shared across environmental perturbations. This analysis allows us to quantify the similarity of proteotoxic environmental perturbations from the perspective of the fitness landscape. We find that essential genes vary more by genetic background than by environmental conditions, with limited overlap among mutant strains targeting different facets of the protein homeostasis system. We also identified 146 unique fitness determinants across different strains, with 19 genes common to at least two strains, showing varying resilience to proteotoxic stresses. Experiments exposing cells to a combination of genetic perturbations and dual environmental stressors show that perturbations that are quantitatively dissimilar from the perspective of the fitness landscape are likely to have a synergistic effect on the growth defect.

2.
HGG Adv ; 5(1): 100252, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-37859345

ABSTRACT

Previous genome-wide association studies (GWASs) for adiponectin, a complex trait linked to type 2 diabetes and obesity, identified >20 associated loci. However, most loci were identified in populations of European ancestry, and many of the target genes underlying the associations remain unknown. We conducted a cross-ancestry adiponectin GWAS meta-analysis in ≤46,434 individuals from the Metabolic Syndrome in Men (METSIM) cohort and the ADIPOGen and AGEN consortiums. We combined study-specific association summary statistics using a fixed-effects, inverse variance-weighted approach. We identified 22 loci associated with adiponectin (p < 5×10⁻⁸), including 15 known and seven previously unreported loci. Among individuals of European ancestry, Genome-wide Complex Traits Analysis joint conditional analysis (GCTA-COJO) identified 14 additional distinct signals at the ADIPOQ, CDH13, HCAR1, and ZNF664 loci. Leveraging the cross-ancestry data, FINEMAP + SuSiE identified 45 causal variants (PP > 0.9), which also exhibited potential pleiotropy for cardiometabolic traits. To prioritize target genes at associated loci, we propose a combinatorial likelihood scoring formalism (Gene Priority Score [GPScore]) based on measures derived from 11 gene prioritization strategies and the physical distance to the transcription start site. With GPScore, we prioritize the 30 most probable target genes underlying the adiponectin-associated variants in the cross-ancestry analysis, including well-known causal genes (e.g., ADIPOQ, CDH13) and additional genes (e.g., CSF1, RGS17). Functional association networks revealed complex interactions of prioritized genes, their functionally connected genes, and their underlying pathways centered around insulin and adiponectin signaling, indicating an essential role in regulating energy balance in the body, inflammation, coagulation, fibrinolysis, insulin resistance, and diabetes.
Overall, our analyses identify and characterize adiponectin association signals and inform experimental interrogation of target genes for adiponectin.
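
For readers unfamiliar with the fixed-effects, inverse variance-weighted approach named in this abstract, the following is a minimal sketch of the standard textbook calculation for a single variant. It is not the authors' software: the function name `fixed_effects_ivw` and the input numbers are hypothetical and chosen only for illustration.

```python
import math


def fixed_effects_ivw(betas, ses):
    """Combine per-study effect estimates for one variant.

    betas: per-study effect sizes (hypothetical inputs).
    ses:   per-study standard errors, same order as betas.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal in length")
    # Each study is weighted by the inverse of its variance (1 / se^2).
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Illustrative numbers only, not taken from the study.
beta, se, z, p = fixed_effects_ivw([0.10, 0.20], [0.10, 0.10])
print(beta, se, z, p)
```

In a genome-wide setting the same calculation would be repeated per variant, and the resulting p-value compared with the genome-wide significance threshold quoted in the abstract.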


Subject(s)
Diabetes Mellitus, Type 2 , Metabolic Syndrome , Male , Humans , Adiponectin/genetics , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study , Genetic Predisposition to Disease/genetics , Metabolic Syndrome/genetics
3.
PLoS Comput Biol ; 18(3): e1009273, 2022 03.
Article in English | MEDLINE | ID: mdl-35255084

ABSTRACT

The understanding of bacterial gene function has been greatly enhanced by recent advancements in the deep sequencing of microbial genomes. Transposon insertion sequencing methods combine next-generation sequencing techniques with transposon mutagenesis to explore the essentiality of genes under different environmental conditions. We propose a model-based method that uses regularized negative binomial regression to estimate the change in transposon insertions attributable to gene-environment changes in this genetic interaction study without transformations or uniform normalization. An empirical Bayes model for estimating the local false discovery rate combines unique and total count information to test for genes that show a statistically significant change in transposon counts. When applied to RB-TnSeq (randomized barcode transposon sequencing) and Tn-seq (transposon sequencing) libraries made in strains of Caulobacter crescentus, using both total and unique count data, the model was able to identify a set of conditionally beneficial or conditionally detrimental genes for each target condition that shed light on their functions and roles during various stress conditions.


Subject(s)
DNA Transposable Elements , Genes, Essential , Bayes Theorem , DNA Transposable Elements/genetics , Genes, Essential/genetics , High-Throughput Nucleotide Sequencing/methods , Mutagenesis, Insertional
4.
J Pediatr Gastroenterol Nutr ; 74(5): e109-e114, 2022 05 01.
Article in English | MEDLINE | ID: mdl-35149653

ABSTRACT

OBJECTIVES: There is limited knowledge about the role of the esophageal microbiome in pediatric esophageal eosinophilia (EE). We aimed to characterize the esophageal microbiome in pediatric patients with and without EE. METHODS: In the present prospective study, esophageal mucosal biopsies were obtained from 41 children. Of these, 22 had normal esophageal mucosal biopsies ("healthy"), 6 had reflux esophagitis (RE), 4 had proton pump inhibitor (PPi)-responsive esophageal eosinophilia (PPi-REE), and 9 had eosinophilic esophagitis (EoE). The microbiome composition was analyzed using 16S rRNA gene sequencing. The median age (range) in years for the healthy, RE, PPi-REE, and EoE groups was 10 (1.5-18), 6 (2-15), 6.5 (5-15), and 9 (1.5-17), respectively. RESULTS: The bacterial phyla Actinobacteria, Bacteroidetes, Firmicutes, Fusobacteria, and Proteobacteria were the most predominant. The classes Epsilonproteobacteria, Betaproteobacteria, Flavobacteria, Fusobacteria, and Sphingobacteria were underrepresented across groups. Vibrionales was predominant in the healthy and EoE groups but lower in the RE and PPi-REE groups. The genera Streptococcus, Rahnella, and Leptotrichia explained 29.65% of the variation in the data, and the genera Microbacterium, Prevotella, and Vibrio explained an additional 10.86%. The healthy group had a higher diversity and richness index compared to other groups, but this difference was not statistically significant. CONCLUSIONS: The pediatric esophagus has an abundant and diverse microbiome, both in the healthy and diseased states. The healthy group had a higher, but not significantly different, diversity and richness index compared to other groups.


Subject(s)
Eosinophilic Esophagitis , Esophagitis, Peptic , Microbiota , Child , Enteritis , Eosinophilia , Eosinophilic Esophagitis/pathology , Gastritis , Humans , Prospective Studies , Proton Pump Inhibitors/therapeutic use , RNA, Ribosomal, 16S/genetics
5.
Ann Appl Stat ; 15(2): 925-951, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34262633

ABSTRACT

There are distinguishing features or "hallmarks" of cancer that are found across tumors, individuals, and types of cancer, and these hallmarks can be driven by specific genetic mutations. Yet, within a single tumor there is often extensive genetic heterogeneity as evidenced by single-cell and bulk DNA sequencing data. The goal of this work is to jointly infer the underlying genotypes of tumor subpopulations and the distribution of those subpopulations in individual tumors by integrating single-cell and bulk sequencing data. Understanding the genetic composition of the tumor at the time of treatment is important in the personalized design of targeted therapeutic combinations and monitoring for possible recurrence after treatment. We propose a hierarchical Dirichlet process mixture model that incorporates the correlation structure induced by a structured sampling arrangement and we show that this model improves the quality of inference. We develop a representation of the hierarchical Dirichlet process prior as a Gamma-Poisson hierarchy and we use this representation to derive a fast Gibbs sampling inference algorithm using the augment-and-marginalize method. Experiments with simulation data show that our model outperforms standard numerical and statistical methods for decomposing admixed count data. Analysis of a real acute lymphoblastic leukemia sequencing dataset shows that our model improves upon state-of-the-art bioinformatic methods. An interpretation of the results of our model on this real dataset reveals co-mutated loci across samples.

6.
BMC Bioinformatics ; 21(1): 215, 2020 May 26.
Article in English | MEDLINE | ID: mdl-32456609

ABSTRACT

BACKGROUND: Recently, it has become possible to collect next-generation DNA sequencing data sets that are composed of multiple samples from multiple biological units where each of these samples may be from a single cell or bulk tissue. However, no tool yet exists for simulating DNA sequencing data from such a nested sampling arrangement with single-cell and bulk samples so that developers of analysis methods can assess accuracy and precision. RESULTS: We have developed a tool that simulates DNA sequencing data from hierarchically grouped (correlated) samples where each sample is designated bulk or single-cell. Our tool uses a simple configuration file to define the experimental arrangement and can be integrated into software pipelines for testing of variant callers or other genomic tools. CONCLUSIONS: The DNA sequencing data generated by our simulator is representative of real data and integrates seamlessly with standard downstream analysis tools.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Single-Cell Analysis/methods , Software , Humans
7.
BMC Med Genomics ; 12(1): 92, 2019 07 01.
Article in English | MEDLINE | ID: mdl-31262303

ABSTRACT

BACKGROUND: Patient-derived xenograft (PDX) models are in vivo models of human cancer that have been used for translational cancer research and therapy selection for individual patients. The Jackson Laboratory (JAX) PDX resource comprises 455 models originating from 34 different primary sites (as of 05/08/2019). The models undergo rigorous quality control and are genomically characterized to identify somatic mutations, copy number alterations, and transcriptional profiles. Bioinformatics workflows for analyzing genomic data obtained from human tumors engrafted in a mouse host (i.e., Patient-Derived Xenografts; PDXs) must address challenges such as discriminating between mouse and human sequence reads and accurately identifying somatic mutations and copy number alterations when paired non-tumor DNA from the patient is not available for comparison. RESULTS: We report here data analysis workflows and guidelines that address these challenges and achieve reliable identification of somatic mutations, copy number alterations, and transcriptomic profiles of tumors from PDX models that lack genomic data from paired non-tumor tissue for comparison. Our workflows incorporate commonly used software and public databases but are tailored to address the specific challenges of PDX genomics data analysis through parameter tuning and customized data filters and result in improved accuracy for the detection of somatic alterations in PDX models. We also report a gene expression-based classifier that can identify EBV-transformed tumors. We validated our analytical approaches using data simulations and demonstrated the overall concordance of the genomic properties of xenograft tumors with data from primary human tumors in The Cancer Genome Atlas (TCGA). 
CONCLUSIONS: The analysis workflows that we have developed to accurately predict somatic profiles of tumors from PDX models that lack normal tissue for comparison enable the identification of the key oncogenic genomic and expression signatures to support model selection and/or biomarker development in therapeutic studies. A reference implementation of our analysis recommendations is available at https://github.com/TheJacksonLaboratory/PDX-Analysis-Workflows .


Subject(s)
Cell Transformation, Neoplastic , Genomics/methods , Neoplasms/genetics , Neoplasms/pathology , Workflow , Animals , DNA Copy Number Variations , Gene Expression Profiling , Humans , Lymphoma/genetics , Lymphoma/pathology , Mice , Point Mutation , Polymorphism, Single Nucleotide
8.
G3 (Bethesda) ; 9(6): 1795-1805, 2019 06 05.
Article in English | MEDLINE | ID: mdl-30996023

ABSTRACT

Isogenic laboratory mouse strains enhance reproducibility because individual animals are genetically identical. For the most widely used isogenic strain, C57BL/6, there exists a wealth of genetic, phenotypic, and genomic data, including a high-quality reference genome (GRCm38.p6). Now 20 years after the first release of the mouse reference genome, C57BL/6J mice are at least 26 inbreeding generations removed from GRCm38 and the strain is now maintained with periodic reintroduction of cryorecovered mice derived from a single breeder pair, aptly named Adam and Eve. To provide an update to the mouse reference genome that more accurately represents the genome of today's C57BL/6J mice, we took advantage of long read, short read, and optical mapping technologies to generate a de novo assembly of the C57BL/6J Eve genome (B6Eve). Using these data, we have addressed recurring variants observed in previous mouse genomic studies. We have also identified structural variations, closed gaps in the mouse reference assembly, and revealed previously unannotated coding sequences. This B6Eve assembly explains discrepant observations that have been associated with GRCm38-based analyses, and will inform a reference genome that is more representative of the C57BL/6J mice that are in use today.


Subject(s)
Genome , Genomics , Animals , Computational Biology/methods , Female , Genomics/methods , Inbreeding , Male , Mice , Mice, Inbred C57BL , Pedigree , Phenotype , Polymorphism, Single Nucleotide
9.
DNA Res ; 26(1): 37-44, 2019 Feb 01.
Article in English | MEDLINE | ID: mdl-30395234

ABSTRACT

The prevalence of chronic kidney disease (CKD) is rising worldwide and 10-15% of the global population currently suffers from CKD and its complications. Given the increasing prevalence of CKD, there is an urgent need to find novel treatment options. The American black bear (Ursus americanus) copes with months of lowered kidney function and metabolism during hibernation without the devastating effects on metabolism and other consequences observed in humans. In a biomimetic approach to better understand kidney adaptations and physiology in hibernating black bears, we established a high-quality genome assembly. Subsequent RNA-Seq analysis of kidneys comparing gene expression profiles in black bears entering (late fall) and emerging (early spring) from hibernation identified 169 protein-coding genes that were differentially expressed. Of these, 101 genes were downregulated and 68 genes were upregulated after hibernation. Fold changes ranged from 1.8-fold downregulation (RTN4RL2) to 2.4-fold upregulation (CISH). Most notable was the upregulation of cytokine suppression genes (SOCS2, CISH, and SERPINC1) and the lack of increased expression of cytokines and genes involved in inflammation. The identification of these differences in gene expression in the black bear kidney may provide new insights into the prevention and treatment of CKD.


Subject(s)
Gene Expression Regulation , Genome , Hibernation/genetics , Ursidae/genetics , Animals , Female , Gene Expression Profiling , Male , Nogo Receptor 2/genetics , Seasons , Sequence Analysis, DNA , Sequence Analysis, RNA , Suppressor of Cytokine Signaling Proteins/genetics , Ursidae/physiology
10.
J Immunol ; 201(7): 1907-1917, 2018 10 01.
Article in English | MEDLINE | ID: mdl-30127089

ABSTRACT

In both NOD mice and humans, the development of type 1 diabetes (T1D) is dependent in part on autoreactive CD8+ T cells recognizing pancreatic β cell peptides presented by often quite common MHC class I variants. Studies in NOD mice previously revealed that the common H2-Kd and/or H2-Db class I molecules expressed by this strain aberrantly lose the ability to mediate the thymic deletion of pathogenic CD8+ T cell responses through interactions with T1D susceptibility genes outside the MHC. A gene(s) mapping to proximal chromosome 7 was previously shown to be an important contributor to the failure of the common class I molecules expressed by NOD mice to mediate the normal thymic negative selection of diabetogenic CD8+ T cells. Using an inducible model of thymic negative selection and mRNA transcript analyses, we initially identified an elevated Nfkbid expression variant as a likely NOD-proximal chromosome 7 region gene contributing to impaired thymic deletion of diabetogenic CD8+ T cells. CRISPR/Cas9-mediated genetic attenuation of Nfkbid expression in NOD mice resulted in improved negative selection of autoreactive diabetogenic AI4 and NY8.3 CD8+ T cells. These results indicated that allelic variants of Nfkbid contribute to the efficiency of intrathymic deletion of diabetogenic CD8+ T cells. However, although enhancing thymic deletion of pathogenic CD8+ T cells, ablating Nfkbid expression surprisingly accelerated T1D onset that was associated with numeric decreases in both regulatory T and B lymphocytes in NOD mice.


Subject(s)
CD8-Positive T-Lymphocytes/immunology , Chromosomes, Human, Pair 7/genetics , Diabetes Mellitus, Type 1/immunology , I-kappa B Proteins/genetics , Thymus Gland/immunology , Alleles , Animals , Autoantigens/immunology , Cell Differentiation , Cells, Cultured , Clonal Deletion , Disease Models, Animal , Disease Susceptibility , Humans , I-kappa B Proteins/metabolism , Mice , Mice, Inbred NOD , Polymorphism, Genetic
11.
Genetics ; 206(2): 537-556, 2017 06.
Article in English | MEDLINE | ID: mdl-28592495

ABSTRACT

The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.


Subject(s)
Genetic Drift , Genome/genetics , Mice, Inbred Strains/genetics , Quantitative Trait Loci/genetics , Animals , Chromosome Mapping , Crosses, Genetic , Genotype , Haplotypes , Male , Mice , Mutation , Polymorphism, Single Nucleotide
13.
J Appl Lab Med ; 2(2): 138-149, 2017 Sep 01.
Article in English | MEDLINE | ID: mdl-32630970

ABSTRACT

BACKGROUND: Next-generation sequencing (NGS) assays are highly complex tests that can vary substantially in both their design and intended application. Despite their numerous advantages, NGS assays present some unique challenges associated with the preanalytical process, library preparation, data analysis, and reporting. According to a number of professional laboratory organizations, control materials should be included both during the analytical validation phase and in routine clinical use to guarantee highly accurate results. The Seraseq™ Solid Tumor Mutation Mix AF10 and AF20 control materials consist of 26 biosynthetic DNA constructs in a genomic DNA background, each containing a specific variant or mutation of interest and an internal quality marker at 2 distinct allelic frequencies of 10% and 20%, respectively. The goal of this interlaboratory study was to evaluate the Seraseq AF10 and AF20 control materials by verifying their performance as control materials and by evaluating their ability to measure quality metrics essential to a clinical test. METHODS: Performance characteristics were assessed within and between 6 CLIA-accredited laboratories and 1 research laboratory. RESULTS: Most laboratories detected all 26 mutations of interest; however, some discrepancies involving the internal quality markers were observed. CONCLUSION: This interlaboratory study showed that the Seraseq AF10 and AF20 control materials have high quality, stability, and genomic complexity in variant types that are well suited for assisting in NGS assay analytical validation and monitoring routine clinical applications.

14.
Genome Announc ; 2(1)2014 Feb 13.
Article in English | MEDLINE | ID: mdl-24526640

ABSTRACT

We report the complete genome sequence of the Sungri/96 vaccine strain of peste des petits ruminants virus (PPRV). The whole-genome nucleotide sequence has 89 to 99% identity with the available PPRV genome sequences in the NCBI database. This study helps to understand the epidemiological and molecular characteristics of the Sungri/96 strain.
