Results 1 - 8 of 8
1.
Nature ; 617(7960): 312-324, 2023 05.
Article in English | MEDLINE | ID: mdl-37165242

ABSTRACT

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.


Subject(s)
Genome, Human , Genomics , Humans , Diploidy , Genome, Human/genetics , Haplotypes/genetics , Sequence Analysis, DNA , Genomics/standards , Reference Standards , Cohort Studies , Alleles , Genetic Variation
2.
Nat Biotechnol ; 41(2): 232-238, 2023 02.
Article in English | MEDLINE | ID: mdl-36050551

ABSTRACT

Circular consensus sequencing with Pacific Biosciences (PacBio) technology generates long (10-25 kilobases), accurate 'HiFi' reads by combining serial observations of a DNA molecule into a consensus sequence. The standard approach to consensus generation, pbccs, uses a hidden Markov model. We introduce DeepConsensus, which uses an alignment-based loss to train a gap-aware transformer-encoder for sequence correction. Compared to pbccs, DeepConsensus reduces read errors by 42%. This increases the yield of PacBio HiFi reads at Q20 by 9%, at Q30 by 27% and at Q40 by 90%. With two SMRT Cells of HG003, reads from DeepConsensus improve hifiasm assembly contiguity (NG50 4.9 megabases (Mb) to 17.2 Mb), increase gene completeness (94% to 97%), reduce the false gene duplication rate (1.1% to 0.5%), improve assembly base accuracy (Q43 to Q45) and reduce variant-calling errors by 24%. DeepConsensus models could be trained for the general problem of analyzing the alignment of other types of sequences, such as unique molecular identifiers or genome assemblies.


Subject(s)
High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA
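
The Q20, Q30, Q40, Q43 and Q45 figures in the abstract above are Phred-scaled quality values, where Q = -10 * log10(P) and P is the probability of an error. The following is a minimal sketch of that conversion, not code from DeepConsensus or pbccs; the function names are illustrative.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert an error probability P (0 < P <= 1) back to a Phred score."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Quality thresholds mentioned in the abstract.
    for q in (20, 30, 40, 43, 45):
        p = phred_to_error_probability(q)
        print(f"Q{q}: about 1 error in {round(1 / p):,} bases")
```

Under this scale, each 10-point increase in Q corresponds to a tenfold drop in the expected error rate, which is why yield gains at Q40 are harder to achieve than at Q20.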
3.
Bioinformatics ; 37(18): 3067-3069, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33704425

ABSTRACT

SUMMARY: Designing interventions to control gene regulation necessitates modeling a gene regulatory network by a causal graph. Currently, large-scale gene expression datasets from different conditions, cell types, disease states, and developmental time points are being collected. However, application of classical causal inference algorithms to infer gene regulatory networks based on such data is still challenging, requiring high sample sizes and computational resources. Here, we describe an algorithm that efficiently learns the differences in gene regulatory mechanisms between different conditions. Our difference causal inference (DCI) algorithm infers changes (i.e. edges that appeared, disappeared, or changed weight) between two causal graphs given gene expression data from the two conditions. This algorithm is efficient in its use of samples and computation since it infers the differences between causal graphs directly without estimating each possibly large causal graph separately. We provide a user-friendly Python implementation of DCI and also enable the user to learn the most robust difference causal graph across different tuning parameters via stability selection. Finally, we show how to apply DCI to single-cell RNA-seq data from different conditions and cell states, and we also validate our algorithm by predicting the effects of interventions. AVAILABILITY AND IMPLEMENTATION: Python package freely available at http://uhlerlab.github.io/causaldag/dci. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Gene Regulatory Networks , Gene Expression Regulation
4.
Nat Commun ; 12(1): 1024, 2021 02 15.
Article in English | MEDLINE | ID: mdl-33589624

ABSTRACT

Given the severity of the SARS-CoV-2 pandemic, a major challenge is to rapidly repurpose existing approved drugs for clinical interventions. While a number of data-driven and experimental approaches have been suggested in the context of drug repurposing, a platform that systematically integrates available transcriptomic, proteomic and structural data is missing. More importantly, given that SARS-CoV-2 pathogenicity is highly age-dependent, it is critical to integrate aging signatures into drug discovery platforms. We here take advantage of large-scale transcriptional drug screens combined with RNA-seq data of the lung epithelium with SARS-CoV-2 infection as well as the aging lung. To identify robust druggable protein targets, we propose a principled causal framework that makes use of multiple data modalities. Our analysis highlights the importance of serine/threonine and tyrosine kinases as potential targets that intersect the SARS-CoV-2 and aging pathways. By integrating transcriptomic, proteomic and structural data that is available for many diseases, our drug discovery platform is broadly applicable. Rigorous in vitro experiments as well as clinical trials are needed to validate the identified candidate drugs.


Subject(s)
Aging/physiology , COVID-19 Drug Treatment , COVID-19/genetics , Drug Repositioning , A549 Cells , Algorithms , Angiotensin-Converting Enzyme 2/metabolism , Antiviral Agents/therapeutic use , COVID-19/metabolism , Drug Discovery , Gene Expression , Gene Regulatory Networks , Humans , Proteomics , SARS-CoV-2 , Transcriptome
5.
Nat Commun ; 12(1): 31, 2021 01 04.
Article in English | MEDLINE | ID: mdl-33397893

ABSTRACT

The development of single-cell methods for capturing different data modalities including imaging and sequencing has revolutionized our ability to identify heterogeneous cell states. Different data modalities provide different perspectives on a population of cells, and their integration is critical for studying cellular heterogeneity and its function. While various methods have been proposed to integrate different sequencing data modalities, coupling imaging and sequencing has been an open challenge. We here present an approach for integrating vastly different modalities by learning a probabilistic coupling between the different data modalities using autoencoders to map to a shared latent space. We validate this approach by integrating single-cell RNA-seq and chromatin images to identify distinct subpopulations of human naive CD4+ T-cells that are poised for activation. Collectively, our approach provides a framework to integrate and translate between data modalities that cannot yet be measured within the same cell for diverse applications in biomedical discovery.


Subject(s)
Algorithms , CD4-Positive T-Lymphocytes/immunology , Single-Cell Analysis , Cell Nucleus/metabolism , Chromatin/genetics , Gene Expression Profiling , Gene Expression Regulation , Humans , Principal Component Analysis , ROC Curve , Reproducibility of Results , Sequence Analysis, RNA
6.
Proc Natl Acad Sci U S A ; 114(52): 13714-13719, 2017 12 26.
Article in English | MEDLINE | ID: mdl-29229825

ABSTRACT

The 3D structure of the genome plays a key role in regulatory control of the cell. Experimental methods such as high-throughput chromosome conformation capture (Hi-C) have been developed to probe the 3D structure of the genome. However, it remains a challenge to deduce from these data chromosome regions that are colocalized and coregulated. Here, we present an integrative approach that leverages 1D functional genomic features (e.g., epigenetic marks) with 3D interactions from Hi-C data to identify functional interchromosomal interactions. We construct a weighted network with 250-kb genomic regions as nodes and Hi-C interactions as edges, where the edge weights are given by the correlation between 1D genomic features. Individual interacting clusters are determined using weighted correlation clustering on the network. We show that intermingling regions generally fall into either active or inactive clusters based on the enrichment for RNA polymerase II (RNAPII) and H3K9me3, respectively. We show that active clusters are hotspots for transcription factor binding sites. We also validate our predictions experimentally by 3D fluorescence in situ hybridization (FISH) experiments and show that active RNAPII is enriched in predicted active clusters. Our method provides a general quantitative framework that couples 1D genomic features with 3D interactions from Hi-C to probe the guiding principles that link the spatial organization of the genome with regulatory control.


Subject(s)
Chromosomes, Human , Sequence Analysis, DNA/methods , Transcription, Genetic/physiology , Animals , Chromosomes, Human/genetics , Chromosomes, Human/metabolism , Humans
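
The network construction described in the abstract above (genomic regions as nodes, Hi-C interactions as edges, edge weights given by the correlation between 1D genomic features) can be sketched as follows. This is an illustrative toy, not the authors' implementation: the region names, feature vectors and contact list are made up, and Pearson correlation is assumed as the correlation measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def build_weighted_network(features, contacts):
    """Return {(region_u, region_v): weight} for each Hi-C contact.

    features: region name -> list of 1D feature values (e.g. epigenetic marks)
    contacts: iterable of (region_u, region_v) pairs that interact in Hi-C
    """
    return {(u, v): pearson(features[u], features[v]) for u, v in contacts}


# Hypothetical 250-kb regions with four made-up feature values each.
features = {
    "chr1:bin0": [1.0, 2.0, 3.0, 4.0],
    "chr2:bin5": [2.0, 4.0, 6.0, 8.0],
    "chr3:bin9": [4.0, 3.0, 2.0, 1.0],
}
# Hypothetical interchromosomal contacts.
contacts = [("chr1:bin0", "chr2:bin5"), ("chr1:bin0", "chr3:bin9")]

network = build_weighted_network(features, contacts)
for edge, weight in network.items():
    print(edge, round(weight, 3))
```

The resulting signed weights are the input a weighted correlation clustering step would consume; that clustering step is not shown here.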
7.
Int J Rheum Dis ; 20(5): 597-608, 2017 May.
Article in English | MEDLINE | ID: mdl-28464513

ABSTRACT

AIM: To detect faults in phagocytosis in peripheral blood cells of pregnant women with systemic lupus erythematosus (SLE) and in cord blood of their newborns. METHODS: Pregnant women who fulfilled ≥ 4 American College of Rheumatology criteria for SLE and their newborns were recruited. Pregnant women without SLE and their newborns constituted controls. Phagocytosis and respiratory burst were measured using PHAGOTEST and BURSTTEST kits (Biotechnology GmbH, Germany) on a FACSCalibur™ flow cytometer. Expression of CD11b was estimated with antibodies (BD Biosciences, San Jose, CA, USA). The Mann-Whitney rank-sum test was used to compare the SLE group and controls. RESULTS: Phagocytosis and respiratory burst were estimated in blood of 31 SLE women (29.5 ± 3.3 years) and in cord blood of 26 newborns. Controls were 21 healthy women (29.8 ± 2.8 years) and their 21 babies. Median reactive oxygen species (ROS) production was reduced in the SLE group versus controls (arbitrary units): women, 2315 versus 3316 (P = 0.034); babies, 1051 versus 1791 (P = 0.041), respectively. The proportion of ROS-producing granulocytes decreased in the SLE group: women, 72.5% versus 94.0% (P = 0.025); babies, 46.8% versus 90.7% (P = 0.008). The proportion of phagocytes which engulfed Escherichia coli and the number of bacteria per phagocyte also decreased in SLE women. Monocyte activity was suppressed in newborns from the SLE group (RLU): 224 versus 507 (P = 0.022). CD11b expression was reduced in SLE women (RLU): granulocytes, 588 versus 1448.5 (P < 0.001); monocytes, 1017 versus 1619 (P = 0.002). CONCLUSION: Pregnant SLE women have low ingesting capacity of phagocytes. Suppression of phagocytosis in their newborns is mainly due to a reduced number of cells producing ROS.


Subject(s)
Fetal Blood/immunology , Lupus Erythematosus, Systemic/blood , Phagocytes/immunology , Phagocytosis , Pregnancy Complications/blood , Adult , Biomarkers/blood , CD11b Antigen/blood , Case-Control Studies , Escherichia coli/physiology , Female , Humans , Infant, Newborn , Lupus Erythematosus, Systemic/diagnosis , Lupus Erythematosus, Systemic/immunology , Phagocytes/microbiology , Pregnancy , Pregnancy Complications/diagnosis , Pregnancy Complications/immunology , Reactive Oxygen Species/blood , Respiratory Burst , Young Adult
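
The abstract above compares the SLE group and controls with the Mann-Whitney rank-sum test. The following is a minimal sketch of how the U statistic is computed from pooled ranks; it is not the study's analysis code, the sample values are invented for illustration, and no p-value is derived.

```python
def average_ranks(values):
    """Assign 1-based ranks to values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b), the Mann-Whitney U statistics for the two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Invented ROS readings (arbitrary units), for illustration only.
sle_group = [2100, 2315, 2500, 1900]
controls = [3316, 3000, 3500, 2900]
print(mann_whitney_u(sle_group, controls))
```

A U value near zero for one group means almost all of its observations rank below those of the other group; turning U into a p-value requires the exact distribution or a normal approximation, which a statistics library would normally provide.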
8.
AAPS PharmSciTech ; 16(4): 811-23, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25563817

ABSTRACT

The drug coating process for coated drug-eluting stents (DES) has been identified as a key source of inter- and intra-batch variability in drug elution rates. Quality-by-design (QbD) principles were applied to gain an understanding of the ultrasonic spray coating process of DES. Statistically based design of experiments (DOE) were used to understand the relationship between ultrasonic atomization spray coating parameters and dependent variables such as coating mass ratio, roughness, drug solid state composite microstructure, and elution kinetics. Defect-free DES coatings composed of 70% 85:15 poly(DL-lactide-co-glycolide) and 30% everolimus were fabricated with a constant coating mass. The drug elution profile was characterized by a mathematical model describing biphasic release kinetics. Model coefficients were analyzed as a DOE response. Changes in ultrasonic coating processing conditions resulted in substantial changes in roughness and elution kinetics. Based on the outcome from the DOE study, a design space was defined in terms of the critical coating process parameters resulting in optimum coating roughness and drug elution. This QbD methodology can be useful to enhance the quality of coated DES.


Subject(s)
Drug-Eluting Stents , Ultrasonics , Chromatography, High Pressure Liquid , Everolimus/chemistry , Everolimus/pharmacokinetics , Microscopy, Atomic Force , Microscopy, Electron, Scanning , Polyglactin 910 , Surface Properties