Results 1 - 15 of 15
1.
Nat Methods ; 21(3): 488-500, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38361019

ABSTRACT

Protein-protein interactions (PPIs) drive cellular processes and responses to environmental cues, reflecting the cellular state. Here we develop Tapioca, an ensemble machine learning framework for studying global PPIs in dynamic contexts. Tapioca predicts de novo interactions by integrating mass spectrometry interactome data from thermal/ion denaturation or cofractionation workflows with protein properties and tissue-specific functional networks. Focusing on the thermal proximity coaggregation method, we improved the experimental workflow. Finely tuned thermal denaturation afforded increased throughput, while cell lysis optimization enhanced protein detection from different subcellular compartments. The Tapioca workflow was next leveraged to investigate viral infection dynamics. Temporal PPIs were characterized during the reactivation from latency of the oncogenic Kaposi's sarcoma-associated herpesvirus. Together with functional assays, NUCKS was identified as a proviral hub protein, and a broader role was uncovered by integrating PPI networks from alpha- and betaherpesvirus infections. Altogether, Tapioca provides a web-accessible platform for predicting PPIs in dynamic contexts.


Subjects
Human Herpesvirus 8, Manihot, Kaposi Sarcoma, Kaposi Sarcoma/metabolism, Viral Proteins/metabolism, Manihot/metabolism, Viral Latency, Human Herpesvirus 8/metabolism
2.
bioRxiv ; 2023 Nov 04.
Article in English | MEDLINE | ID: mdl-37961197

ABSTRACT

To facilitate single cell multi-omics analysis and improve reproducibility, we present SPEEDI (Single-cell Pipeline for End to End Data Integration), a fully automated end-to-end framework for batch inference, data integration, and cell type labeling. SPEEDI introduces data-driven batch inference and transforms the often heterogeneous data matrices obtained from different samples into a uniformly annotated and integrated dataset. Without requiring user input, it automatically selects parameters and executes pre-processing, sample integration, and cell type mapping. It can also perform downstream analyses of differential signals between treatment conditions and gene functional modules. SPEEDI's data-driven batch inference method works with widely used integration and cell-typing tools. By developing data-driven batch inference, providing full end-to-end automation, and eliminating parameter selection, SPEEDI improves reproducibility and lowers the barrier to obtaining biological insight from these valuable single-cell datasets. The SPEEDI interactive web application can be accessed at https://speedi.princeton.edu/.

3.
Nat Comput Sci ; 3(7): 644-657, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37974651

ABSTRACT

Resolving chromatin-remodeling-linked gene expression changes at cell-type resolution is important for understanding disease states. Here we describe MAGICAL (Multiome Accessibility Gene Integration Calling and Looping), a hierarchical Bayesian approach that leverages paired single-cell RNA sequencing and single-cell transposase-accessible chromatin sequencing from different conditions to map disease-associated transcription factors, chromatin sites, and genes as regulatory circuits. By simultaneously modeling signal variation across cells and conditions in both omics data types, MAGICAL achieved high accuracy on circuit inference. We applied MAGICAL to study Staphylococcus aureus sepsis from peripheral blood mononuclear single-cell data that we generated from subjects with bloodstream infection and uninfected controls. MAGICAL identified sepsis-associated regulatory circuits predominantly in CD14 monocytes, known to be activated by bacterial sepsis. We addressed the challenging problem of distinguishing host regulatory circuit responses to methicillin-resistant and methicillin-susceptible S. aureus infections. Although differential expression analysis failed to show predictive value, MAGICAL identified epigenetic circuit biomarkers that distinguished methicillin-resistant from methicillin-susceptible S. aureus infections.

5.
Dev Cell ; 56(16): 2381-2398.e6, 2021 08 23.
Article in English | MEDLINE | ID: mdl-34428401

ABSTRACT

Congenital abnormalities of the kidney and urinary tract are among the most common birth defects, affecting 3% of newborns. The human kidney forms around a million nephrons from a pool of nephron progenitors over a 30-week period of development. To establish a framework for human nephrogenesis, we spatially resolved a stereotypical process by which equipotent nephron progenitors generate a nephron anlage, then applied data-driven approaches to construct three-dimensional protein maps on anatomical models of the nephrogenic program. Single-cell RNA sequencing identified progenitor states, which were spatially mapped to the nephron anatomy, enabling the generation of functional gene networks predicting interactions within and between nephron cell types. Network mining identified known developmental disease genes and predicted targets of interest. The spatially resolved nephrogenic program made available through the Human Nephrogenesis Atlas (https://sckidney.flatironinstitute.org/) will facilitate an understanding of kidney development and disease and enhance efforts to generate new kidney structures.


Subjects
Developmental Gene Expression Regulation, Nephrons/metabolism, Transcriptome, Animals, Humans, Mice, Nephrons/cytology, Nephrons/embryology, Proteome/genetics, Proteome/metabolism, RNA-Seq, Single-Cell Analysis
6.
Genome Res ; 31(6): 1097-1105, 2021 06.
Article in English | MEDLINE | ID: mdl-33888512

ABSTRACT

To enable large-scale analyses of transcription regulation in model species, we developed DeepArk, a set of deep learning models of the cis-regulatory activities for four widely studied species: Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, and Mus musculus. DeepArk accurately predicts the presence of thousands of different context-specific regulatory features, including chromatin states, histone marks, and transcription factors. In vivo studies show that DeepArk can predict the regulatory impact of any genomic variant (including rare or not previously observed) and enables the regulatory annotation of understudied model species.


Subjects
Deep Learning, Drosophila melanogaster, Animals, Caenorhabditis elegans/genetics, Drosophila melanogaster/genetics, Gene Expression Regulation, Mice, Zebrafish/genetics
7.
Cell Syst ; 11(3): 215-228.e5, 2020 09 23.
Article in English | MEDLINE | ID: mdl-32916097

ABSTRACT

Precise discrimination of tumor from normal tissues remains a major roadblock for therapeutic efficacy of chimeric antigen receptor (CAR) T cells. Here, we perform a comprehensive in silico screen to identify multi-antigen signatures that improve tumor discrimination by CAR T cells engineered to integrate multiple antigen inputs via Boolean logic, e.g., AND and NOT. We screen >2.5 million dual antigens and ∼60 million triple antigens across 33 tumor types and 34 normal tissues. We find that dual antigens significantly outperform the best single clinically investigated CAR targets and confirm key predictions experimentally. Further, we identify antigen triplets that are predicted to show close to ideal tumor-versus-normal tissue discrimination for several tumor types. This work demonstrates the potential of 2- to 3-antigen Boolean logic gates for improving tumor discrimination by CAR T cell therapies. Our predictions are available on an interactive web server resource (antigen.princeton.edu).
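
The idea of scoring Boolean antigen gates can be illustrated with a minimal sketch. It is not the authors' screening pipeline: the antigen names (A, B, C), the binary expression calls, the toy samples, and the scoring rule (fraction of tumor samples recognized minus fraction of normal samples recognized) are all assumptions made for illustration.

```python
from itertools import combinations

def gate_and(a, b):
    """AND gate: the cell fires only when both antigens are present."""
    return a and b

def gate_and_not(a, b):
    """AND-NOT gate: fires when the first antigen is present and the second is absent."""
    return a and not b

def discrimination(gate, tumor, normal, x, y):
    """Fraction of tumor samples hit minus fraction of normal samples hit."""
    tumor_hits = sum(gate(s[x], s[y]) for s in tumor) / len(tumor)
    normal_hits = sum(gate(s[x], s[y]) for s in normal) / len(normal)
    return tumor_hits - normal_hits

def best_pair(tumor, normal):
    """Exhaustively score every two-antigen AND / AND-NOT gate and return the best."""
    antigens = sorted(tumor[0])
    candidates = []
    for x, y in combinations(antigens, 2):
        candidates.append((discrimination(gate_and, tumor, normal, x, y), f"{x} AND {y}"))
        candidates.append((discrimination(gate_and_not, tumor, normal, x, y), f"{x} AND NOT {y}"))
        candidates.append((discrimination(gate_and_not, tumor, normal, y, x), f"{y} AND NOT {x}"))
    return max(candidates)

# Hypothetical binary expression calls (True = antigen detected on the sample).
tumor = [
    {"A": True, "B": True, "C": False},
    {"A": True, "B": True, "C": False},
    {"A": True, "B": False, "C": False},
]
normal = [
    {"A": True, "B": False, "C": True},
    {"A": False, "B": True, "C": True},
    {"A": True, "B": False, "C": True},
]

print(best_pair(tumor, normal))  # (1.0, 'A AND NOT C')
```

In this toy data antigen A alone also marks two of the three normal samples, whereas the gate "A AND NOT C" recognizes every tumor sample and no normal sample, which is the kind of gain from combining inputs that the abstract reports.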


Subjects
Neoplasm Antigens/metabolism, Adoptive Immunotherapy/methods, Humans
8.
Neuron ; 107(5): 821-835.e12, 2020 09 09.
Article in English | MEDLINE | ID: mdl-32603655

ABSTRACT

A major obstacle to treating Alzheimer's disease (AD) is our lack of understanding of the molecular mechanisms underlying selective neuronal vulnerability, a key characteristic of the disease. Here, we present a framework integrating high-quality neuron-type-specific molecular profiles across the lifetime of the healthy mouse, which we generated using bacTRAP, with postmortem human functional genomics and quantitative genetics data. We demonstrate human-mouse conservation of cellular taxonomy at the molecular level for neurons vulnerable and resistant in AD, identify specific genes and pathways associated with AD neuropathology, and pinpoint a specific functional gene module underlying selective vulnerability, enriched in processes associated with axonal remodeling, and affected by amyloid accumulation and aging. We have made all cell-type-specific profiles and functional networks available at http://alz.princeton.edu. Overall, our study provides a molecular framework for understanding the complex interplay between Aβ, aging, and neurodegeneration within the most vulnerable neurons in AD.


Subjects
Alzheimer Disease/pathology, Gene Expression Profiling/methods, Machine Learning, Neurons/pathology, Transcriptome, Aging/genetics, Aging/pathology, Alzheimer Disease/genetics, Animals, Gene Regulatory Networks/physiology, Humans, Mice
9.
PLoS Genet ; 15(9): e1008382, 2019 09.
Article in English | MEDLINE | ID: mdl-31553718

ABSTRACT

Comprehensive information on the timing and location of gene expression is fundamental to our understanding of embryonic development and tissue formation. While high-throughput in situ hybridization projects provide invaluable information about developmental gene expression patterns for model organisms like Drosophila, the output of these experiments is primarily qualitative, and a high proportion of protein coding genes and most non-coding genes lack any annotation. Accurate data-centric predictions of spatio-temporal gene expression will therefore complement current in situ hybridization efforts. Here, we applied a machine learning approach by training models on all public gene expression and chromatin data, even from whole-organism experiments, to provide genome-wide, quantitative spatio-temporal predictions for all genes. We developed structured in silico nano-dissection, a computational approach that predicts gene expression in >200 tissue-developmental stages. The algorithm integrates expression signals from a compendium of 6,378 genome-wide expression and chromatin profiling experiments in a cell lineage-aware fashion. We systematically evaluated our performance via cross-validation and experimentally confirmed 22 new predictions for four different embryonic tissues. The model also predicts complex, multi-tissue expression and developmental regulation with high accuracy. We further show the potential of applying these genome-wide predictions to extract tissue specificity signals from non-tissue-dissected experiments, and to prioritize tissues and stages for disease modeling. This resource, together with the exploratory tools, is freely available at our webserver http://find.princeton.edu, which provides a valuable tool for a range of applications, from predicting spatio-temporal expression patterns to recognizing tissue signatures from differential gene expression profiles.


Subjects
Gene Expression Profiling/methods, Developmental Gene Expression Regulation/genetics, Genome-Wide Association Study/methods, Algorithms, Animals, Computational Biology/methods, Computer Simulation, Drosophila/genetics, Embryonic Development/genetics, Forecasting/methods, Developmental Gene Expression Regulation/physiology, Developmental Genes/genetics, Machine Learning, Spatio-Temporal Analysis, Transcriptome/genetics
10.
Nat Methods ; 15(12): 1049-1052, 2018 12.
Article in English | MEDLINE | ID: mdl-30478325

ABSTRACT

A key unmet challenge in interpreting omics experiments is inferring biological meaning in the context of public functional genomics data. We developed a computational framework, Your Evidence Tailored Integration (YETI; http://yeti.princeton.edu/), which creates specialized functional interaction maps from large public datasets relevant to an individual omics experiment. Using this tailored integration, we predicted and experimentally confirmed an unexpected divergence in viral replication after seasonal or pandemic human influenza virus infection.


Subjects
Statistical Data Interpretation, Gene Regulatory Networks, Genomics/methods, Human Influenza/genetics, Orthomyxoviridae/physiology, Viral Proteins/genetics, Virus Replication, Algorithms, Cultured Cells, Datasets as Topic, Dendritic Cells/cytology, Dendritic Cells/metabolism, Humans, Human Influenza/metabolism, Human Influenza/virology
11.
PLoS Genet ; 14(8): e1007559, 2018 08.
Article in English | MEDLINE | ID: mdl-30096138

ABSTRACT

The biology and behavior of adults differ substantially from those of developing animals, and cell-specific information is critical for deciphering the biology of multicellular animals. Thus, adult tissue-specific transcriptomic data are critical for understanding molecular mechanisms that control their phenotypes. We used adult cell-specific isolation to identify the transcriptomes of C. elegans' four major tissues (or "tissue-ome"), identifying ubiquitously expressed and tissue-specific "enriched" genes. These data newly reveal the hypodermis' metabolic character, suggest potential worm-human tissue orthologies, and identify tissue-specific changes in the Insulin/IGF-1 signaling pathway. Tissue-specific alternative splicing analysis identified a large set of collagen isoforms. Finally, we developed a machine learning-based prediction tool for 76 sub-tissue cell types, which we used to predict cellular expression differences in IIS/FOXO signaling, stage-specific TGF-β activity, and basal vs. memory-induced CREB transcription. Together, these data provide a rich resource for understanding the biology governing multicellular adult animals.


Subjects
Caenorhabditis elegans Proteins/genetics, Caenorhabditis elegans/genetics, Gene Expression Profiling, Alternative Splicing, Animals, Caenorhabditis elegans/metabolism, Caenorhabditis elegans Proteins/metabolism, Forkhead Transcription Factors/genetics, Forkhead Transcription Factors/metabolism, Gene Expression Regulation, Gene Library, Insulin/metabolism, Insulin-Like Growth Factor I/metabolism, Molecular Models, Phenotype, Protein Isoforms, RNA Sequence Analysis, Signal Transduction, Transforming Growth Factor beta/genetics, Transforming Growth Factor beta/metabolism
12.
Nat Neurosci ; 19(11): 1454-1462, 2016 11.
Article in English | MEDLINE | ID: mdl-27479844

ABSTRACT

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder with a strong genetic basis. Yet, only a small fraction of potentially causal genes (about 65 genes out of an estimated several hundred) are known with strong genetic evidence from sequencing studies. We developed a complementary machine-learning approach based on a human brain-specific gene network to present a genome-wide prediction of autism risk genes, including hundreds of candidates for which there is minimal or no prior genetic evidence. Our approach was validated in a large independent case-control sequencing study. Leveraging these genome-wide predictions and the brain-specific network, we demonstrated that the large set of ASD genes converges on a smaller number of key pathways and developmental stages of the brain. Finally, we identified likely pathogenic genes within frequent autism-associated copy-number variants and proposed genes and pathways that are likely mediators of ASD across multiple copy-number variants. All predictions and functional insights are available at http://asd.princeton.edu.


Subjects
Autism Spectrum Disorder/genetics, DNA Copy Number Variations/genetics, Single Nucleotide Polymorphism/genetics, Gene Regulatory Networks, Genetic Predisposition to Disease, Genome-Wide Association Study, Humans
13.
Immunity ; 43(3): 605-14, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-26362267

ABSTRACT

Many functionally important interactions between genes and proteins involved in immunological diseases and processes are unknown. The exponential growth in public high-throughput data offers an opportunity to expand this knowledge. To unlock human-immunology-relevant insight contained in the global biomedical research effort, including all public high-throughput datasets, we performed immunological-pathway-focused Bayesian integration of a comprehensive, heterogeneous compendium comprising 38,088 genome-scale experiments. The distillation of this knowledge into immunological networks of functional relationships between molecular entities (ImmuNet), and tools to mine this resource, are accessible to the public at http://immunet.princeton.edu. The predictive capacity of ImmuNet, established by rigorous statistical validation, is easily accessed by experimentalists to generate data-driven hypotheses. We demonstrate the power of this approach through the identification of unique host-virus interaction responses, and we show how ImmuNet complements genetic studies by predicting disease-associated genes. ImmuNet should be widely beneficial for investigating the mechanisms of the human immune system and immunological diseases.


Subjects
Computational Biology/methods, Immune System Diseases/immunology, Immune System/immunology, Protein Interaction Mapping/methods, Signal Transduction/immunology, Algorithms, Bayes Theorem, Gene Regulatory Networks/genetics, Gene Regulatory Networks/immunology, Host-Pathogen Interactions/immunology, Humans, Immune System/metabolism, Immune System Diseases/genetics, Internet, Protein Interaction Maps/genetics, Protein Interaction Maps/immunology, Reproducibility of Results, Signal Transduction/genetics, Support Vector Machine, Transcriptome/genetics, Transcriptome/immunology, Virus Diseases/genetics, Virus Diseases/immunology, Virus Diseases/virology
14.
Nucleic Acids Res ; 43(W1): W128-33, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25969450

ABSTRACT

IMP (Integrative Multi-species Prediction), originally released in 2012, is an interactive web server that enables molecular biologists to interpret experimental results and to generate hypotheses in the context of a large cross-organism compendium of functional predictions and networks. The system provides biologists with a framework to analyze their candidate gene sets in the context of functional networks, expanding or refining their sets using functional relationships predicted from integrated high-throughput data. IMP 2.0 integrates updated prior knowledge and data collections from the last three years in the seven supported organisms (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Danio rerio, Caenorhabditis elegans, and Saccharomyces cerevisiae) and extends function prediction coverage to include human disease. IMP identifies homologs with conserved functional roles for disease knowledge transfer, allowing biologists to analyze disease contexts and predictions across all organisms. Additionally, IMP 2.0 implements a new flexible platform for experts to generate custom hypotheses about biological processes or diseases, making sophisticated data-driven methods easily accessible to researchers. IMP does not require any registration or installation and is freely available for use at http://imp.princeton.edu.


Subjects
Gene Regulatory Networks, Software, Animals, Computer Graphics, Disease/genetics, Genes, Genomics, Humans, Internet, Mice, Protein Interaction Mapping, Proteins/physiology, Rats, Systems Integration
15.
Nat Methods ; 12(3): 211-4, 3 p following 214, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25581801

ABSTRACT

We present SEEK (search-based exploration of expression compendia; http://seek.princeton.edu/), a query-based search engine for very large transcriptomic data collections, including thousands of human data sets from many different microarray and high-throughput sequencing platforms. SEEK uses a query-level cross-validation-based algorithm to automatically prioritize data sets relevant to the query and a robust search approach to identify genes, pathways and processes co-regulated with the query. SEEK provides multigene query searching with iterative metadata-based search refinement and extensive visualization-based analysis options.
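
One way to picture query-level cross-validation for data set prioritization is the simplified sketch below. It is an illustration under stated assumptions, not SEEK's published algorithm: the leave-one-out scheme, the reciprocal-rank score, the Pearson similarity, and the toy gene names and expression values are all hypothetical choices. Each query gene is held out in turn, the remaining query genes rank all other genes by mean correlation, and a data set earns a higher weight when it ranks the held-out query gene near the top.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def dataset_weight(expr, query):
    """Mean reciprocal rank of each held-out query gene (leave-one-out)."""
    reciprocal_ranks = []
    for held_out in query:
        training = [g for g in query if g != held_out]
        # Score every non-training gene by its mean correlation to the training genes.
        scores = {
            g: sum(pearson(expr[g], expr[t]) for t in training) / len(training)
            for g in expr
            if g not in training
        }
        # Rank 1 means no other gene scores strictly higher than the held-out gene.
        rank = 1 + sum(s > scores[held_out] for s in scores.values())
        reciprocal_ranks.append(1.0 / rank)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

# Hypothetical data sets: gene -> expression across four samples.
datasets = {
    "coherent": {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 1, 3, 2]},
    "noisy": {"g1": [1, 2, 3, 4], "g2": [4, 1, 3, 2], "g3": [1, 2, 3, 5]},
}
query = ["g1", "g2"]

weights = {name: dataset_weight(expr, query) for name, expr in datasets.items()}
print(weights)  # {'coherent': 1.0, 'noisy': 0.5}
```

Under this scheme the data set in which the query genes track each other receives the larger weight, so its correlations would count for more when scoring genes co-regulated with the query.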


Subjects
High-Throughput Nucleotide Sequencing, Search Engine, Transcriptome, Algorithms, Genetic Databases, Gene Ontology, Hedgehog Proteins/genetics, Hedgehog Proteins/metabolism, Humans, RNA