1.
bioRxiv ; 2023 Mar 07.
Article in English | MEDLINE | ID: mdl-36945530

ABSTRACT

A major goal of cancer biology is to understand the mechanisms underlying tumorigenesis driven by somatically acquired mutations. Existing computational approaches focus on either scoring the pathogenicity of mutations or characterizing their effects at specific scales. Here, we established a unified computational framework, NetFlow3D, that systematically maps the multiscale mechanistic effects of somatic mutations in cancer. The establishment of NetFlow3D hinges upon the Human Protein Structurome, a complete repository we first compiled that incorporates the 3D structures of every single protein as well as the binding interfaces for all known protein-protein interactions (PPIs) in humans. The vast majority of 3D structural information was resolved by recent deep learning algorithms. By applying NetFlow3D to 415,017 somatic protein-altering mutations in 5,950 TCGA tumors across 19 cancer types, we identified 1,656 intra- and 3,343 inter-protein 3D clusters of mutations throughout the Human Protein Structurome, of which ~50% would not have been found using only experimentally determined protein structures. These 3D clusters have converging effects on 377 cellular subnetworks. Compared to canonical PPI network analyses, NetFlow3D achieved a 5.5-fold higher statistical power for identifying significantly dysregulated subnetworks. The majority of identified subnetworks were previously obscured by the overwhelming background noise of non-clustered passenger mutations, including portions of non-canonical PRC1, mediator complex, MCM2-7 complex, neddylation of cullins, complement system, TRiC, etc. NetFlow3D and our pan-cancer results can be accessed from http://netflow3d.yulab.org. This work shows that mapping how individual mutations act across scales requires the integration of their local spatial organization on protein structures and their global topological organization in the PPI network.

2.
Nat Biotechnol ; 41(1): 128-139, 2023 01.
Article in English | MEDLINE | ID: mdl-36217030

ABSTRACT

Studying viral-host protein-protein interactions can facilitate the discovery of therapies for viral infection. We use high-throughput yeast two-hybrid experiments and mass spectrometry to generate a comprehensive SARS-CoV-2-human protein-protein interactome network consisting of 739 high-confidence binary and co-complex interactions, validating 218 known SARS-CoV-2 host factors and revealing 361 novel ones. Our results show the highest overlap of interaction partners between published datasets and of genes differentially expressed in samples from COVID-19 patients. We identify an interaction between the viral protein ORF3a and the human transcription factor ZNF579, illustrating a direct viral impact on host transcription. We perform network-based screens of >2,900 FDA-approved or investigational drugs and identify 23 with significant network proximity to SARS-CoV-2 host factors. One of these drugs, carvedilol, shows clinical benefits for COVID-19 patients in an electronic health records analysis and antiviral properties in a human lung cell line infected with SARS-CoV-2. Our study demonstrates the value of network systems biology to understand human-virus interactions and provides hits for further research on COVID-19 therapeutics.


Subject(s)
COVID-19 , Protein Interaction Mapping , Humans , Cell Line , Gene Expression Regulation , SARS-CoV-2/genetics , Viral Proteins/metabolism
3.
Res Sq ; 2022 Jun 07.
Article in English | MEDLINE | ID: mdl-35677070

ABSTRACT

Physical interactions between viral and host proteins are responsible for almost all aspects of the viral life cycle and the host's immune response. Studying viral-host protein-protein interactions is thus crucial for identifying strategies for treatment and prevention of viral infection. Here, we use high-throughput yeast two-hybrid and affinity purification followed by mass spectrometry to generate a comprehensive SARS-CoV-2-human protein-protein interactome network consisting of both binary and co-complex interactions. We report a total of 739 high-confidence interactions, showing the highest overlap of interaction partners among published datasets as well as the highest overlap with genes differentially expressed in samples (such as upper airway and bronchial epithelial cells) from patients with SARS-CoV-2 infection. Showcasing the utility of our network, we describe a novel interaction between the viral accessory protein ORF3a and the host zinc finger transcription factor ZNF579 to illustrate a SARS-CoV-2 factor mediating a direct impact on host transcription. Leveraging our interactome, we performed network-based drug screens for over 2,900 FDA-approved/investigational drugs and obtained a curated list of 23 drugs that had significant network proximities to SARS-CoV-2 host factors, one of which, carvedilol, showed promising antiviral properties. We performed electronic health record-based validation using two independent large-scale, longitudinal COVID-19 patient databases and found that carvedilol usage was associated with a significantly lowered probability (17%-20%, P < 0.001) of obtaining a SARS-CoV-2 positive test after adjusting for various confounding factors. Carvedilol additionally showed antiviral activity against SARS-CoV-2 in a human lung epithelial cell line [half-maximal effective concentration (EC50) of 4.1 µM], suggesting a mechanism for its beneficial effect in COVID-19. Our study demonstrates the value of large-scale network systems biology approaches for extracting biological insight from complex biological processes.

4.
Curr Opin Struct Biol ; 73: 102329, 2022 04.
Article in English | MEDLINE | ID: mdl-35139457

ABSTRACT

Bolstered by recent methodological and hardware advances, deep learning has increasingly been applied to biological problems and structural proteomics. Such approaches have achieved remarkable improvements over traditional machine learning methods in tasks ranging from protein contact map prediction to protein folding, prediction of protein-protein interaction interfaces, and characterization of protein-drug binding pockets. In particular, emergence of ab initio protein structure prediction methods including AlphaFold2 has revolutionized protein structural modeling. From a protein function perspective, numerous deep learning methods have facilitated deconvolution of the exact amino acid residues and protein surface regions responsible for binding other proteins or small molecule drugs. In this review, we provide a comprehensive overview of recent deep learning methods applied in structural proteomics.


Subject(s)
Deep Learning , Proteome , Computational Biology/methods , Protein Conformation , Protein Folding
5.
Genome Res ; 32(1): 135-149, 2022 01.
Article in English | MEDLINE | ID: mdl-34963661

ABSTRACT

Rapid accumulation of cancer genomic data has led to the identification of an increasing number of mutational hotspots with uncharacterized significance. Here we present a biologically informed computational framework that characterizes the functional relevance of all 1107 published mutational hotspots identified in approximately 25,000 tumor samples across 41 cancer types in the context of a human 3D interactome network, in which the interface of each interaction is mapped at residue resolution. Hotspots reside in network hub proteins and are enriched on protein interaction interfaces, suggesting that alteration of specific protein-protein interactions is critical for the oncogenicity of many hotspot mutations. Our framework enables, for the first time, systematic identification of specific protein interactions affected by hotspot mutations at the full proteome scale. Furthermore, by constructing a hotspot-affected network that connects all hotspot-affected interactions throughout the whole-human interactome, we uncover genome-wide relationships among hotspots and implicate novel cancer proteins that do not harbor hotspot mutations themselves. Moreover, applying our network-based framework to specific cancer types identifies clinically significant hotspots that can be used for prognosis and therapy targets. Overall, we show that our framework bridges the gap between the statistical significance of mutational hotspots and their biological and clinical significance in human cancers.


Subject(s)
Neoplasms , Proteome , Genomics , Humans , Mutation , Neoplasms/genetics , Proteome/chemistry , Proteome/genetics
6.
Nat Methods ; 18(12): 1477-1488, 2021 12.
Article in English | MEDLINE | ID: mdl-34845387

ABSTRACT

Emergence of new viral agents is driven by evolution of interactions between viral proteins and host targets. For instance, increased infectivity of SARS-CoV-2 compared to SARS-CoV-1 arose in part through rapid evolution along the interface between the spike protein and its human receptor ACE2, leading to increased binding affinity. To facilitate broader exploration of how pathogen-host interactions might impact transmission and virulence in the ongoing COVID-19 pandemic, we performed state-of-the-art interface prediction followed by molecular docking to construct a three-dimensional structural interactome between SARS-CoV-2 and human proteins. We additionally carried out downstream meta-analyses to investigate enrichment of sequence divergence between SARS-CoV-1 and SARS-CoV-2 or human population variants along viral-human protein-interaction interfaces, predict changes in binding affinity by these mutations/variants and further prioritize drug repurposing candidates predicted to competitively bind human targets. We believe this resource (http://3D-SARS2.yulab.org) will aid in development and testing of informed hypotheses for SARS-CoV-2 etiology and treatments.


Subject(s)
Angiotensin-Converting Enzyme 2/metabolism , COVID-19/virology , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , Spike Glycoprotein, Coronavirus/genetics , Virus Attachment , Biological Evolution , COVID-19/immunology , Genetic Variation , Humans , Models, Molecular , Molecular Structure , Protein Conformation , Spike Glycoprotein, Coronavirus/physiology
7.
Nat Methods ; 17(10): 985-988, 2020 10.
Article in English | MEDLINE | ID: mdl-32994567

ABSTRACT

Thorough quality assessment of novel interactions identified by proteome-wide cross-linking mass spectrometry (XL-MS) studies is critical. Almost all current XL-MS studies have validated cross-links against known three-dimensional structures of representative protein complexes. Here, we provide theoretical and experimental evidence demonstrating that this approach can drastically underestimate error rates for proteome-wide XL-MS datasets, and propose a comprehensive set of four data-quality metrics to address this issue.


Subject(s)
Mass Spectrometry/methods , Proteome , Proteomics/methods , Cross-Linking Reagents/chemistry , Databases, Protein , Humans , Protein Conformation , Reproducibility of Results
8.
Nat Genet ; 52(10): 1067-1075, 2020 10.
Article in English | MEDLINE | ID: mdl-32958950

ABSTRACT

Distal enhancers play pivotal roles in development and disease yet remain one of the least understood regulatory elements. We used massively parallel reporter assays to perform functional comparisons of two leading enhancer models and find that gene-distal transcription start sites are robust predictors of active enhancers with higher resolution than histone modifications. We show that active enhancer units are precisely delineated by active transcription start sites, validate that these boundaries are sufficient for capturing enhancer function, and confirm that core promoter sequences are necessary for this activity. We assay adjacent enhancers and find that their joint activity is often driven by the stronger unit within the cluster. Finally, we validate these results through functional dissection of a distal enhancer cluster using CRISPR-Cas9 deletions. In summary, definition of high-resolution enhancer boundaries enables deconvolution of complex regulatory loci into modular units.


Subject(s)
Enhancer Elements, Genetic/genetics , Histone Code/genetics , Transcription Initiation Site , Transcription, Genetic , Cell Line , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Humans , Promoter Regions, Genetic/genetics , Protein Processing, Post-Translational/genetics , Transcription Initiation, Genetic
10.
Nat Metab ; 2(8): 663-672, 2020 08.
Article in English | MEDLINE | ID: mdl-32719537

ABSTRACT

Ageing is the greatest risk factor for most common chronic human diseases, and it therefore is a logical target for developing interventions to prevent, mitigate or reverse multiple age-related morbidities. Over the past two decades, genetic and pharmacologic interventions targeting conserved pathways of growth and metabolism have consistently led to substantial extension of the lifespan and healthspan in model organisms as diverse as nematodes, flies and mice. Recent genetic analysis of long-lived individuals is revealing common and rare variants enriched in these same conserved pathways that significantly correlate with longevity. In this Perspective, we summarize recent insights into the genetics of extreme human longevity and propose the use of this rare phenotype to identify genetic variants as molecular targets for gaining insight into the physiology of healthy ageing and the development of new therapies to extend the human healthspan.


Subject(s)
Drug Discovery , Genetics , Healthy Aging/genetics , Longevity/genetics , Aging/genetics , Animals , Humans
11.
Proc Natl Acad Sci U S A ; 117(21): 11836-11842, 2020 05 26.
Article in English | MEDLINE | ID: mdl-32398372

ABSTRACT

Systematic mappings of protein interactome networks have provided invaluable functional information for numerous model organisms. Here we develop PCR-mediated Linkage of barcoded Adapters To nucleic acid Elements for sequencing (PLATE-seq) that serves as a general tool to rapidly sequence thousands of DNA elements. We validate its utility by generating the ORFeome for Oryza sativa covering 2,300 genes and constructing a high-quality protein-protein interactome map consisting of 322 interactions between 289 proteins, expanding the known interactions in rice by roughly 50%. Our work paves the way for high-throughput profiling of protein-protein interactions in a wide range of organisms.


Subject(s)
Open Reading Frames/genetics , Oryza/genetics , Protein Interaction Mapping/methods , Protein Interaction Maps/genetics , Sequence Analysis, DNA/methods , Computational Biology/methods , DNA, Plant/genetics , Databases, Genetic , Genome, Plant/genetics , High-Throughput Nucleotide Sequencing/methods
12.
Cell Syst ; 10(4): 333-350.e14, 2020 04 22.
Article in English | MEDLINE | ID: mdl-32325033

ABSTRACT

Connectivity webs mediate the unique biology of the mammalian brain. Yet, while cell circuit maps are increasingly available, knowledge of their underlying molecular networks remains limited. Here, we applied multi-dimensional biochemical fractionation with mass spectrometry and machine learning to survey endogenous macromolecules across the adult mouse brain. We defined a global "interactome" comprising over one thousand multi-protein complexes. These include hundreds of brain-selective assemblies that have distinct physical and functional attributes, show regional and cell-type specificity, and have links to core neurological processes and disorders. Using reciprocal pull-downs and a transgenic model, we validated a putative 28-member RNA-binding protein complex associated with amyotrophic lateral sclerosis, suggesting a coordinated function in alternative splicing in disease progression. This brain interaction map (BraInMap) resource facilitates mechanistic exploration of the unique molecular machinery driving core cellular processes of the central nervous system. It is publicly available and can be explored at https://www.bu.edu/dbin/cnsb/mousebrain/.


Subject(s)
Brain Mapping/methods , Brain/metabolism , Connectome/methods , Amyotrophic Lateral Sclerosis/metabolism , Animals , DNA-Binding Proteins/genetics , Machine Learning , Mammals/physiology , Mass Spectrometry/methods , Mice , Mutation/genetics
13.
Protein Sci ; 29(1): 298-305, 2020 01.
Article in English | MEDLINE | ID: mdl-31721338

ABSTRACT

Significant efforts have been devoted in the last decade to improving molecular docking techniques to predict both accurate binding poses and ranking affinities. Two shortcomings in the field are the limited number of standard methods for measuring docking success and the limited availability of widely accepted standard data sets for use as benchmarks in comparing different docking algorithms. To address these issues, we have created a Cross-Docking Benchmark server. The server is a versatile cross-docking data set containing 4,399 protein-ligand complexes across 95 protein targets, intended to serve as a benchmark set and gold standard for state-of-the-art pose and ranking prediction in easy, medium, hard, or very hard docking targets. The benchmark, along with a customizable cross-docking data set generation tool, is available at http://disco.csb.pitt.edu. We further demonstrate the potential uses of the server in questions outside of basic benchmarking, such as the selection of the ideal docking reference structure.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Proteins/metabolism , Algorithms , Benchmarking , Binding Sites , Drug Design , Ligands , Molecular Docking Simulation , Protein Binding , Protein Conformation , Web Browser
14.
Nat Commun ; 10(1): 4141, 2019 09 12.
Article in English | MEDLINE | ID: mdl-31515488

ABSTRACT

Each human genome carries tens of thousands of coding variants. The extent to which this variation is functional and the mechanisms by which it exerts its influence remain largely unexplored. To address this gap, we leverage the ExAC database of 60,706 human exomes to investigate experimentally the impact of 2009 missense single nucleotide variants (SNVs) across 2185 protein-protein interactions, generating interaction profiles for 4797 SNV-interaction pairs, of which 421 SNVs segregate at >1% allele frequency in human populations. We find that interaction-disruptive SNVs are prevalent at both rare and common allele frequencies. Furthermore, these results suggest that 10.5% of missense variants carried per individual are disruptive, a higher proportion than previously reported; this indicates that each individual's genetic makeup may be significantly more complex than expected. Finally, we demonstrate that candidate disease-associated mutations can be identified through shared interaction perturbations between variants of interest and known disease mutations.


Subject(s)
Gene Frequency/genetics , Genetic Variation , Genetics, Population , Alleles , Animals , Base Sequence , Disease/genetics , Genetic Predisposition to Disease , Genome, Human , HEK293 Cells , Humans , Mice , Mutation, Missense/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , Protein Binding/genetics
15.
Curr Opin Syst Biol ; 11: 107-116, 2018 Oct.
Article in English | MEDLINE | ID: mdl-31086831

ABSTRACT

Rapid advances in next-generation sequencing technology have resulted in an explosion of whole-exome/genome sequencing data, providing an unprecedented opportunity to identify disease- and trait-associated variants in humans on a large scale. To date, the long-standing paradigm has leveraged fitness-based approximations to translate this ever-expanding sequencing data into causal insights in disease. However, while this approach robustly identifies variants under evolutionary constraint, it fails to provide molecular insights. Moreover, complex disease phenomena often violate standard assumptions of a direct relationship between organismal phenotype and overall fitness effect. Here we discuss the potential of a molecular phenotype-oriented paradigm to uniquely identify candidate disease-causing mutations from the human genetic background. By providing a direct connection between single nucleotide mutations and observable organismal and cellular phenotypes associated with disease, we suggest that molecular phenotypes can readily be incorporated alongside established fitness-based methodologies to provide complementary insights into the functional impact of human mutations. Lastly, we discuss how integrated approaches between molecular phenotypes and fitness-based perspectives facilitate new insights into the molecular mechanisms underlying disease-associated mutations while also providing a platform for improved interpretation of epistasis in human disease.
