Search | VHL Regional Portal

De novo design of protein interactions with learned surface fingerprints.

Gainza, Pablo; Wehrle, Sarah; Van Hall-Beauvais, Alexandra; Marchand, Anthony; Scheck, Andreas; Harteveld, Zander; Buckley, Stephen; Ni, Dongchun; Tan, Shuguang; Sverrisson, Freyr; Goverde, Casper; Turelli, Priscilla; Raclot, Charlène; Teslenko, Alexandra; Pacesa, Martin; Rosset, Stéphane; Georgeon, Sandrine; Marsden, Jane; Petruzzella, Aaron; Liu, Kefang; Xu, Zepeng; Chai, Yan; Han, Pu; Gao, George F; Oricchio, Elisa; Fierz, Beat; Trono, Didier; Stahlberg, Henning; Bronstein, Michael; Correia, Bruno E.

Nature ; 617(7959): 176-184, 2023 05.

Article in English | MEDLINE | ID: mdl-37100904

ABSTRACT

Physical interactions between proteins are essential for most biological processes governing life [1]. However, the molecular determinants of such interactions have been challenging to understand, even as genomic, proteomic and structural data increase. This knowledge gap has been a major obstacle for the comprehensive understanding of cellular protein-protein interaction networks and for the de novo design of protein binders that are crucial for synthetic biology and translational applications [2-9]. Here we use a geometric deep-learning framework operating on protein surfaces that generates fingerprints to describe geometric and chemical features that are critical to drive protein-protein interactions [10]. We hypothesized that these fingerprints capture the key aspects of molecular recognition that represent a new paradigm in the computational design of novel protein interactions. As a proof of principle, we computationally designed several de novo protein binders to engage four protein targets: SARS-CoV-2 spike, PD-1, PD-L1 and CTLA-4. Several designs were experimentally optimized, whereas others were generated purely in silico, reaching nanomolar affinity with structural and mutational characterization showing highly accurate predictions. Overall, our surface-centric approach captures the physical and chemical determinants of molecular recognition, enabling an approach for the de novo design of protein interactions and, more broadly, of artificial proteins with function.

Subject(s)

Computer Simulation , Deep Learning , Protein Binding , Proteins , Humans , Proteins/chemistry , Proteins/metabolism , Proteomics , Protein Interaction Maps , Binding Sites , Synthetic Biology

A generic framework for hierarchical de novo protein design.

Harteveld, Zander; Bonet, Jaume; Rosset, Stéphane; Yang, Che; Sesterhenn, Fabian; Correia, Bruno E.

Proc Natl Acad Sci U S A ; 119(43): e2206111119, 2022 10 25.

Article in English | MEDLINE | ID: mdl-36252041

ABSTRACT

De novo protein design enables the exploration of novel sequences and structures absent from the natural protein universe. De novo design also stands as a stringent test for our understanding of the underlying physical principles of protein folding and may lead to the development of proteins with unmatched functional characteristics. The first fundamental challenge of de novo design is to devise "designable" structural templates leading to sequences that will adopt the predicted fold. Here, we built on the TopoBuilder (TB) de novo design method, to automatically assemble structural templates with native-like features starting from string descriptors that capture the overall topology of proteins. Our framework eliminates the dependency of hand-crafted and fold-specific rules through an iterative, data-driven approach that extracts geometrical parameters from structural tertiary motifs. We evaluated the TopoBuilder framework by designing sequences for a set of five protein folds and experimental characterization revealed that several sequences were folded and stable in solution. The TopoBuilder de novo design framework will be broadly useful to guide the generation of artificial proteins with customized geometries, enabling the exploration of the protein universe.

Subject(s)

Protein Folding , Proteins , Models, Molecular , Protein Engineering/methods , Proteins/chemistry

A Rosetta-based protein design protocol converging to natural sequences.

Sormani, Giulia; Harteveld, Zander; Rosset, Stéphane; Correia, Bruno; Laio, Alessandro.

J Chem Phys ; 154(7): 074114, 2021 Feb 21.

Article in English | MEDLINE | ID: mdl-33607903

ABSTRACT

Computational protein design has emerged as a powerful tool capable of identifying sequences compatible with pre-defined protein structures. The sequence design protocols, implemented in the Rosetta suite, have become widely used in the protein engineering community. To understand the strengths and limitations of the Rosetta design framework, we tested several design protocols on two distinct folds (SH3-1 and Ubiquitin). The sequence optimization, when started from native structures and natural sequences or polyvaline sequences, converges to sequences that are not recognized as belonging to the fold family of the target protein by standard bioinformatic tools, such as BLAST and Hmmer. The sequences generated from both starting conditions (native and polyvaline) are instead very similar to each other and recognized by Hmmer as belonging to the same "family." This demonstrates the capability of Rosetta to converge to similar sequences, even when sampling from distinct starting conditions, but, on the other hand, shows intrinsic inaccuracy of the scoring function that drifts toward sequences that lack identifiable natural sequence signatures. To address this problem, we developed a protocol embedding Rosetta Design simulations in a genetic algorithm, in which the sequence search is biased to converge to sequences that exist in nature. This protocol allows us to obtain sequences that have recognizable natural sequence signatures and, experimentally, the designed proteins are biochemically well behaved and thermodynamically stable.

Subject(s)

Drug Design , Proteins/chemistry , Amino Acid Sequence , Models, Molecular , Protein Conformation , Protein Folding , Thermodynamics

Optogenetic control of Neisseria meningitidis Cas9 genome editing using an engineered, light-switchable anti-CRISPR protein.

Hoffmann, Mareike D; Mathony, Jan; Upmeier Zu Belzen, Julius; Harteveld, Zander; Aschenbrenner, Sabine; Stengl, Christina; Grimm, Dirk; Correia, Bruno E; Eils, Roland; Niopek, Dominik.

Nucleic Acids Res ; 49(5): e29, 2021 03 18.

Article in English | MEDLINE | ID: mdl-33330940

ABSTRACT

Optogenetic control of CRISPR-Cas9 systems has significantly improved our ability to perform genome perturbations in living cells with high precision in time and space. As new Cas orthologues with advantageous properties are rapidly being discovered and engineered, the need for straightforward strategies to control their activity via exogenous stimuli persists. The Cas9 from Neisseria meningitidis (Nme) is a particularly small and target-specific Cas9 orthologue, and thus of high interest for in vivo genome editing applications. Here, we report the first optogenetic tool to control NmeCas9 activity in mammalian cells via an engineered, light-dependent anti-CRISPR (Acr) protein. Building on our previous Acr engineering work, we created hybrids between the NmeCas9 inhibitor AcrIIC3 and the LOV2 blue light sensory domain from Avena sativa. Two AcrIIC3-LOV2 hybrids from our collection potently blocked NmeCas9 activity in the dark, while permitting robust genome editing at various endogenous loci upon blue light irradiation. Structural analysis revealed that, within these hybrids, the LOV2 domain is located in striking proximity to the Cas9 binding surface. Together, our work demonstrates optogenetic regulation of a type II-C CRISPR effector and might suggest a new route for the design of optogenetic Acrs.

Subject(s)

CRISPR-Associated Protein 9/antagonists & inhibitors , CRISPR-Associated Protein 9/chemistry , CRISPR-Cas Systems , Gene Editing/methods , Neisseria meningitidis/enzymology , Optogenetics/methods , Cell Line , HEK293 Cells , Humans , Light , Models, Molecular , Protein Engineering , Proteins/chemistry , Proteins/radiation effects

Computational design of anti-CRISPR proteins with improved inhibition potency.

Mathony, Jan; Harteveld, Zander; Schmelas, Carolin; Upmeier Zu Belzen, Julius; Aschenbrenner, Sabine; Sun, Wei; Hoffmann, Mareike D; Stengl, Christina; Scheck, Andreas; Georgeon, Sandrine; Rosset, Stéphane; Wang, Yanli; Grimm, Dirk; Eils, Roland; Correia, Bruno E; Niopek, Dominik.

Nat Chem Biol ; 16(7): 725-730, 2020 07.

Article in English | MEDLINE | ID: mdl-32284602

ABSTRACT

Anti-CRISPR (Acr) proteins are powerful tools to control CRISPR-Cas technologies. However, the available Acr repertoire is limited to naturally occurring variants. Here, we applied structure-based design on AcrIIC1, a broad-spectrum CRISPR-Cas9 inhibitor, to improve its efficacy on different targets. We first show that inserting exogenous protein domains into a selected AcrIIC1 surface site dramatically enhances inhibition of Neisseria meningitidis (Nme)Cas9. Then, applying structure-guided design to the Cas9-binding surface, we converted AcrIIC1 into AcrIIC1X, a potent inhibitor of the Staphylococcus aureus (Sau)Cas9, an orthologue widely applied for in vivo genome editing. Finally, to demonstrate the utility of AcrIIC1X for genome engineering applications, we implemented a hepatocyte-specific SauCas9 ON-switch by placing AcrIIC1X expression under regulation of microRNA-122. Our work introduces designer Acrs as important biotechnological tools and provides an innovative strategy to safeguard CRISPR technologies.

Subject(s)

CRISPR-Associated Protein 9/genetics , CRISPR-Cas Systems , Clustered Regularly Interspaced Short Palindromic Repeats , Gene Editing/methods , MicroRNAs/genetics , Protein Engineering/methods , Amino Acid Sequence , CRISPR-Associated Protein 9/metabolism , Cell Line, Tumor , Genome, Human , HEK293 Cells , Hepatocytes/cytology , Hepatocytes/metabolism , Humans , MicroRNAs/metabolism , Models, Molecular , Mutagenesis, Insertional , Neisseria meningitidis/enzymology , Neisseria meningitidis/genetics , Plasmids/chemistry , Plasmids/metabolism , Protein Domains , Protein Structure, Secondary , RNA, Guide, Kinetoplastida/genetics , RNA, Guide, Kinetoplastida/metabolism , Staphylococcus aureus/enzymology , Staphylococcus aureus/genetics

rstoolbox - a Python library for large-scale analysis of computational protein design data and structural bioinformatics.

Bonet, Jaume; Harteveld, Zander; Sesterhenn, Fabian; Scheck, Andreas; Correia, Bruno E.

BMC Bioinformatics ; 20(1): 240, 2019 May 15.

Article in English | MEDLINE | ID: mdl-31092198

ABSTRACT

BACKGROUND: Large-scale datasets of protein structures and sequences are becoming ubiquitous in many domains of biological research. Experimental approaches and computational modelling methods are generating biological data at an unprecedented rate. The detailed analysis of structure-sequence relationships is critical to unveil governing principles of protein folding, stability and function. Computational protein design (CPD) has emerged as an important structure-based approach to engineer proteins for novel functions. Generally, CPD workflows rely on the generation of large numbers of structural models to search for the optimal structure-sequence configurations. As such, an important step of the CPD process is the selection of a small subset of sequences to be experimentally characterized. Given the limitations of current CPD scoring functions, multi-step design protocols and elaborated analysis of the decoy populations have become essential for the selection of sequences for experimental characterization and the success of CPD strategies.
RESULTS: Here, we present the rstoolbox, a Python library for the analysis of large-scale structural data tailored for CPD applications. rstoolbox is oriented towards both CPD software users and developers, being easily integrated in analysis workflows. For users, it offers the ability to profile and select decoy sets, which may guide multi-step design protocols or for follow-up experimental characterization. rstoolbox provides intuitive solutions for the visualization of large sequence/structure datasets (e.g. logo plots and heatmaps) and facilitates the analysis of experimental data obtained through traditional biochemical techniques (e.g. circular dichroism and surface plasmon resonance) and high-throughput sequencing. For CPD software developers, it provides a framework to easily benchmark and compare different CPD approaches. Here, we showcase the rstoolbox in both types of applications.
CONCLUSIONS: rstoolbox is a library for the evaluation of protein structures datasets tailored for CPD data. It provides interactive access through seamless integration with IPython, while still being suitable for high-performance computing. In addition to its functionalities for data analysis and graphical representation, the inclusion of rstoolbox in protein design pipelines will allow to easily standardize the selection of design candidates, as well as, to improve the overall reproducibility and robustness of CPD selection processes.

Subject(s)

Computational Biology/methods , Proteins/chemistry , Software , Amino Acid Sequence , Computing Methodologies , Reproducibility of Results

Engineered anti-CRISPR proteins for optogenetic control of CRISPR-Cas9.

Bubeck, Felix; Hoffmann, Mareike D; Harteveld, Zander; Aschenbrenner, Sabine; Bietz, Andreas; Waldhauer, Max C; Börner, Kathleen; Fakhiri, Julia; Schmelas, Carolin; Dietz, Laura; Grimm, Dirk; Correia, Bruno E; Eils, Roland; Niopek, Dominik.

Nat Methods ; 15(11): 924-927, 2018 11.

Article in English | MEDLINE | ID: mdl-30377362

ABSTRACT

Anti-CRISPR proteins are powerful tools for CRISPR-Cas9 regulation; the ability to precisely modulate their activity could facilitate spatiotemporally confined genome perturbations and uncover fundamental aspects of CRISPR biology. We engineered optogenetic anti-CRISPR variants comprising hybrids of AcrIIA4, a potent Streptococcus pyogenes Cas9 inhibitor, and the LOV2 photosensor from Avena sativa. Coexpression of these proteins with CRISPR-Cas9 effectors enabled light-mediated genome and epigenome editing, and revealed rapid Cas9 genome targeting in human cells.

Subject(s)

Biosensing Techniques , CRISPR-Associated Proteins/antagonists & inhibitors , CRISPR-Cas Systems , Gene Editing , Optogenetics , Phototropins/chemistry , Protein Engineering , Epigenomics , Genome , HEK293 Cells , Humans , Light , Streptococcus pyogenes/enzymology
