Results 1 - 12 of 12
1.
Cell Syst ; 14(3): 210-219.e7, 2023 03 15.
Article in English | MEDLINE | ID: mdl-36693377

ABSTRACT

Protein structure, function, and evolution depend on local and collective epistatic interactions between amino acids. A powerful approach to defining these interactions is to construct models of couplings between amino acids that reproduce the empirical statistics (frequencies and correlations) observed in sequences comprising a protein family. The top couplings are then interpreted. Here, we show that as currently implemented, this inference unequally represents epistatic interactions, a problem that fundamentally arises from limited sampling of sequences in the context of distinct scales at which epistasis occurs in proteins. We show that these issues explain the ability of current approaches to predict tertiary contacts between amino acids and the inability to obviously expose larger networks of functionally relevant, collectively evolving residues called sectors. This work provides a necessary foundation for more deeply understanding and improving evolution-based models of proteins.


Subject(s)
Amino Acids , Proteins , Proteins/metabolism
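The "empirical statistics (frequencies and correlations)" named in the abstract above can be illustrated with a minimal sketch. This is not the authors' inference procedure; it only computes the raw quantities that coupling models are fit to reproduce. The toy alignment and the function names `site_frequencies` and `pair_correlations` are hypothetical, introduced here for illustration.

```python
from collections import Counter
from itertools import combinations


def site_frequencies(alignment):
    """Return, for each column i, a dict mapping residue a to its frequency f_i(a)."""
    n = len(alignment)
    length = len(alignment[0])
    return [
        {aa: count / n for aa, count in Counter(seq[i] for seq in alignment).items()}
        for i in range(length)
    ]


def pair_correlations(alignment):
    """Return C_ij(a, b) = f_ij(a, b) - f_i(a) * f_j(b) for every observed residue pair."""
    n = len(alignment)
    freqs = site_frequencies(alignment)
    corr = {}
    for i, j in combinations(range(len(alignment[0])), 2):
        joint = Counter((seq[i], seq[j]) for seq in alignment)
        for (a, b), count in joint.items():
            corr[(i, j, a, b)] = count / n - freqs[i][a] * freqs[j][b]
    return corr


# Hypothetical toy alignment (assumption): columns 0 and 1 co-vary, column 2 is fully conserved.
toy = ["AKL", "AKL", "GEL", "GEL"]
freqs = site_frequencies(toy)
corr = pair_correlations(toy)
print(freqs[0]["A"])             # frequency of A at column 0
print(corr[(0, 1, "A", "K")])    # positive: A and K co-occur more often than chance
print(corr[(0, 2, "A", "L")])    # zero: a conserved column carries no correlation
```

With only four sequences the estimates are crude, which loosely mirrors the limited-sampling issue the abstract raises.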
2.
Proc Natl Acad Sci U S A ; 117(33): 19879-19887, 2020 08 18.
Article in English | MEDLINE | ID: mdl-32747536

ABSTRACT

The ribosome translates the genetic code into proteins in all domains of life. Its size and complexity demand long-range interactions that regulate ribosome function. These interactions are largely unknown. Here, we apply a global coevolution method, statistical coupling analysis (SCA), to identify coevolving residue networks (sectors) within the 23S ribosomal RNA (rRNA) of the large ribosomal subunit. As in proteins, SCA reveals a hierarchical organization of evolutionary constraints with near-independent groups of nucleotides forming physically contiguous networks within the three-dimensional structure. Using a quantitative, continuous-culture-with-deep-sequencing assay, we confirm that the top two SCA-predicted sectors contribute to ribosome function. These sectors map to distinct ribosome activities, and their origins trace to phylogenetic divergences across all domains of life. These findings provide a foundation to map ribosome allostery, explore ribosome biogenesis, and engineer ribosomes for new functions. Despite differences in chemical structure, protein and RNA enzymes appear to share a common internal logic of interaction and assembly.


Subject(s)
Escherichia coli/genetics , RNA, Bacterial/chemistry , RNA, Ribosomal, 23S/chemistry , Ribosomes/genetics , Escherichia coli/chemistry , Escherichia coli/metabolism , Evolution, Molecular , Nucleic Acid Conformation , Phylogeny , RNA, Bacterial/genetics , RNA, Bacterial/metabolism , RNA, Ribosomal, 23S/genetics , RNA, Ribosomal, 23S/metabolism , Ribosomes/chemistry , Ribosomes/metabolism
3.
Science ; 369(6502): 440-445, 2020 07 24.
Article in English | MEDLINE | ID: mdl-32703877

ABSTRACT

The rational design of enzymes is an important goal for both fundamental and practical reasons. Here, we describe a process to learn the constraints for specifying proteins purely from evolutionary sequence data, design and build libraries of synthetic genes, and test them for activity in vivo using a quantitative complementation assay. For chorismate mutase, a key enzyme in the biosynthesis of aromatic amino acids, we demonstrate the design of natural-like catalytic function with substantial sequence diversity. Further optimization focuses the generative model toward function in a specific genomic context. The data show that sequence-based statistical models suffice to specify proteins and provide access to an enormous space of functional sequences. This result provides a foundation for a general process for evolution-based design of artificial proteins.


Subject(s)
Chorismate Mutase , Evolution, Molecular , Models, Genetic , Models, Statistical , Amino Acid Sequence , Chorismate Mutase/chemistry , Chorismate Mutase/genetics , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/genetics
4.
Synth Biol (Oxf) ; 3(1): ysx008, 2018.
Article in English | MEDLINE | ID: mdl-32995509

ABSTRACT

The design and synthesis of novel genes and deoxyribonucleic acid (DNA) sequences is a central technique in synthetic biology. Current methods of high-throughput gene synthesis use pooled oligonucleotides obtained from custom-designed DNA microarray chips, and rely on orthogonal (non-interacting) polymerase chain reaction primers to specifically de-multiplex, by amplification, the precise subset of oligonucleotides necessary to assemble a full-length gene. A large validated set of mutually orthogonal primers is therefore a crucial reagent for high-throughput gene synthesis. Here, we present a set of 166 20-nucleotide primers that are experimentally verified to be non-interacting, capable of specifying 13,695 unique genes. These primers represent a valuable resource to the synthetic biology community for specifying genetic components that can be assembled through a scalable and modular architecture.
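The gene count quoted in the abstract above is consistent with counting unordered pairs of distinct primers drawn from the set of 166. That each gene is addressed by one such pair is an assumption inferred from the numbers, not something the abstract states. A quick check:

```python
from math import comb

N_PRIMERS = 166  # size of the primer set, taken from the abstract

# Assumption: one gene per unordered pair of distinct primers.
pairs = comb(N_PRIMERS, 2)  # 166 * 165 / 2
print(pairs)  # 13695
```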

5.
Elife ; 5, 2016 10 04.
Article in English | MEDLINE | ID: mdl-27700984

ABSTRACT

The sequence of events that initiates T cell signaling is dictated by the specificities and order of activation of the tyrosine kinases that signal downstream of the T cell receptor. Using a platform that combines exhaustive point-mutagenesis of peptide substrates, bacterial surface-display, cell sorting, and deep sequencing, we have defined the specificities of the first two kinases in this pathway, Lck and ZAP-70, for the T cell receptor ζ chain and the scaffold proteins LAT and SLP-76. We find that ZAP-70 selects its substrates by utilizing an electrostatic mechanism that excludes substrates with positively-charged residues and favors LAT and SLP-76 phosphosites that are surrounded by negatively-charged residues. This mechanism prevents ZAP-70 from phosphorylating its own activation loop, thereby enforcing its strict dependence on Lck for activation. The sequence features in ZAP-70, LAT, and SLP-76 that underlie electrostatic selectivity likely contribute to the specific response of T cells to foreign antigens.


Subject(s)
Lymphocyte Specific Protein Tyrosine Kinase p56(lck)/chemistry , Lymphocyte Specific Protein Tyrosine Kinase p56(lck)/metabolism , Receptors, Antigen, T-Cell/metabolism , Signal Transduction , Static Electricity , ZAP-70 Protein-Tyrosine Kinase/chemistry , ZAP-70 Protein-Tyrosine Kinase/metabolism , HEK293 Cells , Humans , Substrate Specificity
6.
Methods Enzymol ; 523: 213-35, 2013.
Article in English | MEDLINE | ID: mdl-23422432

ABSTRACT

Statistical analysis of protein sequences indicates an architecture for natural proteins in which amino acids are engaged in a sparse, hierarchical pattern of interactions in the tertiary structure. This architecture might be a key and distinguishing feature of evolved proteins: a design principle providing not only for foldability and high-performance function but also for robustness to perturbation and the capacity for rapid adaptation to new selection pressures. Here, we describe an approach for systematically testing this design principle for natural-like proteins by (1) computational design of synthetic sequences that gradually add or remove constraints along the hierarchy of interacting residues and (2) experimental testing of the designed sequences for folding and biochemical function. By this process, we hope to understand how the constraints on fold, function, and other aspects of fitness are organized within natural proteins, a first step in understanding the process of "design" by evolution.


Subject(s)
Proteins/chemistry , Amino Acid Sequence , Evolution, Molecular , Protein Folding
7.
Mol Syst Biol ; 6: 414, 2010 Sep 21.
Article in English | MEDLINE | ID: mdl-20865007

ABSTRACT

Allosteric coupling between protein domains is fundamental to many cellular processes. For example, Hsp70 molecular chaperones use ATP binding by their actin-like N-terminal ATPase domain to control substrate interactions in their C-terminal substrate-binding domain, a reaction that is critical for protein folding in cells. Here, we generalize the statistical coupling analysis to simultaneously evaluate co-evolution between protein residues and functional divergence between sequences in protein sub-families. Applying this method in the Hsp70/110 protein family, we identify a sparse but structurally contiguous group of co-evolving residues called a 'sector', which is an attribute of the allosteric Hsp70 sub-family that links the functional sites of the two domains across a specific interdomain interface. Mutagenesis of Escherichia coli DnaK supports the conclusion that this interdomain sector underlies the allosteric coupling in this protein family. The identification of the Hsp70 sector provides a basis for further experiments to understand the mechanism of allostery and introduces the idea that cooperativity between interacting proteins or protein domains can be mediated by shared sectors.


Subject(s)
HSP70 Heat-Shock Proteins/metabolism , Molecular Chaperones/chemistry , Adenosine Triphosphatases/chemistry , Adenosine Triphosphatases/metabolism , Allosteric Site , Bacterial Physiological Phenomena , Circular Dichroism , Escherichia coli/metabolism , Escherichia coli Proteins/metabolism , Heat-Shock Proteins/metabolism , Models, Statistical , Molecular Conformation , Mutagenesis , Protein Structure, Tertiary , Saccharomyces cerevisiae/metabolism
8.
Science ; 322(5900): 438-42, 2008 Oct 17.
Article in English | MEDLINE | ID: mdl-18927392

ABSTRACT

Statistical analyses of protein families reveal networks of coevolving amino acids that functionally link distantly positioned functional surfaces. Such linkages suggest a concept for engineering allosteric control into proteins: The intramolecular networks of two proteins could be joined across their surface sites such that the activity of one protein might control the activity of the other. We tested this idea by creating PAS-DHFR, a designed chimeric protein that connects a light-sensing signaling domain from a plant member of the Per/Arnt/Sim (PAS) family of proteins with Escherichia coli dihydrofolate reductase (DHFR). With no optimization, PAS-DHFR exhibited light-dependent catalytic activity that depended on the site of connection and on known signaling mechanisms in both proteins. PAS-DHFR serves as a proof of concept for engineering regulatory activities into proteins through interface design at conserved allosteric sites.


Subject(s)
Flavoproteins/chemistry , Protein Engineering , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/metabolism , Tetrahydrofolate Dehydrogenase/chemistry , Allosteric Regulation , Allosteric Site , Binding Sites , Catalysis , Cryptochromes , Escherichia coli/enzymology , Flavoproteins/metabolism , Kinetics , Ligands , Light , Models, Molecular , NADP/metabolism , Protein Conformation , Protein Structure, Secondary , Protein Structure, Tertiary , Tetrahydrofolate Dehydrogenase/metabolism
9.
Nature ; 437(7058): 512-8, 2005 Sep 22.
Article in English | MEDLINE | ID: mdl-16177782

ABSTRACT

Classical studies show that for many proteins, the information required for specifying the tertiary structure is contained in the amino acid sequence. Here, we attempt to define the sequence rules for specifying a protein fold by computationally creating artificial protein sequences using only statistical information encoded in a multiple sequence alignment and no tertiary structure information. Experimental testing of libraries of artificial WW domain sequences shows that a simple statistical energy function capturing coevolution between amino acid residues is necessary and sufficient to specify sequences that fold into native structures. The artificial proteins show thermodynamic stabilities similar to natural WW domains, and structure determination of one artificial protein shows excellent agreement with the WW fold at atomic resolution. The relative simplicity of the information used for creating sequences suggests a marked reduction to the potential complexity of the protein-folding problem.


Subject(s)
Computational Biology , Evolution, Molecular , Protein Folding , Proteins/chemistry , Proteins/metabolism , Algorithms , Magnetic Resonance Spectroscopy , Models, Molecular , Protein Denaturation , Protein Structure, Tertiary , Sequence Alignment , Thermodynamics
10.
Nature ; 437(7058): 579-83, 2005 Sep 22.
Article in English | MEDLINE | ID: mdl-16177795

ABSTRACT

Protein sequences evolve through random mutagenesis with selection for optimal fitness. Cooperative folding into a stable tertiary structure is one aspect of fitness, but evolutionary selection ultimately operates on function, not on structure. In the accompanying paper, we proposed a model for the evolutionary constraint on a small protein interaction module (the WW domain) through application of the SCA, a statistical analysis of multiple sequence alignments. Construction of artificial protein sequences directed only by the SCA showed that the information extracted by this analysis is sufficient to engineer the WW fold at atomic resolution. Here, we demonstrate that these artificial WW sequences function like their natural counterparts, showing class-specific recognition of proline-containing target peptides. Consistent with SCA predictions, a distributed network of residues mediates functional specificity in WW domains. The ability to recapitulate natural-like function in designed sequences shows that a relatively small quantity of sequence information is sufficient to specify the global energetics of amino acid interactions.


Subject(s)
Peptide Fragments/chemistry , Peptide Fragments/metabolism , Protein Structure, Tertiary , Amino Acid Sequence , Binding Sites , Evolution, Molecular , Models, Molecular , Molecular Sequence Data , Peptide Fragments/genetics , Peptide Library , Proline/metabolism , Protein Binding , Protein Folding , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Sequence Alignment , Substrate Specificity , Thermodynamics
11.
J Biol Chem ; 278(44): 43755-63, 2003 Oct 31.
Article in English | MEDLINE | ID: mdl-12930840

ABSTRACT

Erythropoietin receptor (EpoR) activation is crucial for mature red blood cell production. The murine EpoR can also be activated by the envelope protein of the polycythemic (P) spleen focus forming virus (SFFV), gp55-P. Due to differences in the TM sequence, gp55 of the anemic (A) strain SFFV, gp55-A, cannot efficiently activate the EpoR. Using antibody-mediated immunofluorescence co-patching, we show that the majority of EpoR forms hetero-oligomers at the cell surface with gp55-P and, surprisingly, with gp55-A. The EpoR TM domain is targeted by gp55-P and -A, as only chimeric receptors containing EpoR TM sequences oligomerized with gp55 proteins. Both gp55-P and gp55-A are homodimers on the cell surface, as shown by co-patching. However, when the homomeric interactions of the isolated TM domains were assayed by TOXCAT bacterial reporter system, only the TM sequence of gp55-P was dimerized. Thus, homo-oligomerization of gp55 proteins is insufficient for full EpoR activation, and a correct conformation of the dimer in the TM region is required. This is supported by the failure of gp55-A→P, a mutant protein whose TM domain can homo-oligomerize, to fully activate EpoR. As unliganded EpoR forms TM-dependent but inactive homodimers, we propose that the EpoR can be activated to different extents by homodimeric gp55 proteins, depending on the conformation of the gp55 protein dimer in the TM region.


Subject(s)
Cell Membrane/metabolism , Receptors, Erythropoietin/chemistry , Viral Envelope Proteins/chemistry , Amino Acid Sequence , Animals , Antibodies/chemistry , Blotting, Western , Cell Line , Dimerization , Epitopes , Genes, Reporter , Mice , Microscopy, Fluorescence , Molecular Sequence Data , Mutation , Plasmids/metabolism , Precipitin Tests , Protein Binding , Protein Conformation , Protein Structure, Tertiary , Rats , Sequence Homology, Amino Acid , Transfection , Viral Envelope Proteins/metabolism
12.
Curr Opin Struct Biol ; 12(4): 447-52, 2002 Aug.
Article in English | MEDLINE | ID: mdl-12163066

ABSTRACT

Predicting protein sequences that fold into specific native three-dimensional structures is a problem of great potential complexity. Although the complete solution is ultimately rooted in understanding the physical chemistry underlying the complex interactions between amino acid residues that determine protein stability, recent work shows that empirical information about these first principles is embedded in the statistics of protein sequence and structure databases. This review focuses on the use of 'knowledge-based' potentials derived from these databases in designing proteins. In addition, the data suggest how the study of these empirical potentials might impact our fundamental understanding of the energetic principles of protein structure.


Subject(s)
Artificial Intelligence , Databases, Protein , Protein Conformation , Protein Engineering/methods , Proteins/chemistry , Sequence Alignment/methods , Amino Acid Sequence , Amino Acids/chemistry , Conserved Sequence , Information Storage and Retrieval/methods , Molecular Sequence Data , Protein Folding , Sequence Analysis, Protein/methods