1.
Nucleic Acids Res ; 52(9): 5152-5165, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38647067

ABSTRACT

Structured noncoding RNAs (ncRNAs) contribute to many important cellular processes involving chemical catalysis, molecular recognition and gene regulation. Few ncRNA classes are broadly distributed among organisms from all three domains of life, but the list of rarer classes that exhibit surprisingly diverse functions is growing. We previously developed a computational pipeline that enables the near-comprehensive identification of structured ncRNAs expressed from individual bacterial genomes. The regions between protein coding genes are first sorted based on length and the fraction of guanosine and cytidine nucleotides. Long, GC-rich intergenic regions are then examined for sequence and structural similarity to other bacterial genomes. Herein, we describe the implementation of this pipeline on 50 bacterial genomes from varied phyla. More than 4700 candidate intergenic regions with the desired characteristics were identified, which yielded 44 novel riboswitch candidates and numerous other putative ncRNA motifs. Although experimental validation studies have yet to be conducted, this rate of riboswitch candidate discovery is consistent with predictions that many hundreds of novel riboswitch classes remain to be discovered among the bacterial species whose genomes have already been sequenced. Thus, many thousands of additional novel ncRNA classes likely remain to be discovered in the bacterial domain of life.
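
The first step described above (sorting intergenic regions by length and GC fraction, then keeping the long, GC-rich ones) can be sketched roughly as follows. This is only an illustrative sketch: the `IntergenicRegion` type, the function names, and the `min_length` and `min_gc` thresholds are assumptions made for the example and are not taken from the abstract, which does not state how the cutoffs are chosen.

```python
from typing import NamedTuple


class IntergenicRegion(NamedTuple):
    name: str      # hypothetical identifier for the region
    sequence: str  # DNA sequence of the region


def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C nucleotides in a sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def select_candidates(regions, min_length=100, min_gc=0.5):
    """Keep regions that are both long and GC-rich, longest first.

    The default thresholds are illustrative assumptions only.
    """
    kept = [
        r for r in regions
        if len(r.sequence) >= min_length and gc_fraction(r.sequence) >= min_gc
    ]
    # Sort by length, then by GC fraction, both in descending order.
    return sorted(
        kept,
        key=lambda r: (len(r.sequence), gc_fraction(r.sequence)),
        reverse=True,
    )
```

In a real analysis the cutoffs would more plausibly be set relative to each genome's own length and GC distributions, since base composition varies widely among bacterial phyla.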


Subject(s)
Genome, Bacterial , RNA, Bacterial , RNA, Untranslated , DNA, Intergenic/genetics , Genome, Bacterial/genetics , Genomics/methods , Riboswitch/genetics , RNA, Bacterial/genetics , RNA, Bacterial/chemistry , RNA, Untranslated/genetics , RNA, Untranslated/classification , RNA, Untranslated/chemistry
2.
Microb Genom ; 9(5), 2023 05.
Article in English | MEDLINE | ID: mdl-37233150

ABSTRACT

Computational methods can be used to identify putative structured noncoding RNAs (ncRNAs) in bacteria, which can then be validated using various biochemical and genetic approaches. In a search for ncRNAs in Corynebacterium pseudotuberculosis, we observed a conserved region called the ilvB-II motif located upstream of the ilvB gene that is also present in other members of this genus. This gene codes for an enzyme involved in the production of branched-chain amino acids (BCAAs). The ilvB gene in some bacteria is regulated by members of a ppGpp-sensing riboswitch class, but previous and current data suggest that the ilvB-II motif regulates expression by a transcription attenuation mechanism involving protein translation from an upstream open reading frame (uORF or leader peptide). All representatives of this RNA motif carry a start codon positioned in-frame with a nearby stop codon, and the peptides resulting from translation of this uORF are enriched for BCAAs, suggesting that expression of the ilvB gene in the host cells is controlled by attenuation. Furthermore, recently discovered RNA motifs also associated with ilvB genes in other bacterial species appear to carry distinct uORFs, suggesting that transcription attenuation by uORF translation is a common mechanism for regulating ilvB genes.
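
The two observations used in the argument above (a start codon in-frame with a nearby stop codon, and a leader peptide enriched for branched-chain amino acids) can be checked with a short sketch such as the one below. It assumes the standard genetic code, considers only ATG start codons, and uses a hypothetical `max_codons` limit to define "nearby"; none of these choices comes from the abstract.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_uorfs(leader, max_codons=40):
    """Return (start_index, peptide) for each ATG followed by an in-frame stop.

    max_codons is an assumed cutoff for how far away the stop codon may be.
    """
    seq = leader.upper().replace("U", "T")
    hits = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        peptide = []
        for pos in range(start, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X = unknown codon
            if aa == "*":
                hits.append((start, "".join(peptide)))
                break
            peptide.append(aa)
            if len(peptide) > max_codons:
                break  # no stop codon close enough to the start
    return hits


def bcaa_fraction(peptide):
    """Fraction of residues that are leucine, isoleucine, or valine."""
    if not peptide:
        return 0.0
    return sum(peptide.count(aa) for aa in "LIV") / len(peptide)
```

A leader peptide with a high `bcaa_fraction` is what one would expect if ribosome stalling at BCAA codons controls attenuation, but the sketch only reports the fraction and does not test for statistical enrichment.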


Subject(s)
Operon , Peptides , RNA, Messenger/genetics , Peptides/genetics , Corynebacterium/genetics
3.
Proc Natl Acad Sci U S A ; 117(9): 4701-4709, 2020 03 03.
Article in English | MEDLINE | ID: mdl-32079721

ABSTRACT

Proteins' interactions with ancient ligands may reveal how molecular recognition emerged and evolved. We explore how proteins recognize adenine: a planar rigid fragment found in the most common and ancient ligands. We have developed a computational pipeline that extracts protein-adenine complexes from the Protein Data Bank, structurally superimposes their adenine fragments, and detects the hydrogen bonds mediating the interaction. Our analysis extends the known motifs of protein-adenine interactions in the Watson-Crick edge of adenine and shows that all of adenine's edges may contribute to molecular recognition. We further show that, on the proteins' side, binding is often mediated by specific amino acid segments ("themes") that recur across different proteins, such that different proteins use the same themes when binding the same adenine-containing ligands. We identify numerous proteins that feature these themes and are thus likely to bind adenine-containing ligands. Our analysis suggests that adenine binding has emerged multiple times in evolution.


Subject(s)
Adenine/metabolism , Evolution, Molecular , Molecular Docking Simulation/methods , Protein Conformation , Adenine/chemistry , Binding Sites , Hydrogen Bonding , Protein Binding , Sequence Analysis, Protein/methods , Software
4.
Protein Sci ; 29(1): 258-267, 2020 01.
Article in English | MEDLINE | ID: mdl-31702846

ABSTRACT

Patterns observed by examining the evolutionary relationships among proteins of common origin can reveal the structural and functional importance of specific residue positions. In particular, amino acids that are highly conserved (i.e., their positions evolve at a slower rate than other positions) are especially likely to be of biological importance, for example, for ligand binding. ConSurf is a bioinformatics tool for accurately estimating the evolutionary rate of each position in a protein family. Here we introduce a new release of ConSurf-DB, a database of precalculated ConSurf evolutionary conservation profiles for proteins of known structure. ConSurf-DB provides high-accuracy estimates of the evolutionary rates of the amino acids in each protein. A reliable estimate of a query protein's evolutionary rates depends on having a sufficiently large number of effective homologues (i.e., nonredundant yet sufficiently similar). With current sequence data, ConSurf-DB covers 82% of the PDB proteins. It will be updated on a regular basis to ensure that coverage remains high, and coverage might even increase. Much effort was dedicated to improving the user experience. The repository is available at https://consurfdb.tau.ac.il/. BROADER AUDIENCE: By comparing a protein to other proteins of similar origin, it is possible to determine the extent to which each amino acid position in the protein evolved slowly or rapidly. A protein's evolutionary profile can provide valuable insights: For example, amino acid positions that are highly conserved (i.e., evolved slowly) are particularly likely to be of structural and/or functional importance, for example, for ligand binding and catalysis. We introduce here a new and improved version of ConSurf-DB, a continually updated database that provides precalculated evolutionary profiles of proteins with known structure.
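
To make the idea of per-position conservation concrete, the sketch below scores each column of a multiple sequence alignment by its Shannon entropy. This is a deliberately simpler stand-in and is not the method ConSurf uses to estimate evolutionary rates; the function names and the toy alignment format (equal-length strings with `-` for gaps) are assumptions for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def conservation_profile(alignment):
    """Entropy of every column; lower values mean higher conservation."""
    return [column_entropy(col) for col in zip(*alignment)]
```

Unlike a phylogeny-aware rate estimate, entropy treats all sequences as independent, so closely related homologues can make a position look more conserved than it is.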


Subject(s)
Computational Biology/methods , Proteins/chemistry , Proteins/genetics , Amino Acid Sequence , Conserved Sequence , Databases, Protein , Evolution, Molecular , Protein Conformation
5.
Methods Mol Biol ; 1851: 233-249, 2019.
Article in English | MEDLINE | ID: mdl-30298400

ABSTRACT

Present-day protein space is the result of 3.7 billion years of evolution, constrained by the underlying physicochemical qualities of the proteins. It is difficult to differentiate between evolutionary traces and effects of physicochemical constraints. Nonetheless, as a rule of thumb, instances of structural reuse, or focusing on structural similarity, are likely attributable to physicochemical constraints, whereas sequence reuse, or focusing on sequence similarity, may be more indicative of evolutionary relationships. Both types of relationships have been studied and can provide meaningful insights to protein biophysics and evolution, which in turn can lead to better algorithms for protein search, annotation, and maybe even design.
In broad strokes, studies of protein space vary in the entities they represent, the similarity measure comparing these entities, and the representation used. The entities can be, for example, protein chains, domains, supra-domains, or smaller protein sub-parts denoted themes. The measures of similarity between the entities can be based on sequence, structure, function, or any combination of these. The representation can be global, encompassing the whole space, or local, focusing on a particular region surrounding protein(s) of interest. Global representations include lists of grouped proteins, protein networks, and maps. Networks are the abstraction that is derived most directly from the similarity data: each node is the protein entity (e.g., a domain), and edges connect similar domains. Selecting the entities, the similarity measure, and the abstraction are three intertwined decisions: the similarity measures allow us to identify the entities, and the selection of entities influences what is a meaningful similarity measure. Similarly, we seek entities that are related to each other in a way for which a simple representation describes their relationships succinctly and accurately.
This chapter will cover studies that rely on different entities, similarity measures, and a range of representations to better understand protein structure space. Scholars may use publicly available navigators offering a global representation, and in particular the hierarchical classifications SCOP, CATH, and ECOD, or a local representation, which encompass structural alignment algorithms. Alternatively, scholars can configure their own navigator using existing tools. To demonstrate this DIY (do it yourself) approach for navigating in protein space, we investigate substrate-binding proteins. By presenting sequence similarities among this large and diverse protein family as a network, we can infer that one member (pdb ID 4ntl; of yet unknown function) may bind methionine and suggest a putative binding mechanism.
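
The network abstraction described above (nodes are protein entities, edges connect similar ones) can be sketched in a few lines. The input format, a dictionary mapping pairs of identifiers to a similarity score, and the `threshold` parameter are assumptions for illustration; the chapter itself does not prescribe a data structure or cutoff.

```python
def build_network(similarities, threshold):
    """Build an undirected graph with an edge for each pair scoring >= threshold.

    similarities: dict mapping (id_a, id_b) to a similarity score (assumed format).
    Returns a dict mapping each identifier to the set of its neighbours.
    """
    graph = {}
    for (a, b), score in similarities.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group nodes into clusters of mutually reachable entities."""
    seen = set()
    components = []
    for node in graph:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(graph[current] - component)
        seen |= component
        components.append(component)
    return components
```

With such a graph, a protein of unknown function can be examined by listing the annotated proteins in its neighbourhood, which is the kind of inference the chapter draws for its substrate-binding protein example.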


Subject(s)
Proteins/chemistry , Proteins/genetics , Algorithms , Cluster Analysis , Databases, Protein , Sequence Alignment , Sequence Analysis, Protein
6.
Structure ; 25(7): 988-996.e3, 2017 07 05.
Article in English | MEDLINE | ID: mdl-28578875

ABSTRACT

CueR (Cu export regulator) is a metalloregulator protein that "senses" Cu(I) ions with very high affinity, thereby stimulating DNA binding and the transcription activation of two other metalloregulator proteins. The crystal structures of CueR when unbound or bound to DNA and a metal ion are very similar to each other, and the role of CueR and Cu(I) in initiating the transcription has not been fully understood yet. Using double electron-electron resonance (DEER) measurements and structure modeling, we investigate the conformational changes that CueR undergoes upon binding Cu(I) and DNA in solution. We observe three distinct conformations, corresponding to apo-CueR, DNA-bound CueR in the absence of Cu(I) (the "repression" state), and CueR-Cu(I)-DNA (the "activation" state). We propose a detailed structural mechanism underlying CueR's regulation of the transcription process. The mechanism explicitly shows the dependence of CueR activity on copper, thereby revealing the important negative feedback mechanism essential for regulating the intracellular copper concentration.


Subject(s)
Bacterial Proteins/chemistry , DNA-Binding Proteins/chemistry , Transcriptional Activation , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Binding Sites , Copper/metabolism , DNA/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Feedback, Physiological , Protein Binding
7.
Structure ; 23(11): 2162-70, 2015 Nov 03.
Article in English | MEDLINE | ID: mdl-26455800

ABSTRACT

Protein function involves conformational changes, but often, for a given protein, only some of these conformations are known. The missing conformations could be predicted using the wealth of data in the PDB. Most PDB proteins have multiple structures, and proteins sharing one similar conformation often share others as well. The ConTemplate web server (http://bental.tau.ac.il/contemplate) exploits these observations to suggest conformations for a query protein with at least one known conformation (or model thereof). We demonstrate ConTemplate on a ribose-binding protein that undergoes significant conformational changes upon substrate binding. Querying ConTemplate with the ligand-free (or bound) structure of the protein produces the ligand-bound (or free) conformation with a root-mean-square deviation of 1.7 Å (or 2.2 Å); the models are derived from conformations of other sugar-binding proteins, sharing approximately 30% sequence identity with the query. The calculation also suggests intermediate conformations and a pathway between the bound and free conformations.
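
The root-mean-square deviation quoted above measures how far a predicted conformation lies from the reference one. A minimal sketch of the calculation is shown below, assuming the two structures have already been superimposed and that atoms are listed in the same order; the optimal superposition step is not shown, and the function name and coordinate format are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and atoms correspond
    one to one by position in the lists.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Coordinates given in ångströms yield an RMSD in ångströms, the unit used for the 1.7 Å and 2.2 Å values in the abstract.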


Subject(s)
Protein Conformation , Sequence Analysis, Protein/methods , Software , Amino Acid Sequence , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/metabolism , Molecular Sequence Data , Protein Binding