Results 1 - 20 of 59
1.
Chembiochem ; : e202400202, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38818670

ABSTRACT

RNA labeling is an invaluable tool for investigating the function and localization of nucleic acids. Labels are commonly incorporated into the 3' end of RNA, and the primary enzyme used for this purpose is RNA poly(A) polymerase (PAP), which belongs to the class of terminal nucleotidyltransferases (NTases). However, PAP preferentially adds ATP analogs, thus limiting the number of available substrates. Here, we report the use of another NTase, CutA from the fungus Thielavia terrestris. Using this enzyme, we were able to incorporate into the 3' end of RNA not only purine analogs, but also pyrimidine analogs. We used strain-promoted azide-alkyne cycloaddition (SPAAC) to obtain fluorescently labeled or biotinylated transcripts from RNAs extended with azide analogs by CutA. Importantly, modified transcripts retained their biological properties. Furthermore, fluorescently labeled mRNAs were suitable for visualization in cultured mammalian cells. Finally, we demonstrate that either affinity studies or molecular dynamics (MD) simulations allow for rapid screening of NTase substrates, which opens up new avenues in the search for the optimal substrates for this class of enzymes.

2.
Protein Sci ; 33(1): e4846, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38010737

ABSTRACT

In this study, we present a conformational landscape of 5000 AlphaFold2 models of the Histidine kinases, Adenyl cyclases, Methyl-accepting proteins and Phosphatases (HAMP) domain, a short helical bundle that transduces signals from sensors to effectors in two-component signaling proteins such as sensory histidine kinases and chemoreceptors. The landscape reveals the conformational variability of the HAMP domain, including rotations, shifts, displacements, and tilts of helices, many combinations of which have not been observed in experimental structures. HAMP domains belonging to a single family tend to occupy a defined region of the landscape, even when their sequence similarity is low, suggesting that individual HAMP families have evolved to operate in a specific conformational range. The functional importance of this structural conservation is illustrated by poly-HAMP arrays, in which HAMP domains from families with opposite conformational preferences alternate, consistent with the rotational model of signal transduction. The only poly-HAMP arrays that violate this rule are predicted to be of recent evolutionary origin and structurally unstable. Finally, we identify a family of HAMP domains that are likely to be dynamic due to the presence of a conserved pi-helical bulge. All code associated with this work, including a tool for rapid sequence-based prediction of the rotational state in HAMP domains, is deposited at https://github.com/labstructbioinf/HAMPpred.


Subjects
Bacterial Proteins, Histidine, Bacterial Proteins/chemistry, Molecular Conformation, Signal Transduction, Histidine Kinase/genetics, Histidine Kinase/metabolism
3.
Nat Commun ; 14(1): 7460, 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-38016962

ABSTRACT

Biological modularity enhances evolutionary adaptability. This principle is vividly exemplified by bacterial viruses (phages), which display extensive genomic modularity. Phage genomes are composed of independent functional modules that evolve separately and recombine in various configurations. While genomic modularity in phages has been extensively studied, less attention has been paid to protein modularity: proteins consisting of distinct building blocks that can evolve and recombine, enhancing functional and genetic diversity. Here, we use a set of 133,574 representative phage proteins and highly sensitive homology detection to capture instances of domain mosaicism, defined as fragment sharing between two otherwise unrelated proteins, and to understand its relationship with functional diversity in phage genomes. We discover that unrelated proteins from diverse functional classes frequently share homologous domains. This phenomenon is particularly pronounced within receptor-binding proteins, endolysins, and DNA polymerases. We also identify multiple instances of recent diversification via domain shuffling in receptor-binding proteins, neck passage structures, endolysins and some members of the core replication machinery, often transcending distant taxonomic and ecological boundaries. Our findings suggest that ongoing diversification via domain shuffling is reflective of a co-evolutionary arms race, driven by the need to overcome various bacterial resistance mechanisms against phages.


Subjects
Bacteriophages, Molecular Evolution, Biological Evolution, Bacteriophages/genetics, Genomics, Bacteria/genetics, Viral Genome/genetics, Phylogeny
4.
Bioinformatics ; 39(10), 2023 10 03.
Article in English | MEDLINE | ID: mdl-37725369

ABSTRACT

MOTIVATION: The detection of homology through sequence comparison is a typical first step in the study of protein function and evolution. In this work, we explore the applicability of protein language models to this task. RESULTS: We introduce pLM-BLAST, a tool inspired by BLAST, that detects distant homology by comparing single-sequence representations (embeddings) derived from a protein language model, ProtT5. Our benchmarks reveal that pLM-BLAST maintains a level of accuracy on par with HHsearch for both highly similar sequences (with >50% identity) and markedly divergent sequences (with <30% identity), while being significantly faster. Additionally, pLM-BLAST stands out among other embedding-based tools due to its ability to compute local alignments. We show that these local alignments, produced by pLM-BLAST, often connect highly divergent proteins, thereby highlighting its potential to uncover previously undiscovered homologous relationships and improve protein annotation. AVAILABILITY AND IMPLEMENTATION: pLM-BLAST is accessible via the MPI Bioinformatics Toolkit as a web server for searching precomputed databases (https://toolkit.tuebingen.mpg.de/tools/plmblast). It is also available as a standalone tool for building custom databases and performing batch searches (https://github.com/labstructbioinf/pLM-BLAST).


Subjects
Proteins, Software, Amino Acid Sequence, Sequence Alignment, Proteins/genetics, Molecular Sequence Annotation
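
The idea summarized in entry 4, comparing per-residue embeddings and extracting a local alignment from the resulting similarity matrix, can be illustrated with a small sketch. This is not the pLM-BLAST implementation: the function names `cosine_matrix` and `local_alignment`, the `gap` and `shift` values, and the toy three-dimensional "embeddings" are all assumptions made for illustration (real ProtT5 embeddings have far more dimensions). The alignment step is a plain Smith-Waterman-style dynamic program, swapped in here as a generic stand-in.

```python
import math

def cosine_matrix(emb_a, emb_b):
    """Cosine similarity between every pair of per-residue vectors."""
    def cos(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        return dot / norm if norm else 0.0
    return [[cos(u, v) for v in emb_b] for u in emb_a]

def local_alignment(sim, gap=0.5, shift=0.5):
    """Smith-Waterman-style local alignment over a similarity matrix.

    `shift` (hypothetical value) is subtracted from each similarity so that
    dissimilar residue pairs score negatively; `gap` is a linear gap penalty.
    Returns the best score and the list of aligned (i, j) residue pairs.
    """
    n, m = len(sim), len(sim[0])
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(0.0,
                          H[i - 1][j - 1] + sim[i - 1][j - 1] - shift,
                          H[i - 1][j] - gap,
                          H[i][j - 1] - gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    path = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + sim[i - 1][j - 1] - shift
        if math.isclose(H[i][j], diag):
            path.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif math.isclose(H[i][j], H[i - 1][j] - gap):
            i -= 1
        else:
            j -= 1
    return best, path[::-1]

# Toy example: residues 1-2 of A resemble residues 0-1 of B.
emb_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
emb_b = [[0, 1, 0], [0, 0, 1]]
score, pairs = local_alignment(cosine_matrix(emb_a, emb_b))
print(score, pairs)
```

Because the alignment is local, the unmatched first residue of sequence A is simply left out of the reported path rather than penalizing the score.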
5.
J Alzheimers Dis ; 89(4): 1211-1219, 2022.
Article in English | MEDLINE | ID: mdl-36031890

ABSTRACT

BACKGROUND: Homozygous variants of the TREM2 and TYROBP genes have been shown to be causative for multiple bone cysts and neurodegeneration leading to progressive dementia (NHD, Nasu-Hakola disease). OBJECTIVE: To determine if biallelic variants of these genes and/or oligogenic inheritance could be responsible for a wider spectrum of neurodegenerative conditions. METHODS: We analyzed 52 genes associated with neurodegenerative disorders using targeted next generation sequencing in a selected group of 29 patients (n = 14 Alzheimer's disease, n = 8 frontotemporal dementia, n = 7 amyotrophic lateral sclerosis) carrying diverse already determined rare variants in exon 2 of TREM2. Molecular modeling was used to get an insight into the potential effects of the mutation. RESULTS: We identified a novel mutation c.401_406delinsTCTAT; p.(Asp134Valfs*55) in exon 3 of TREM2 in an Alzheimer's disease patient also carrying the p.Arg62His TREM2 variant. Molecular modeling revealed that the identified mutation prevents anchoring of the TREM2 protein in the membrane, leaving the core of the Ig-like domain intact. CONCLUSION: Our results expand the spectrum of neurodegenerative diseases, where the carriers of biallelic mutations in TREM2 have been described for Alzheimer's disease, and highlight the impact of variant burden in other genes on phenotypic heterogeneity.


Subjects
Alzheimer Disease, Membrane Glycoproteins, Neurodegenerative Diseases, Osteochondrodysplasias, Immunologic Receptors, Subacute Sclerosing Panencephalitis, Alzheimer Disease/genetics, Humans, Lipodystrophy, Membrane Glycoproteins/genetics, Neurodegenerative Diseases/genetics, Osteochondrodysplasias/genetics, Immunologic Receptors/genetics, Subacute Sclerosing Panencephalitis/genetics
6.
Cell Biosci ; 12(1): 34, 2022 Mar 19.
Article in English | MEDLINE | ID: mdl-35305696

ABSTRACT

BACKGROUND: Huntington's disease (HD) is a neurodegenerative disorder whereby mutated huntingtin protein (mHTT) aggregates when polyglutamine repeats in the N-terminus of mHTT exceed 36 glutamines (Q). However, the mechanism of this pathology is unknown. Siah1-interacting protein (SIP) acts as an adaptor protein in the ubiquitination complex and mediates degradation of other proteins. We hypothesized that mHTT aggregation depends on the dysregulation of SIP activity in this pathway in HD. RESULTS: A higher SIP dimer/monomer ratio was observed in the striatum in young YAC128 mice, which overexpress mHTT. We found that SIP interacted with HTT. In a cellular HD model, we found that wildtype SIP increased mHTT ubiquitination, attenuated mHTT protein levels, and decreased HTT aggregation. We predicted mutations that should stabilize SIP dimerization and found that SIP mutant-overexpressing cells formed more stable dimers and had lower activity in facilitating mHTT ubiquitination and preventing exon 1 mHTT aggregation compared with wildtype SIP. CONCLUSIONS: Our data suggest that an increase in SIP dimerization in HD medium spiny neurons leads to a decrease in SIP function in the degradation of mHTT through a ubiquitin-proteasome pathway and consequently an increase in mHTT aggregation. Therefore, SIP could be considered a potential target for anti-HD therapy during the early stage of HD pathology.

7.
Bioinformatics ; 38(9): 2633-2635, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35199148

ABSTRACT

MOTIVATION: The wealth of protein structures collected in the Protein Data Bank enabled large-scale studies of their function and evolution. Such studies, however, require the generation of customized datasets combining the structural data with miscellaneous accessory resources providing functional, taxonomic and other annotations. Unfortunately, the functionality of currently available tools for the creation of such datasets is limited and their usage frequently requires laborious surveying of various data sources and resolving inconsistencies between their versions. RESULTS: To address this problem, we developed localpdb, a versatile Python library for the management of protein structures and their annotations. The library features a flexible plugin system enabling seamless unification of the structural data with diverse auxiliary resources, full version control and powerful functionality of creating highly customized datasets. The localpdb can be used in a wide range of bioinformatic tasks, in particular those involving large-scale protein structural analyses and machine learning. AVAILABILITY AND IMPLEMENTATION: localpdb is freely available at https://github.com/labstructbioinf/localpdb. Documentation along with the usage examples can be accessed at https://labstructbioinf.github.io/localpdb/.


Subjects
Computational Biology, Software, Proteins, Protein Databases, Documentation
8.
Brief Bioinform ; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34571541

ABSTRACT

The Rossmann fold enzymes are involved in essential biochemical pathways such as nucleotide and amino acid metabolism. Their functioning relies on interaction with cofactors, small nucleoside-based compounds specifically recognized by a conserved βαβ motif shared by all Rossmann fold proteins. While Rossmann methyltransferases recognize only a single cofactor type, the S-adenosylmethionine, the oxidoreductases, depending on the family, bind nicotinamide (nicotinamide adenine dinucleotide, nicotinamide adenine dinucleotide phosphate) or flavin-based (flavin adenine dinucleotide) cofactors. In this study, we showed that despite its short length, the βαβ motif unambiguously defines the specificity towards the cofactor. Following this observation, we trained two complementary deep learning models for the prediction of the cofactor specificity based on the sequence and structural features of the βαβ motif. A benchmark on two independent test sets, one containing βαβ motifs bearing no resemblance to those of the training set, and the other comprising 38 experimentally confirmed cases of rational design of the cofactor specificity, revealed the nearly perfect performance of the two methods. The Rossmann-toolbox protocols can be accessed via the webserver at https://lbs.cent.uw.edu.pl/rossmann-toolbox and are available as a Python package at https://github.com/labstructbioinf/rossmann-toolbox.


Subjects
Deep Learning, Flavin-Adenine Dinucleotide/chemistry, Flavin-Adenine Dinucleotide/metabolism, NAD/chemistry, NAD/metabolism, NADP/chemistry, NADP/metabolism, Proteins
9.
Int J Mol Sci ; 22(24), 2021 Dec 15.
Article in English | MEDLINE | ID: mdl-34948248

ABSTRACT

The bacterial proteins of the Dsb family catalyze the formation of disulfide bridges between cysteine residues that stabilize protein structures and ensure their proper functioning. Here, we report the detailed analysis of the Dsb pathway of Campylobacter jejuni. The oxidizing Dsb system of this pathogen is unique because it consists of two monomeric DsbAs (DsbA1 and DsbA2) and one dimeric bifunctional protein (C8J_1298). Previously, we showed that DsbA1 and C8J_1298 are redundant. Here, we unraveled the interaction between the two monomeric DsbAs by in vitro and in vivo experiments and by solving their structures and found that both monomeric DsbAs are dispensable proteins. Their structures confirmed that they are homologs of EcDsbL. The slight differences seen in the surface charge of the proteins do not affect the interaction with their redox partner. Comparative proteomics showed that several respiratory proteins, as well as periplasmic transport proteins, are targets of the Dsb system. Some of these, both donors and electron acceptors, are essential elements of the C. jejuni respiratory process under oxygen-limiting conditions in the host intestine. The data presented provide detailed information on the function of the C. jejuni Dsb system, identifying it as a potential target for novel antibacterial molecules.


Subjects
Oxidoreductases/metabolism, Periplasmic Proteins/metabolism, Protein Disulfide-Isomerases/genetics, Protein Disulfide-Isomerases/metabolism, Amino Acid Sequence, Bacterial Physiological Phenomena, Bacterial Proteins/metabolism, Campylobacter jejuni/pathogenicity, Campylobacter jejuni/physiology, Disulfides/metabolism, Oxidation-Reduction, Oxidoreductases/genetics, Periplasm/metabolism, Periplasmic Proteins/genetics, Amino Acid Sequence Homology
10.
PLoS Comput Biol ; 17(10): e1009502, 2021 10.
Article in English | MEDLINE | ID: mdl-34648493

ABSTRACT

While the slipknot topology in proteins has been known for over a decade, its evolutionary origin is still a mystery. We have identified a previously overlooked slipknot motif in a family of two-domain membrane transporters. Moreover, we found that these proteins are homologous to several families of unknotted membrane proteins. This allows us to directly investigate the evolution of the slipknot motif. Based on our comprehensive analysis of 17 distantly related protein families, we have found that slipknotted and unknotted proteins share a common structural motif. Furthermore, this motif is also conserved at the sequence level. Our results suggest that, regardless of topology, the proteins we studied evolved from a common unknotted single-domain ancestor. Our phylogenetic analysis suggests the presence of at least seven parallel evolutionary scenarios that led to the current diversity of the proteins in question. The tools we have developed in the process can now be used to investigate the evolution of other repeated-domain proteins.


Subjects
Antiporters, Molecular Evolution, Amino Acid Motifs, Antiporters/chemistry, Antiporters/genetics, Antiporters/metabolism, Computational Biology, Protein Databases, Phylogeny, Protein Conformation
11.
Bioinformatics ; 36(22-23): 5368-5376, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33325494

ABSTRACT

MOTIVATION: Coiled coils are widespread protein domains involved in diverse processes ranging from providing structural rigidity to the transduction of conformational changes. They comprise two or more α-helices that are wound around each other to form a regular supercoiled bundle. Owing to this regularity, coiled-coil structures can be described with parametric equations, thus enabling the numerical representation of their properties, such as the degree and handedness of supercoiling, rotational state of the helices, and the offset between them. These descriptors are invaluable in understanding the function of coiled coils and designing new structures of this type. The existing tools for such calculations require manual preparation of input and are therefore not suitable for the high-throughput analyses. RESULTS: To address this problem, we developed SamCC-Turbo, a software for fully automated, per-residue measurement of coiled coils. By surveying Protein Data Bank with SamCC-Turbo, we generated a comprehensive atlas of ∼50 000 coiled-coil regions. This machine learning-ready dataset features precise measurements as well as decomposes coiled-coil structures into fragments characterized by various degrees of supercoiling. The potential applications of SamCC-Turbo are exemplified by analyses in which we reveal general structural features of coiled coils involved in functions requiring conformational plasticity. Finally, we discuss further directions in the prediction and modeling of coiled coils. AVAILABILITY AND IMPLEMENTATION: SamCC-Turbo is available as a web server (https://lbs.cent.uw.edu.pl/samcc_turbo) and as a Python library (https://github.com/labstructbioinf/samcc_turbo), whereas the results of the Protein Data Bank scan can be browsed and downloaded at https://lbs.cent.uw.edu.pl/ccdb. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
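
Entry 11 notes that coiled-coil structures can be described with parametric equations. The sketch below generates an idealized Cα trace for a single helix of a coiled coil using one common form of the Crick parameterization. It is not taken from SamCC-Turbo: the function name `coiled_coil_trace` and all default parameter values are illustrative assumptions, and sign conventions for handedness differ between publications.

```python
import math

def coiled_coil_trace(n_residues, r0=5.0, r1=2.26, omega0_deg=-4.0,
                      omega1_deg=102.85, rise=1.51, phi0=0.0, phi1=0.0):
    """Return (ca_points, axis_points) for one idealized coiled-coil helix.

    r0         superhelix radius in angstroms (assumed value)
    r1         alpha-helix C-alpha radius in angstroms (assumed value)
    omega0_deg superhelical rotation per residue; negative = left-handed here
    omega1_deg rotation of a residue around its own helix axis per residue
    rise       rise per residue along the (curved) helix axis in angstroms
    """
    omega0 = math.radians(omega0_deg)
    omega1 = math.radians(omega1_deg)
    # Pitch angle follows from geometry: the helix axis advances `rise` per
    # residue, of which r0 * omega0 is the circumferential component.
    alpha = math.asin(r0 * omega0 / rise)
    ca, axis = [], []
    for t in range(n_residues):
        a0 = omega0 * t + phi0          # position around the superhelix axis
        a1 = omega1 * t + phi1          # position around the helix axis
        z_axis = rise * math.cos(alpha) * t
        axis.append((r0 * math.cos(a0), r0 * math.sin(a0), z_axis))
        x = (r0 * math.cos(a0) + r1 * math.cos(a0) * math.cos(a1)
             - r1 * math.cos(alpha) * math.sin(a0) * math.sin(a1))
        y = (r0 * math.sin(a0) + r1 * math.sin(a0) * math.cos(a1)
             + r1 * math.cos(alpha) * math.cos(a0) * math.sin(a1))
        z = z_axis - r1 * math.sin(alpha) * math.sin(a1)
        ca.append((x, y, z))
    return ca, axis

ca, axis = coiled_coil_trace(28)
# Every C-alpha should sit exactly r1 away from its point on the helix axis.
print(max(abs(math.dist(p, a) - 2.26) for p, a in zip(ca, axis)))
```

Fitting these parameters to observed coordinates, rather than generating coordinates from them, is the inverse problem that measurement tools of this kind address.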

12.
BMC Bioinformatics ; 21(1): 179, 2020 May 07.
Article in English | MEDLINE | ID: mdl-32381046

ABSTRACT

BACKGROUND: Protein repeats can confound sequence analyses because the repetitiveness of their amino acid sequences leads to difficulties in identifying whether similar repeats are due to convergent or divergent evolution. We noted that the patterns derived from traditional "dot plot" protein sequence self-similarity analysis tended to be conserved in sets of related repeat proteins and this conservation could be quantitated using a Jaccard metric. RESULTS: Comparison of these dot plots obviated the issues due to sequence similarity for analysis of repeat proteins. A high Jaccard similarity score was suggestive of a conserved relationship between closely related repeat proteins. The dot plot patterns decayed quickly in the absence of selective pressure with an expected loss of 50% of Jaccard similarity due to a loss of 8.2% sequence identity. To perform method testing, we assembled a standard set of 79 repeat proteins representing all the subgroups in RepeatsDB. Comparison of known repeat and non-repeat proteins from the PDB suggested that the information content in dot plots could be used to identify repeat proteins from pure sequence with no requirement for structural information. Analysis of the UniRef90 database suggested that 16.9% of all known proteins could be classified as repeat proteins. These 13.3 million putative repeat protein chains were clustered and a significant proportion (82.9%) of clusters containing between 5 and 200 members were of a single functional type. CONCLUSIONS: Dot plot analysis of repeat proteins attempts to obviate issues that arise due to the sequence degeneracy of repeat proteins. These results show that this kind of analysis can efficiently be applied to analyze repeat proteins on a large scale.


Subjects
Conserved Sequence, Molecular Evolution, Proteins/chemistry, Amino Acid Repetitive Sequences, Amino Acid Sequence, Protein Databases, Mutation/genetics
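
The comparison described in entry 12, scoring the overlap of two self-similarity dot plots with a Jaccard metric, can be sketched as follows. This is a simplified illustration and not the authors' pipeline: the exact-match window of three residues, the function names `dot_plot` and `jaccard`, and the assumption that the two sequences have equal length and are already in register are all choices made here for brevity.

```python
def dot_plot(seq, window=3):
    """Self-similarity dot plot as a set of (i, j) pairs with i < j.

    A dot is placed wherever the same `window`-residue word starts at both
    position i and position j (exact matching; an assumption of this sketch).
    """
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    dots = set()
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.add((starts[a], starts[b]))
    return dots

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

perfect_repeat = "GTPLGTPLGTPL"   # three identical 4-residue repeats
mutated_repeat = "GTPLGTALGTPL"   # one substitution in the middle repeat
print(jaccard(dot_plot(perfect_repeat), dot_plot(mutated_repeat)))  # 0.375
```

In this toy case a single substitution removes five of the eight dots, which gives an intuition for why the abstract reports that dot plot patterns decay quickly as sequence identity is lost.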
13.
PLoS One ; 15(3): e0230366, 2020.
Article in English | MEDLINE | ID: mdl-32203539

ABSTRACT

Posttranslational generation of disulfide bonds catalyzed by bacterial Dsb (disulfide bond) enzymes is essential for the oxidative folding of many proteins. Although we now have a good understanding of the Escherichia coli disulfide bond formation system, there are significant gaps in our knowledge concerning the Dsb systems of other bacteria, including Campylobacter jejuni, a food-borne, zoonotic pathogen. We attempted to gain a more complete understanding of the process by thorough analysis of C8J_1298 functioning in vitro and in vivo. C8J_1298 is a homodimeric thiol-oxidoreductase present in wild type (wt) cells, in both reduced and oxidized forms. The protein was previously described as a homolog of DsbC, and thus potentially should be active in rearrangement of disulfides. Indeed, biochemical studies with purified protein revealed that C8J_1298 shares many properties with EcDsbC. However, its activity in vivo is dependent on the genetic background, namely, the set of other Dsb proteins present in the periplasm that determine the redox conditions. In wt C. jejuni cells, C8J_1298 potentially works as a DsbG involved in the control of the cysteine sulfenylation level and protecting single cysteine residues from oxidation to sulfenic acid. A strain lacking only C8J_1298 is indistinguishable from the wild type strain by several assays recognized as the criteria to determine isomerization or oxidative Dsb pathways. Remarkably, in a C. jejuni strain lacking DsbA1, the protein involved in generation of disulfides, C8J_1298 acts as an oxidase, similar to the homodimeric oxidoreductase of Helicobacter pylori, HP0231. In E. coli, C8J_1298 acts as a bifunctional protein, also resembling HP0231. These findings are strongly supported by phylogenetic data. We also showed that CjDsbD (C8J_0565) is a C8J_1298 redox partner.


Subjects
Campylobacter jejuni/enzymology, Disulfides/metabolism, Periplasmic Proteins/metabolism, Protein Disulfide Reductase (Glutathione)/metabolism, Amino Acid Sequence, Campylobacter jejuni/genetics, Escherichia coli/enzymology, Escherichia coli/genetics, Helicobacter pylori/enzymology, Helicobacter pylori/genetics, Oxidation-Reduction, Periplasm/enzymology, Periplasmic Proteins/genetics, Phylogeny, Protein Disulfide Reductase (Glutathione)/genetics
14.
Sci Rep ; 9(1): 11533, 2019 08 08.
Article in English | MEDLINE | ID: mdl-31395899

ABSTRACT

LGMD2L is a subtype of limb-girdle muscular dystrophy (LGMD), caused by recessive mutations in ANO5, encoding anoctamin-5 (ANO5). We present the analysis of five patients with skeletal muscle weakness for whom heterozygous mutations within ANO5 were identified by whole exome sequencing (WES). Patients varied in the age of the disease onset (from 22 to 38 years) and severity of the morphological and clinical phenotypes. Out of the nine detected mutations one was novel (missense p.Lys132Met, accompanied by p.His841Asp) and one was not yet characterized in the literature (nonsense, p.Trp401Ter, accompanied by p.Asp81Gly). The p.Asp81Gly mutation was also identified in another patient carrying a p.Arg758Cys mutation as well. Also, a c.191dupA frameshift (p.Asn64LysfsTer15), the first described and common mutation was identified. Mutations were predicted by in silico tools to have damaging effects and are likely pathogenic according to criteria of the American College of Medical Genetics and Genomics (ACMG). Indeed, molecular modeling of mutations revealed substantial changes in ANO5 conformation that could affect the protein structure and function. In addition, variants in other genes associated with muscle pathology were identified, possibly affecting the disease progress. The presented data indicate that the identified ANO5 mutations contribute to the observed muscle pathology and broaden the genetic spectrum of LGMD myopathies.


Subjects
Anoctamins/ultrastructure, Genetic Predisposition to Disease, Skeletal Muscle/ultrastructure, Limb-Girdle Muscular Dystrophies/genetics, Adult, Anoctamins/genetics, Chloride Channels/genetics, Computational Biology, Female, Heterozygote, Humans, Male, Skeletal Muscle/diagnostic imaging, Skeletal Muscle/pathology, Limb-Girdle Muscular Dystrophies/diagnostic imaging, Limb-Girdle Muscular Dystrophies/pathology, Mutation/genetics, Phenotype, Poland/epidemiology, Young Adult
15.
Sci Rep ; 9(1): 6888, 2019 05 03.
Article in English | MEDLINE | ID: mdl-31053765

ABSTRACT

Canonical π-helices are short, relatively unstable secondary structure elements found in proteins. They comprise seven or more residues and are present in 15% of all known protein structures, often in functionally important regions such as ligand- and ion-binding sites. Given their similarity to α-helices, the prediction of π-helices is a challenging task and none of the currently available secondary structure prediction methods tackle it. Here, we present PiPred, a neural network-based tool for predicting π-helices in protein sequences. By performing a rigorous benchmark we show that PiPred can detect π-helices with a per-residue precision of 48% and sensitivity of 46%. Interestingly, some of the α-helices mispredicted by PiPred as π-helices exhibit a geometry characteristic of π-helices. Also, despite being trained only with canonical π-helices, PiPred can identify 6-residue-long α/π-bulges. These observations suggest an even higher effective precision of the method and demonstrate that π-helices, α/π-bulges, and other helical deformations may impose similar constraints on sequences. PiPred is freely accessible at: https://toolkit.tuebingen.mpg.de/#/tools/quick2d. A standalone version is available for download at: https://github.com/labstructbioinf/PiPred, where we also provide the CB6133, CB513, CASP10, and CASP11 datasets, commonly used for training and validation of secondary structure prediction methods, with correctly annotated π-helices.


Subjects
Computational Biology/methods, Deep Learning, Proteins/chemistry, Amino Acid Sequence, Molecular Models, Alpha-Helical Protein Conformation
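
Entry 15 reports per-residue precision and sensitivity for π-helix prediction. As a reminder of how such figures are computed, here is a minimal sketch; it is not PiPred's evaluation code. The function name `per_residue_metrics` and the example strings are hypothetical, and the label "I" for π-helix residues follows the DSSP convention.

```python
def per_residue_metrics(true_labels, predicted_labels, positive="I"):
    """Per-residue precision and sensitivity (recall) for one class.

    Both inputs are equal-length strings of secondary structure labels;
    `positive` is the label of interest ("I" = pi-helix in DSSP notation).
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label strings must have the same length")
    tp = fp = fn = 0
    for t, p in zip(true_labels, predicted_labels):
        if p == positive and t == positive:
            tp += 1          # correctly predicted pi-helix residue
        elif p == positive:
            fp += 1          # predicted pi-helix, actually something else
        elif t == positive:
            fn += 1          # missed pi-helix residue
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return precision, sensitivity

# Hypothetical 16-residue example (H = alpha-helix, I = pi-helix, - = other).
observed  = "HHHHIIIIIHHHH---"
predicted = "HHHIIIIHHHHHII--"
print(per_residue_metrics(observed, predicted))  # (0.5, 0.6)
```

Precision answers "of the residues called π-helical, how many really are?", while sensitivity answers "of the true π-helical residues, how many were found?".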
16.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30649297

ABSTRACT

RNA-recognition motif (RRM) is an RNA-interacting protein domain that plays an important role in the processes of RNA metabolism such as the splicing, editing, export, degradation, and regulation of translation. Here, we present the RNA-recognition motif database (RRMdb), which affords rapid identification and annotation of RRM domains in a given protein sequence. The RRMdb database is compiled from ~57 000 collected representative RRM domain sequences, classified into 415 families. Whenever possible, the families are associated with the available literature and structural data. Moreover, the RRM families are organized into a network of sequence similarities that allows for the assessment of the evolutionary relationships between them.


Subjects
Protein Databases, RNA Recognition Motif, Protein Sequence Analysis/methods, Software, Internet
17.
Bioinformatics ; 35(16): 2790-2795, 2019 08 15.
Article in English | MEDLINE | ID: mdl-30601942

ABSTRACT

MOTIVATION: Coiled coils are protein structural domains that mediate a plethora of biological interactions, and thus their reliable annotation is crucial for studies of protein structure and function. RESULTS: Here, we report DeepCoil, a new neural network-based tool for the detection of coiled-coil domains in protein sequences. In our benchmarks, DeepCoil significantly outperformed current state-of-the-art tools, such as PCOILS and Marcoil, both in the prediction of canonical and non-canonical coiled coils. Furthermore, in a scan of the human genome with DeepCoil, we detected many coiled-coil domains that remained undetected by other methods. This higher sensitivity of DeepCoil should make it a method of choice for accurate genome-wide detection of coiled-coil domains. AVAILABILITY AND IMPLEMENTATION: DeepCoil is written in Python and utilizes the Keras machine learning library. A web server is freely available at https://toolkit.tuebingen.mpg.de/#/tools/deepcoil and a standalone version can be downloaded at https://github.com/labstructbioinf/DeepCoil. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software, Amino Acid Sequence, Humans, Machine Learning, Protein Domains, Proteins
18.
J Struct Biol ; 204(1): 117-124, 2018 10.
Article in English | MEDLINE | ID: mdl-30042011

ABSTRACT

In protein modelling and design, an understanding of the relationship between sequence and structure is essential. Using parallel, homotetrameric coiled-coil structures as a model system, we demonstrated that machine learning techniques can be used to predict structural parameters directly from the sequence. Coiled coils are regular protein structures, which are of great interest as building blocks for assembling larger nanostructures. They are composed of two or more alpha-helices wrapped around each other to form a supercoiled bundle. The coiled-coil bundles are defined by four basic structural parameters: topology (parallel or antiparallel), radius, degree of supercoiling, and the rotation of helices around their axes. In parallel coiled coils the latter parameter, describing the hydrophobic core packing geometry, was assumed to show little variation. However, we found that subtle differences between structures of this type were not artifacts of structure determination and could be predicted directly from the sequence. Using this information in modelling narrows the structural parameter space that must be searched and thus significantly reduces the required computational time. Moreover, the sequence-structure rules can be used to explain the effects of point mutations and to shed light on the relationship between hydrophobic core architecture and coiled-coil topology.


Subjects
Proteins/chemistry, Hydrophobic and Hydrophilic Interactions, Machine Learning, Protein Secondary Structure
19.
PLoS One ; 13(4): e0195358, 2018.
Article in English | MEDLINE | ID: mdl-29677198

ABSTRACT

Helicobacter pylori HP0377 is a thiol oxidoreductase, a member of the CcmG family involved in cytochrome biogenesis, as previously shown by in vitro experiments. In this report, we document that HP0377 also acts in vivo in the cytochrome assembly process in Bacillus subtilis, where it complements the lack of ResA. However, unlike other characterized proteins in this family, HP0377 is a dithiol reductase and isomerase. We elucidated how the amino acid composition of its active site modulates its functionality. We demonstrated that cis-proline (P156) is involved in its interaction with the redox partner (CcdA), as a P156T HP0377 variant is inactive in vivo and is present in the oxidized form in B. subtilis. Furthermore, we showed that engineering the HP0377 active motif by changing CSYC motif into CSYS or SSYC, clearly diminishes two activities (reduction and isomerization) of the protein. Whereas HP0377CSYA is inactive in reduction as well as in isomerization, HP0377CSYS retains reductive activity. Also, replacement of F95 by Q decreases its ability to regenerate scRNase and does not influence the reductive activity of HP0377CSYS towards apocytochrome c. HP0377 is also distinguished from other CcmGs as it forms a 2:1 complex with apocytochrome c. Phylogenetic analyses showed that, although HP0377 is capable of complementing ResA in Bacillus subtilis, its thioredoxin domain has a different origin, presumably common to DsbC.


Subjects
Bacterial Proteins/metabolism, Helicobacter pylori/enzymology, Oxidoreductases/metabolism, Amino Acid Motifs, Amino Acid Sequence, Bacillus subtilis/genetics, Bacillus subtilis/metabolism, Bacterial Proteins/genetics, Molecular Cloning, Computational Biology, Cytochromes c/metabolism, Escherichia coli, Helicobacter pylori/genetics, Isoenzymes, Mutagenesis, Oxidation-Reduction, Oxidoreductases/genetics, Phylogeny
20.
J Struct Biol ; 203(1): 54-61, 2018 07.
Article in English | MEDLINE | ID: mdl-29454111

ABSTRACT

Computational protein design is a set of procedures for computing amino acid sequences that will fold into a specified structure. Rosetta Design, a commonly used software package for protein design, allows for the effective identification of sequences compatible with a given backbone structure, while molecular dynamics (MD) simulations can thoroughly sample near-native conformations. We benchmarked a procedure in which Rosetta Design is started on MD-derived structural ensembles and showed that such a combined approach generates 20-30% more diverse sequences than currently available methods with only a slight increase in computation time. Importantly, the increase in diversity is achieved without a loss in the quality of the designed sequences assessed by their resemblance to natural sequences. We demonstrate that the MD-based procedure is also applicable to de novo design tasks started from backbone structures without any sequence information. In addition, we implemented a protocol that can be used to assess the stability of designed models and to select the best candidates for experimental validation. In sum, our results demonstrate that the MD ensemble-based flexible backbone design can be a viable method for protein design, especially for tasks that require a large pool of diverse sequences.


Subjects
Molecular Dynamics Simulation, Protein Engineering/methods, Software, Amino Acid Sequence, Protein Sequence Analysis