Search | BVS Regional Portal (test)

PanPA: generation and alignment of panproteome graphs.

Dabbaghie, Fawaz; Srikakulam, Sanjay K; Marschall, Tobias; Kalinina, Olga V.

Bioinform Adv ; 3(1): vbad167, 2023.

Article in English | MEDLINE | ID: mdl-38145107

ABSTRACT

Motivation: Compared to eukaryotes, prokaryote genomes are more diverse through different mechanisms, including a higher mutation rate and horizontal gene transfer. Therefore, using a linear representative reference can cause a reference bias. Graph-based pangenome methods have been developed to tackle this problem. However, comparisons in DNA space are still challenging due to this high diversity. In contrast, amino acid sequences have higher similarity due to evolutionary constraints, whereby a single amino acid may be encoded by several synonymous codons. Coding regions cover the majority of the genome in prokaryotes. Thus, panproteomes present an attractive alternative leveraging the higher sequence similarity while not losing much of the genome in non-coding regions. Results: We present PanPA, a method that takes a set of multiple sequence alignments of protein sequences, indexes them, and builds a graph for each multiple sequence alignment. In the querying step, it can align DNA or amino acid sequences back to these graphs. We first showcase that PanPA generates correct alignments on a panproteome from 1350 Escherichia coli. To demonstrate that panproteomes allow comparisons at longer phylogenetic distances, we compare DNA and protein alignments from 1073 Salmonella enterica assemblies against the E. coli reference genome, pangenome, and panproteome using BWA, GraphAligner, and PanPA, respectively, with PanPA aligning around 22% more sequences. We also aligned a DNA short-read whole genome sequencing (WGS) sample from S. enterica against the E. coli reference with BWA and against the panproteome with PanPA, where PanPA was able to find alignments for 68% of the reads compared to 5% with BWA. Availability and implementation: PanPA is available at https://github.com/fawaz-dabbaghieh/PanPA.
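The abstract's point that synonymous codons make protein sequences more similar than the underlying DNA can be illustrated with a small sketch. This is not PanPA code: the two DNA sequences, the `translate` helper, and the `identity` helper are hypothetical and written only for this example, and the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Two made-up coding sequences that differ only at synonymous positions.
seq_a = "ATGGCTCTGAAAGGT"
seq_b = "ATGGCCTTAAAGGGA"

dna_identity = identity(seq_a, seq_b)
protein_identity = identity(translate(seq_a), translate(seq_b))
print(f"DNA identity:     {dna_identity:.2f}")
print(f"Protein identity: {protein_identity:.2f}")
```

In this toy case the two DNA sequences disagree at several positions, yet both translate to the same peptide, which is the effect the abstract relies on when it moves comparisons from DNA space to protein space.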

MetaProFi: an ultrafast chunked Bloom filter for storing and querying protein and nucleotide sequence data for accurate identification of functionally relevant genetic variants.

Srikakulam, Sanjay K; Keller, Sebastian; Dabbaghie, Fawaz; Bals, Robert; Kalinina, Olga V.

Bioinformatics ; 39(3), 2023 03 01.

Article in English | MEDLINE | ID: mdl-36825843

ABSTRACT

MOTIVATION: Bloom filters are a popular data structure that allows rapid searches in large sequence datasets. So far, all tools work with nucleotide sequences; however, protein sequences are conserved over longer evolutionary distances, and only mutations on the protein level may have any functional significance. RESULTS: We present MetaProFi, a Bloom filter-based tool that, for the first time, offers the functionality to build indexes of amino acid sequences and query them with both amino acid and nucleotide sequences, thus bringing sequence comparison to the biologically relevant protein level. MetaProFi implements additional efficient engineering solutions, such as a shared memory system, chunked data storage and efficient compression. In addition to its conceptual novelty, MetaProFi demonstrates state-of-the-art performance and excellent memory consumption-to-speed ratio when applied to various large datasets. AVAILABILITY AND IMPLEMENTATION: Source code in Python is available at https://github.com/kalininalab/metaprofi.

Subjects

Algorithms , Data Compression , Base Sequence , Software , Proteins
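As a rough illustration of the idea in the MetaProFi abstract above, the following sketch builds a Bloom filter over amino acid k-mers and checks what fraction of a query's k-mers it contains. It is a minimal sketch and not MetaProFi's implementation or API: the `BloomFilter` class, the `kmers` and `fraction_found` helpers, the filter size, the number of hash functions, the k-mer length, and the example sequences are all assumptions made for this example. Chunked storage, shared memory, and compression are not modeled.

```python
import hashlib


class BloomFilter:
    """A plain (unchunked, uncompressed) Bloom filter over strings."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several bit positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq: str, k: int):
    """Yield all overlapping substrings of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def fraction_found(bloom: BloomFilter, query: str, k: int) -> float:
    """Fraction of the query's k-mers that the filter reports as present."""
    query_kmers = list(kmers(query, k))
    if not query_kmers:
        return 0.0
    return sum(km in bloom for km in query_kmers) / len(query_kmers)


K = 5  # hypothetical k-mer length
indexed_protein = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up sequence
bloom = BloomFilter()
for km in kmers(indexed_protein, K):
    bloom.add(km)

hit = fraction_found(bloom, "AYIAKQRQISF", K)      # substring of the indexed protein
miss = fraction_found(bloom, "GGWPLDNCEVHTMG", K)  # unrelated made-up sequence
print(hit, miss)
```

A nucleotide query would first have to be translated into amino acids before its k-mers are looked up; that step is left out here.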

d-StructMAn: Containerized structural annotation on the scale from genetic variants to whole proteomes.

Gress, Alexander; Srikakulam, Sanjay K; Keller, Sebastian; Ramensky, Vasily; Kalinina, Olga V.

Gigascience ; 11, 2022 09 20.

Article in English | MEDLINE | ID: mdl-36130085

ABSTRACT

BACKGROUND: Structural annotation of genetic variants in the context of intermolecular interactions and protein stability can shed light on mechanisms of disease-related phenotypes. Three-dimensional structures of related proteins in complexes with other proteins, nucleic acids, or ligands enrich such functional interpretation, since intermolecular interactions are well conserved in evolution. RESULTS: We present d-StructMAn, a novel computational method that enables structural annotation of local genetic variants, such as single-nucleotide variants and in-frame indels, and implement it in a highly efficient and user-friendly tool provided as a Docker container. Using d-StructMAn, we annotated several very large sets of human genetic variants, including all variants from ClinVar and all amino acid positions in the human proteome. We were able to provide annotation for more than 46% of positions in the human proteome, representing over 60% of proteins. CONCLUSIONS: d-StructMAn is the first of its kind and a highly efficient tool for structural annotation of protein-coding genetic variation in the context of observed and potential intermolecular interactions. d-StructMAn is readily applicable to proteome-scale datasets and can be instrumental in building machine-learning tools for predicting genotype-to-phenotype relationships.

Subjects

Nucleic Acids , Proteome , Amino Acids , Genetic Variation , Humans , Molecular Sequence Annotation , Nucleotides

Epistatic interactions promote persistence of NS3-Q80K in HCV infection by compensating for protein folding instability.

Dultz, Georg; Srikakulam, Sanjay K; Konetschnik, Michael; Shimakami, Tetsuro; Doncheva, Nadezhda T; Dietz, Julia; Sarrazin, Christoph; Biondi, Ricardo M; Zeuzem, Stefan; Tampé, Robert; Kalinina, Olga V; Welsch, Christoph.

J Biol Chem ; 297(3): 101031, 2021 09.

Article in English | MEDLINE | ID: mdl-34339738

ABSTRACT

The Q80K polymorphism in the NS3-4A protease of the hepatitis C virus is associated with treatment failure of direct-acting antiviral agents. This polymorphism is highly prevalent in genotype 1a infections and stably transmitted between hosts. Here, we investigated the underlying molecular mechanisms of evolutionarily conserved coevolving amino acids in NS3-Q80K and revealed potential implications of epistatic interactions in immune escape and variant persistence. Using purified protein, we characterized the impact of epistatic amino acid substitutions on the physicochemical properties and peptide cleavage kinetics of the NS3-Q80K protease. We found that Q80K destabilized the protease protein fold (p < 0.0001). Although NS3-Q80K showed reduced peptide substrate turnover (p < 0.0002), replicative fitness in an H77S.3 cell culture model of infection was not significantly inferior to the WT virus. Epistatic substitutions at residues 91 and 174 in NS3-Q80K stabilized the protein fold (p < 0.0001) and leveraged the WT protease stability. However, changes in protease stability inversely correlated with enzymatic activity. In infectious cell culture, these secondary substitutions were not associated with a gain of replicative fitness in NS3-Q80K variants. Using molecular dynamics, we observed that the total number of residue contacts in NS3-Q80K mutants correlated with protein folding stability. Changes in the number of contacts reflected the compensatory effect on protein folding instability by epistatic substitutions. In summary, epistatic substitutions in NS3-Q80K contribute to viral fitness by mechanisms not directly related to RNA replication. By compensating for protein-folding instability, epistatic interactions likely protect NS3-Q80K variants from immune cell recognition.

Subjects

Epistasis, Genetic , Hepacivirus/genetics , Hepatitis C/virology , Amino Acid Substitution , Genes, Viral , Humans , Molecular Dynamics Simulation , Mutation , Polymorphism, Genetic , Viral Nonstructural Proteins/chemistry , Viral Nonstructural Proteins/genetics

A shift of dynamic equilibrium between the KIT active and inactive states causes drug resistance.

Srikakulam, Sanjay K; Bastys, Tomas; Kalinina, Olga V.

Proteins ; 88(11): 1434-1446, 2020 11.

Article in English | MEDLINE | ID: mdl-32530065

ABSTRACT

Tyrosine phosphorylation, a highly regulated post-translational modification, is carried out by the enzyme tyrosine kinase (TK). TKs are important mediators in signaling cascades, facilitating diverse biological processes in response to stimuli. TKs may acquire mutations leading to malignancy and are viable targets for anti-cancer drugs. Mast/stem cell growth factor receptor KIT is a TK involved in cell differentiation, whose dysregulation leads to various types of cancer, including gastrointestinal stromal tumors, leukemia, and melanoma. KIT can be targeted by a range of inhibitors that predominantly bind to the inactive state of the enzyme. The Y823D mutation in the activation loop of KIT is known to be responsible for the loss of sensitivity to some drugs in metastatic tumors. We used all-atom molecular dynamics simulations to study the impact of Y823D on the KIT conformation and dynamics and compared it to the effect of phosphorylation of Y823. We simulated in total 6.4 µs of wild-type, mutant and phosphorylated KIT in the active- and inactive-state conformations. We found that Y823D affects the protein dynamics differently in the two states: in the active state, the mutation increases the protein stability, whereas in the inactive state it induces local destabilization, thus shifting the dynamic equilibrium towards the active state and altering the communication between distant regulatory regions. The observed dynamics of the Y823D mutant is similar to the dynamics of KIT phosphorylated at position Y823; we therefore hypothesize that this mutation mimics a constitutively active kinase, which is not responsive to inhibitors that bind its inactive conformation.

Subjects

Antineoplastic Agents/chemistry , Aspartic Acid/chemistry , Protein Kinase Inhibitors/chemistry , Protein Processing, Post-Translational , Proto-Oncogene Proteins c-kit/chemistry , Tyrosine/chemistry , Antineoplastic Agents/metabolism , Aspartic Acid/metabolism , Binding Sites , Databases, Protein , Drug Resistance, Neoplasm/genetics , Humans , Hydrogen Bonding , Ligands , Molecular Dynamics Simulation , Mutation , Phosphorylation , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Protein Kinase Inhibitors/metabolism , Protein Stability , Proto-Oncogene Proteins c-kit/antagonists & inhibitors , Proto-Oncogene Proteins c-kit/genetics , Proto-Oncogene Proteins c-kit/metabolism , Substrate Specificity , Thermodynamics , Tyrosine/metabolism
