1.
Proc Natl Acad Sci U S A ; 121(21): e2400260121, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38743624

ABSTRACT

We introduce ZEPPI (Z-score Evaluation of Protein-Protein Interfaces), a framework to evaluate structural models of a complex based on sequence coevolution and conservation involving residues in protein-protein interfaces. The ZEPPI score is calculated by comparing metrics for an interface to those obtained from randomly chosen residues. Since contacting residues are defined by the structural model, this obviates the need to account for indirect interactions. Further, although ZEPPI relies on species-paired multiple sequence alignments, its focus on interfacial residues allows it to leverage quite shallow alignments. ZEPPI can be implemented on a proteome-wide scale and is applied here to millions of structural models of dimeric complexes in the Escherichia coli and human interactomes found in the PrePPI database. PrePPI's scoring function is based primarily on the evaluation of protein-protein interfaces, and ZEPPI adds a new feature to this analysis through the incorporation of evolutionary information. ZEPPI performance is evaluated through applications to experimentally determined complexes and to decoys from the CASP-CAPRI experiment. As we discuss, the standard CAPRI scores used to evaluate docking models are based on model quality and not on the ability to give yes/no answers as to whether two proteins interact. ZEPPI is able to detect weak signals from PPI models that the CAPRI scores define as incorrect and, similarly, to identify potential PPIs defined as low confidence by the current PrePPI scoring function. A number of examples that illustrate how the combination of PrePPI and ZEPPI can yield functional hypotheses are provided.


Subjects
Proteome; Proteome/metabolism; Humans; Protein Interaction Mapping/methods; Models, Molecular; Escherichia coli/metabolism; Escherichia coli/genetics; Databases, Protein; Protein Binding; Escherichia coli Proteins/metabolism; Escherichia coli Proteins/chemistry; Escherichia coli Proteins/genetics; Proteins/chemistry; Proteins/metabolism; Sequence Alignment
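
The ZEPPI abstract above describes scoring an interface by comparing its metrics with those of randomly chosen residues. A minimal sketch of that idea follows. It assumes a single per-residue-pair score (for example a coevolution measure) and a background built by repeatedly sampling the same number of pairs at random. The function name, the input layout, and the sampling scheme are illustrative assumptions, not the published ZEPPI procedure.

```python
import random
import statistics


def interface_z_score(pair_scores, interface_pairs, n_random=1000, seed=0):
    """Z-score of the mean interface score against random residue-pair sets.

    pair_scores: dict mapping (i, j) residue-index pairs to a score
        (hypothetical input, e.g. a coevolution measure).
    interface_pairs: list of (i, j) pairs in contact in the structural model.
    """
    rng = random.Random(seed)
    all_pairs = list(pair_scores)
    k = len(interface_pairs)
    observed = statistics.fmean(pair_scores[p] for p in interface_pairs)

    # Background distribution: mean score of k randomly chosen pairs,
    # repeated n_random times (an assumed sampling scheme).
    background = [
        statistics.fmean(pair_scores[p] for p in rng.sample(all_pairs, k))
        for _ in range(n_random)
    ]
    mu = statistics.fmean(background)
    sigma = statistics.stdev(background)
    if sigma == 0:
        raise ValueError("background has zero variance; Z-score is undefined")
    return (observed - mu) / sigma
```

Because the contacting pairs come from the structural model, the sketch never has to decide which coevolving pairs are direct contacts; it only asks whether the model's interface scores higher than chance.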
2.
Res Sq ; 2023 Sep 18.
Article in English | MEDLINE | ID: mdl-37790387

ABSTRACT

We introduce ZEPPI (Z-score Evaluation of Protein-Protein Interfaces), a framework to evaluate structural models of a complex based on sequence co-evolution and conservation involving residues in protein-protein interfaces. The ZEPPI score is calculated by comparing metrics for an interface to those obtained from randomly chosen residues. Since contacting residues are defined by the structural model, this obviates the need to account for indirect interactions. Further, although ZEPPI relies on species-paired multiple sequence alignments, its focus on interfacial residues allows it to leverage quite shallow alignments. ZEPPI performance is evaluated through applications to experimentally determined complexes and to decoys from the CASP-CAPRI experiment. ZEPPI can be implemented on a proteome-wide scale as evidenced by calculations on millions of structural models of dimeric complexes in the E. coli and human interactomes found in the PrePPI database. A number of examples that illustrate how these tools can yield novel functional hypotheses are provided.

3.
bioRxiv ; 2023 02 28.
Article in English | MEDLINE | ID: mdl-36909476

ABSTRACT

We present an updated version of the Predicting Protein-Protein Interactions (PrePPI) webserver which predicts PPIs on a proteome-wide scale. PrePPI combines structural and non-structural clues within a Bayesian framework to compute a likelihood ratio (LR) for essentially every possible pair of proteins in a proteome; the current database is for the human interactome. The structural modeling (SM) clue is derived from template-based modeling and its application on a proteome-wide scale is enabled by a unique scoring function used to evaluate a putative complex. The updated version of PrePPI leverages AlphaFold structures that are parsed into individual domains. As has been demonstrated in earlier applications, PrePPI performs extremely well as measured by receiver operating characteristic curves derived from testing on E. coli and human protein-protein interaction (PPI) databases. A PrePPI database of ~1.3 million human PPIs can be queried with a webserver application that comprises multiple functionalities for examining query proteins, template complexes, 3D models for predicted complexes, and related features (https://honiglab.c2b2.columbia.edu/PrePPI). PrePPI is a state-of-the-art resource that offers an unprecedented structure-informed view of the human interactome.

4.
J Mol Biol ; 435(14): 168052, 2023 07 15.
Article in English | MEDLINE | ID: mdl-36933822

ABSTRACT

We present an updated version of the Predicting Protein-Protein Interactions (PrePPI) webserver which predicts PPIs on a proteome-wide scale. PrePPI combines structural and non-structural evidence within a Bayesian framework to compute a likelihood ratio (LR) for essentially every possible pair of proteins in a proteome; the current database is for the human interactome. The structural modeling (SM) component is derived from template-based modeling and its application on a proteome-wide scale is enabled by a unique scoring function used to evaluate a putative complex. The updated version of PrePPI leverages AlphaFold structures that are parsed into individual domains. As has been demonstrated in earlier applications, PrePPI performs extremely well as measured by receiver operating characteristic curves derived from testing on E. coli and human protein-protein interaction (PPI) databases. A PrePPI database of ∼1.3 million human PPIs can be queried with a webserver application that comprises multiple functionalities for examining query proteins, template complexes, 3D models for predicted complexes, and related features (https://honiglab.c2b2.columbia.edu/PrePPI). PrePPI is a state-of-the-art resource that offers an unprecedented structure-informed view of the human interactome.


Subjects
Databases, Protein; Protein Interaction Mapping; Proteome; Humans; Bayes Theorem; Escherichia coli/metabolism; Proteome/metabolism
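
The PrePPI abstract above mentions combining structural and non-structural evidence within a Bayesian framework to obtain a likelihood ratio (LR). One common way to do this, sketched here as an assumption rather than as the PrePPI implementation, is a naive Bayes combination: if the evidence sources are treated as conditionally independent, their LRs multiply. The evidence names in the usage note are hypothetical.

```python
import math


def combined_likelihood_ratio(evidence_lrs):
    """Combine per-evidence likelihood ratios under a naive Bayes assumption.

    evidence_lrs: dict mapping an evidence name to its LR (must be positive).
    The product is accumulated in log space to avoid overflow or underflow
    when many sources are combined.
    """
    if any(lr <= 0 for lr in evidence_lrs.values()):
        raise ValueError("likelihood ratios must be positive")
    return math.exp(sum(math.log(lr) for lr in evidence_lrs.values()))
```

For instance, `combined_likelihood_ratio({"structural_modeling": 50.0, "coexpression": 2.0, "orthology": 0.5})` multiplies the three values, so evidence with an LR below 1 lowers the overall score and evidence above 1 raises it.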
5.
Protein Sci ; 32(4): e4594, 2023 04.
Article in English | MEDLINE | ID: mdl-36776141

ABSTRACT

We describe the Predicting Protein-Compound Interactions (PrePCI) database which comprises over 5 billion predicted interactions between 6.8 million chemical compounds and 19,797 human proteins. PrePCI relies on a proteome-wide database of structural models based on both traditional modeling techniques and the AlphaFold Protein Structure Database. Sequence- and structural similarity-based metrics are established between template proteins, T, in the Protein Data Bank that bind compounds, C, and query proteins in the model database, Q. When the metrics exceed threshold values, it is assumed that C also binds to Q with a likelihood ratio (LR) derived from machine learning. If the relationship is based on structural similarity, the LR is based on a scoring function that measures the extent to which C is compatible with the binding site of Q as described in the LT-scanner algorithm. For every predicted complex derived in this way, chemical similarity based on the Tanimoto coefficient identifies other small molecules that may bind to Q. An overall LR for the binding of C to Q is obtained from Naive Bayesian statistics. The PrePCI database can be queried by entering a UniProt ID or gene name for a protein to obtain a list of compounds predicted to bind to it along with associated LRs. Alternatively, entering an identifier for the compound outputs a list of proteins it is predicted to bind. Specific applications of the database to lead discovery, elucidation of drug mechanism of action, and biological function annotation are described.


Subjects
Databases, Chemical; Proteins; Humans; Bayes Theorem; Proteins/chemistry; Algorithms; Databases, Protein
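
The PrePCI abstract above uses the Tanimoto coefficient to find small molecules chemically similar to a compound already predicted to bind. For binary fingerprints the coefficient is the number of shared on-bits divided by the number of on-bits in either fingerprint. The sketch below assumes fingerprints are given as collections of on-bit indices, which is an illustrative representation only; the abstract does not say which fingerprints PrePCI uses.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two binary fingerprints.

    fp_a, fp_b: iterables of on-bit indices (hypothetical representation).
    Returns 0.0 when both fingerprints are empty, a convention chosen here
    to avoid division by zero.
    """
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)
```

Two fingerprints with on-bits `{1, 2, 3, 4}` and `{3, 4, 5, 6}` share two of six distinct bits, giving a similarity of one third.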
6.
J Biol Chem ; 296: 100562, 2021.
Article in English | MEDLINE | ID: mdl-33744294

ABSTRACT

Systems biology is a data-heavy field that focuses on systems-wide depictions of biological phenomena necessarily sacrificing a detailed characterization of individual components. As an example, genome-wide protein interaction networks are widely used in systems biology and continuously extended and refined as new sources of evidence become available. Despite the vast amount of information about individual protein structures and protein complexes that has accumulated in the past 50 years in the Protein Data Bank, the data, computational tools, and language of structural biology are not an integral part of systems biology. However, increasing effort has been devoted to this integration, and the related literature is reviewed here. Relationships between proteins that are detected via structural similarity offer a rich source of information not available from sequence similarity, and homology modeling can be used to leverage Protein Data Bank structures to produce 3D models for a significant fraction of many proteomes. A number of structure-informed genomic and cross-species (i.e., virus-host) interactomes will be described, and the unique information they provide will be illustrated with a number of examples. Tissue- and tumor-specific interactomes have also been developed through computational strategies that exploit patient information and through genetic interactions available from increasingly sensitive screens. Strategies to integrate structural information with these alternate data sources will be described. Finally, efforts to link protein structure space with chemical compound space offer novel sources of information in drug design, off-target identification, and the identification of targets for compounds found to be effective in phenotypic screens.


Subjects
Databases, Protein; Proteins/chemistry; Systems Biology; Protein Conformation; Protein Interaction Maps
7.
Hum Genet ; 139(11): 1443-1454, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32514796

ABSTRACT

Dilated cardiomyopathy (DCM) is among the most frequent forms of cardiomyopathy and is mainly characterized by cardiac dilatation and reduced systolic function. Although most cases of DCM are classified as sporadic, 20-30% of cases show a heritable pattern. Familial forms of DCM are genetically heterogeneous, and mutations in several genes have been identified that most commonly play a role in cytoskeleton and sarcomere-associated processes. Still, a large number of familial cases remain unsolved. Here, we report five individuals from three independent families who presented with severe dilated cardiomyopathy during the neonatal period. Using whole-exome sequencing (WES), we identified causative, compound heterozygous missense variants in RPL3L (ribosomal protein L3-like) in all the affected individuals. The identified variants co-segregated with the disease in each of the three families and were absent or very rare in the human population, in line with an autosomal recessive inheritance pattern. They are located within the conserved RPL3 domain of the protein and were classified as deleterious by several in silico prediction software applications. RPL3L is one of the four non-canonical riboprotein genes and it encodes the 60S ribosomal protein L3-like protein that is highly expressed only in cardiac and skeletal muscle. Three-dimensional homology modeling and in silico analysis of the affected residues in RPL3L indicate that the identified changes specifically alter the interaction of RPL3L with the RNA components of the 60S ribosomal subunit and thus destabilize its binding to the 60S subunit. In conclusion, we report that bi-allelic pathogenic variants in RPL3L are causative of an early-onset, severe neonatal form of dilated cardiomyopathy, and we show for the first time that cytoplasmic ribosomal proteins are involved in the pathogenesis of non-syndromic cardiomyopathies.


Subjects
Cardiomyopathy, Dilated/genetics; Mutation, Missense/genetics; Ribosomal Proteins/genetics; Ribosomes/genetics; Alleles; Exome/genetics; Female; Heart/physiopathology; Humans; Infant; Infant, Newborn; Male; Muscle, Skeletal/physiopathology; Pedigree; Phenotype; RNA/genetics; Ribosomal Protein L3
8.
Proc Natl Acad Sci U S A ; 114(52): 13685-13690, 2017 12 26.
Article in English | MEDLINE | ID: mdl-29229851

ABSTRACT

We report a template-based method, LT-scanner, which scans the human proteome using protein structural alignment to identify proteins that are likely to bind ligands that are present in experimentally determined complexes. A scoring function that rapidly accounts for binding site similarities between the template and the proteins being scanned is a crucial feature of the method. The overall approach is first tested based on its ability to predict the residues on the surface of a protein that are likely to bind small-molecule ligands. The algorithm that we present, LBias, is shown to compare very favorably to existing algorithms for binding site residue prediction. LT-scanner's performance is evaluated based on its ability to identify known targets of Food and Drug Administration (FDA)-approved drugs and it too proves to be highly effective. The specificity of the scoring function that we use is demonstrated by the ability of LT-scanner to identify the known targets of FDA-approved kinase inhibitors based on templates involving other kinases. Combining sequence with structural information further improves LT-scanner performance. The approach we describe is extendable to the more general problem of identifying binding partners of known ligands even if they do not appear in a structurally determined complex, although this will require the integration of methods that combine protein structure and chemical compound databases.


Subjects
Databases, Protein; Genome; Protein Kinase Inhibitors/chemistry; Proteins; Ligands; Proteins/chemistry; Proteins/genetics; Proteins/metabolism
9.
Science ; 358(6363): 623-630, 2017 11 03.
Article in English | MEDLINE | ID: mdl-29097544

ABSTRACT

Interfaces between organelles are emerging as critical platforms for many biological responses in eukaryotic cells. In yeast, the ERMES complex is an endoplasmic reticulum (ER)-mitochondria tether composed of four proteins, three of which contain a SMP (synaptotagmin-like mitochondrial-lipid binding protein) domain. No functional ortholog for any ERMES protein has been identified in metazoans. Here, we identified PDZD8 as an ER protein present at ER-mitochondria contacts. The SMP domain of PDZD8 is functionally orthologous to the SMP domain found in yeast Mmm1. PDZD8 was necessary for the formation of ER-mitochondria contacts in mammalian cells. In neurons, PDZD8 was required for calcium ion (Ca2+) uptake by mitochondria after synaptically induced Ca2+-release from ER and thereby regulated cytoplasmic Ca2+ dynamics. Thus, PDZD8 represents a critical ER-mitochondria tethering protein in metazoans. We suggest that ER-mitochondria coupling is involved in the regulation of dendritic Ca2+ dynamics in mammalian neurons.


Subjects
Calcium Signaling; Calcium/metabolism; Dendrites/metabolism; Endoplasmic Reticulum/metabolism; Membrane Proteins/metabolism; Mitochondria/metabolism; Neurons/metabolism; Adaptor Proteins, Signal Transducing; Animals; Genetic Complementation Test; HEK293 Cells; Humans; Membrane Proteins/chemistry; Membrane Proteins/genetics; Mice; Protein Domains; Receptors, Metabotropic Glutamate/metabolism; Receptors, N-Methyl-D-Aspartate/metabolism; Saccharomyces cerevisiae Proteins/chemistry; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism
10.
N Engl J Med ; 376(8): 742-754, 2017 02 23.
Article in English | MEDLINE | ID: mdl-28121514

ABSTRACT

BACKGROUND: The DiGeorge syndrome, the most common of the microdeletion syndromes, affects multiple organs, including the heart, the nervous system, and the kidney. It is caused by deletions on chromosome 22q11.2; the genetic driver of the kidney defects is unknown. METHODS: We conducted a genomewide search for structural variants in two cohorts: 2080 patients with congenital kidney and urinary tract anomalies and 22,094 controls. We performed exome and targeted resequencing in samples obtained from 586 additional patients with congenital kidney anomalies. We also carried out functional studies using zebrafish and mice. RESULTS: We identified heterozygous deletions of 22q11.2 in 1.1% of the patients with congenital kidney anomalies and in 0.01% of population controls (odds ratio, 81.5; P=4.5×10⁻¹⁴). We localized the main drivers of renal disease in the DiGeorge syndrome to a 370-kb region containing nine genes. In zebrafish embryos, an induced loss of function in snap29, aifm3, and crkl resulted in renal defects; the loss of crkl alone was sufficient to induce defects. Five of 586 patients with congenital urinary anomalies had newly identified, heterozygous protein-altering variants, including a premature termination codon, in CRKL. The inactivation of Crkl in the mouse model induced developmental defects similar to those observed in patients with congenital urinary anomalies. CONCLUSIONS: We identified a recurrent 370-kb deletion at the 22q11.2 locus as a driver of kidney defects in the DiGeorge syndrome and in sporadic congenital kidney and urinary tract anomalies. Of the nine genes at this locus, SNAP29, AIFM3, and CRKL appear to be critical to the phenotype, with haploinsufficiency of CRKL emerging as the main genetic driver. (Funded by the National Institutes of Health and others.)


Subjects
Adaptor Proteins, Signal Transducing/genetics; Chromosome Deletion; DiGeorge Syndrome/genetics; Haploinsufficiency; Kidney/abnormalities; Nuclear Proteins/genetics; Urinary Tract/abnormalities; Adolescent; Animals; Child; Chromosomes, Human, Pair 22; Exome; Female; Heterozygote; Humans; Infant; Infant, Newborn; Male; Mice; Models, Animal; Sequence Analysis, DNA; Young Adult; Zebrafish
11.
Elife ; 5, 2016 10 22.
Article in English | MEDLINE | ID: mdl-27770567

ABSTRACT

We present a database, PrePPI (Predicting Protein-Protein Interactions), of more than 1.35 million predicted protein-protein interactions (PPIs). Of these at least 127,000 are expected to constitute direct physical interactions although the actual number may be much larger (~500,000). The current PrePPI, which contains predicted interactions for about 85% of the human proteome, is related to an earlier version but is based on additional sources of interaction evidence and is far larger in scope. The use of structural relationships allows PrePPI to infer numerous previously unreported interactions. PrePPI has been subjected to a series of validation tests including reproducing known interactions, recapitulating multi-protein complexes, analysis of disease associated SNPs, and identifying functional relationships between interacting proteins. We show, using Gene Set Enrichment Analysis (GSEA), that predicted interaction partners can be used to annotate a protein's function. We provide annotations for most human proteins, including many annotated as having unknown function.


Subjects
Computational Biology/methods; Databases, Protein; Molecular Sequence Annotation; Protein Interaction Maps; Proteome; Humans
12.
Neurogenetics ; 17(1): 43-9, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26576547

ABSTRACT

Protein phosphatase 2A (PP2A) is a heterotrimeric protein serine/threonine phosphatase and is involved in a broad range of cellular processes. PPP2R5D is a regulatory B subunit of PP2A and plays an important role in regulating key neuronal and developmental processes such as PI3K/AKT and glycogen synthase kinase 3 beta (GSK3β)-mediated cell growth, chromatin remodeling, and gene transcriptional regulation. Using whole-exome sequencing (WES), we identified four de novo variants in PPP2R5D in a total of seven unrelated individuals with intellectual disability (ID) and other shared clinical characteristics, including autism spectrum disorder, macrocephaly, hypotonia, seizures, and dysmorphic features. Among the four variants, two have been previously reported and two are novel. All four amino acids are highly conserved among the PP2A subunit family, and all change a negatively charged acidic glutamic acid (E) to a positively charged basic lysine (K) and are predicted to disrupt the PP2A subunit binding and impair the dephosphorylation capacity. Our data provide further support for PPP2R5D as a genetic cause of ID.


Subjects
Autistic Disorder/genetics; Intellectual Disability/genetics; Megalencephaly/genetics; Muscle Hypotonia/genetics; Mutation, Missense; Protein Phosphatase 2/genetics; Adolescent; Autism Spectrum Disorder/epidemiology; Autism Spectrum Disorder/genetics; Autistic Disorder/epidemiology; Child; Child, Preschool; DNA Mutational Analysis; Female; Genetic Association Studies; Genetic Predisposition to Disease; Humans; Infant; Intellectual Disability/epidemiology; Male; Megalencephaly/epidemiology; Muscle Hypotonia/epidemiology; Polymorphism, Single Nucleotide
13.
Protein Sci ; 25(1): 159-65, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26178156

ABSTRACT

The growing structural coverage of proteomes is making structural comparison a powerful tool for function annotation. Such template-based approaches are based on the observation that structural similarity is often sufficient to infer similar function. However, it seems clear that, in addition to structural similarity, the specific characteristics of a given protein should also be taken into account in predicting function. Here we describe PredUs 2.0, a method to predict regions on a protein surface likely to bind other proteins, that is, interfacial residues. PredUs 2.0 is based on the PredUs method that is entirely template-based and uses known binding sites in structurally similar proteins to predict interfacial residues. PredUs 2.0 uses a Bayesian approach to combine the template-based scoring of PredUs with a score that reflects the propensities of individual amino acids to be in interfaces. PredUs 2.0 includes a novel protein size dependent metric to determine the number of residues that should be reported as interfacial. PredUs 2.0 significantly outperforms PredUs as well as other published interface prediction methods.


Subjects
Protein Interaction Domains and Motifs; Protein Interaction Mapping; Proteins/chemistry; Amino Acids/chemistry; Computational Biology; Protein Conformation
14.
PLoS Comput Biol ; 11(5): e1004248, 2015 May.
Article in English | MEDLINE | ID: mdl-25938916

ABSTRACT

We describe a method to predict protein-protein interactions (PPIs) formed between structured domains and short peptide motifs. We take an integrative approach based on consensus patterns of known motifs in databases, structures of domain-motif complexes from the PDB and various sources of non-structural evidence. We combine this set of clues using a Bayesian classifier that reports the likelihood of an interaction and obtain significantly improved prediction performance when compared to individual sources of evidence and to previously reported algorithms. Our Bayesian approach was integrated into PrePPI, a structure-based PPI prediction method that, so far, has been limited to interactions formed between two structured domains. Around 80,000 new domain-motif mediated interactions were predicted, thus enhancing PrePPI's coverage of the human protein interactome.


Subjects
Protein Interaction Mapping/statistics & numerical data; Algorithms; Bayes Theorem; Computational Biology; Databases, Protein/statistics & numerical data; Genome, Human; Humans; Likelihood Functions; Models, Biological; Protein Interaction Domains and Motifs; Proteomics/statistics & numerical data; Support Vector Machine
15.
Curr Opin Struct Biol ; 32: 33-8, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25678152

ABSTRACT

We discuss recent approaches for structure-based protein function annotation. We focus on template-based methods where the function of a query protein is deduced from that of a template for which both the structure and function are known. We describe the different ways of identifying a template. These are typically based on sequence analysis but new methods based on purely structural similarity are also being developed that allow function annotation based on structural relationships that cannot be recognized by sequence. The growing number of available structures of known function, improved homology modeling techniques and new developments in the use of structure allow template-based methods to be applied on a proteome-wide scale and in many different biological contexts. This progress significantly expands the range of applicability of structural information in function annotation to a level that previously was only achievable by sequence comparison.


Subjects
Proteins/chemistry; Proteins/metabolism; Animals; Humans; Machine Learning; Protein Conformation; Structural Homology, Protein
16.
Annu Rev Biophys ; 43: 193-210, 2014.
Article in English | MEDLINE | ID: mdl-24895853

ABSTRACT

The past decade has seen a dramatic expansion in the number and range of techniques available to obtain genome-wide information and to analyze this information so as to infer both the functions of individual molecules and how they interact to modulate the behavior of biological systems. Here, we review these techniques, focusing on the construction of physical protein-protein interaction networks, and highlighting approaches that incorporate protein structure, which is becoming an increasingly important component of systems-level computational techniques. We also discuss how network analyses are being applied to enhance our basic understanding of biological systems and their dysregulation, as well as how these networks are being used in drug development.


Subjects
Computational Biology/methods; Protein Interaction Maps; Proteins/chemistry; Animals; Humans; Proteins/metabolism
17.
Protein Sci ; 22(4): 359-66, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23349097

ABSTRACT

We outline a set of strategies to infer protein function from structure. The overall approach depends on extensive use of homology modeling, the exploitation of a wide range of global and local geometric relationships between protein structures and the use of machine learning techniques. The combination of modeling with broad searches of protein structure space defines a "structural BLAST" approach to infer function with high genomic coverage. Applications are described to the prediction of protein-protein and protein-ligand interactions. In the context of protein-protein interactions, our structure-based prediction algorithm, PrePPI, has comparable accuracy to high-throughput experiments. An essential feature of PrePPI involves the use of Bayesian methods to combine structure-derived information with non-structural evidence (e.g. co-expression) to assign a likelihood for each predicted interaction. This, combined with a structural BLAST approach, significantly expands the range of applications of protein structure in the annotation of protein function, including systems level biological applications where it has previously played little role.


Subjects
Protein Interaction Mapping; Proteins/chemistry; Proteins/metabolism; Algorithms; Artificial Intelligence; Bayes Theorem; Proteomics/methods; Structure-Activity Relationship
18.
Nucleic Acids Res ; 41(Database issue): D828-33, 2013 01.
Article in English | MEDLINE | ID: mdl-23193263

ABSTRACT

PrePPI (http://bhapp.c2b2.columbia.edu/PrePPI) is a database that combines predicted and experimentally determined protein-protein interactions (PPIs) using a Bayesian framework. Predicted interactions are assigned probabilities of being correct, which are derived from calculated likelihood ratios (LRs) by combining structural, functional, evolutionary and expression information, with the most important contribution coming from structure. Experimentally determined interactions are compiled from a set of public databases that manually collect PPIs from the literature and are also assigned LRs. A final probability is then assigned to every interaction by combining the LRs for both predicted and experimentally determined interactions. The current version of PrePPI contains ∼2 million PPIs that have a probability greater than ∼0.1, of which ∼60 000 PPIs for yeast and ∼370 000 PPIs for human are considered high confidence (probability > 0.5). The PrePPI database constitutes an integrated resource that enables users to examine aggregate information on PPIs, including both known and potentially novel interactions, and that provides structural models for many of the PPIs.


Subjects
Databases, Protein; Multiprotein Complexes/chemistry; Protein Interaction Mapping; Bayes Theorem; Humans; Internet; Protein Conformation; User-Computer Interface
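
The abstract above says that interaction probabilities are derived from likelihood ratios. By Bayes' rule in odds form, posterior odds equal the LR times the prior odds, and the probability is odds / (1 + odds). The sketch below illustrates that conversion. The prior odds value in the usage note is purely an assumption for illustration; the abstract does not state the prior used by PrePPI.

```python
def posterior_probability(likelihood_ratio, prior_odds):
    """Convert a likelihood ratio into a posterior probability.

    posterior_odds = likelihood_ratio * prior_odds
    probability    = posterior_odds / (1 + posterior_odds)
    prior_odds is the assumed odds that a random protein pair interacts.
    """
    if likelihood_ratio < 0 or prior_odds < 0:
        raise ValueError("likelihood ratio and prior odds must be non-negative")
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)
```

With an assumed prior of 1 in 600, an LR of 600 gives posterior odds of about 1 and therefore a probability of about 0.5, which shows how a probability cutoff such as the high-confidence threshold in the abstract maps onto an LR cutoff once a prior is fixed.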
19.
Nature ; 490(7421): 556-60, 2012 Oct 25.
Article in English | MEDLINE | ID: mdl-23023127

ABSTRACT

The genome-wide identification of pairs of interacting proteins is an important step in the elucidation of cell regulatory mechanisms. Much of our present knowledge derives from high-throughput techniques such as the yeast two-hybrid assay and affinity purification, as well as from manual curation of experiments on individual systems. A variety of computational approaches based, for example, on sequence homology, gene co-expression and phylogenetic profiles, have also been developed for the genome-wide inference of protein-protein interactions (PPIs). Yet comparative studies suggest that the development of accurate and complete repertoires of PPIs is still in its early stages. Here we show that three-dimensional structural information can be used to predict PPIs with an accuracy and coverage that are superior to predictions based on non-structural evidence. Moreover, an algorithm, termed PrePPI, which combines structural information with other functional clues, is comparable in accuracy to high-throughput experiments, yielding over 30,000 high-confidence interactions for yeast and over 300,000 for human. Experimental tests of a number of predictions demonstrate the ability of the PrePPI algorithm to identify unexpected PPIs of considerable biological interest. The surprising effectiveness of three-dimensional structural information can be attributed to the use of homology models combined with the exploitation of both close and remote geometric relationships between proteins.


Subjects
Algorithms; Protein Interaction Mapping/methods; Protein Interaction Maps; Proteins/chemistry; Proteins/metabolism; Proteomics/methods; Animals; Bayes Theorem; Brain/metabolism; Cadherins/metabolism; High-Throughput Screening Assays; Humans; Matrix Attachment Region Binding Proteins/metabolism; Mice; Models, Molecular; PPAR gamma/metabolism; Phylogeny; Protein Binding; Protein Conformation; Protein Kinases/chemistry; Protein Kinases/metabolism; Proteome/chemistry; Proteome/metabolism; ROC Curve; Reproducibility of Results; Saccharomyces cerevisiae/chemistry; Saccharomyces cerevisiae/metabolism; Suppressor of Cytokine Signaling Proteins/metabolism; Transcription Factors/metabolism
20.
J Struct Funct Genomics ; 13(3): 171-6, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22592539

ABSTRACT

Protein domain family PF11267 (DUF3067) is a family of proteins of unknown function found in both bacteria and eukaryotes. Here we present the solution NMR structure of the 102-residue Alr2454 protein from Nostoc sp. PCC 7120, which constitutes the first structural representative from this conserved protein domain family. The structure of Nostoc sp. Alr2454 adopts a novel protein fold.


Subjects
Bacterial Proteins/chemistry; Magnetic Resonance Spectroscopy/methods; Nostoc/chemistry; Amino Acid Sequence; Bacterial Proteins/genetics; Cloning, Molecular; Escherichia coli/chemistry; Escherichia coli/genetics; Genes, Bacterial; Molecular Sequence Data; Nostoc/genetics; Protein Conformation; Protein Folding; Protein Structure, Tertiary; Sequence Alignment; Solutions/chemistry