1.
Proteomics ; 23(17): e2200323, 2023 09.
Article in English | MEDLINE | ID: mdl-37365936

ABSTRACT

Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset consisting of 1677 homodimer protein crystal structures, including a balanced mix of physiological and non-physiological complexes. The non-physiological complexes in the benchmark were selected to bury a similar or larger interface area than their physiological counterparts, making it more difficult for scoring functions to differentiate between them. Next, 252 functions for scoring protein-protein interfaces previously developed by 13 groups were collected and evaluated for their ability to discriminate between physiological and non-physiological complexes. A simple consensus score generated using the best performing score of each of the 13 groups, and a cross-validated Random Forest (RF) classifier were created. Both approaches showed excellent performance, with an area under the Receiver Operating Characteristic (ROC) curve of 0.93 and 0.94, respectively, outperforming individual scores developed by different groups. Additionally, AlphaFold2 engines recalled the physiological dimers with significantly higher accuracy than the non-physiological set, lending support to the reliability of our benchmark dataset annotations. Optimizing the combined power of interface scoring functions and evaluating it on challenging benchmark datasets appears to be a promising strategy.


Subjects
Proteins; Reproducibility of Results; Proteins/metabolism; Protein Binding
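The evaluation described in the abstract above (combining several interface scores and measuring the area under the ROC curve) can be illustrated with a minimal sketch. The abstract does not say how the consensus score was formed, so averaging z-scores is an assumption made here for illustration only; the function names and the toy values are likewise hypothetical.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize one scoring function so that different scores are comparable."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]

def consensus(score_table):
    """Assumed consensus: average the z-scores of all scoring functions, complex by complex."""
    standardized = [zscores(column) for column in score_table]
    return [mean(per_complex) for per_complex in zip(*standardized)]

def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy data: 1 = physiological dimer, 0 = non-physiological dimer.
labels = [1, 1, 1, 0, 0, 0]
score_a = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]
score_b = [12.0, 15.0, 9.0, 8.0, 10.0, 3.0]
combined = consensus([score_a, score_b])

print(roc_auc(score_a, labels), roc_auc(score_b, labels), roc_auc(combined, labels))
```

On this toy input each individual score misranks one pair, while the averaged score separates the two classes, which is the kind of gain a consensus is meant to provide.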
2.
J Comput Chem ; 44(13): 1236-1249, 2023 05 15.
Article in English | MEDLINE | ID: mdl-36999748

ABSTRACT

Designing move sets that provide high-quality protein conformations remains a hard problem, especially when it comes to deforming a long protein backbone segment; a key building block for doing so is the so-called tripeptide loop closure (TLC). Consider a tripeptide whose first and last bonds ($N_1C_{\alpha;1}$ and $C_{\alpha;3}C_3$) are fixed, and so are all internal coordinates except the six dihedral angles $\{(\phi_i, \psi_i)\}_{i=1,2,3}$ associated with the three $C_\alpha$ carbons. Under these conditions, the TLC algorithm provides all possible values for these six dihedral angles; there exist at most 16 solutions. TLC moves atoms up to ∼5 Å in one step and retains low-energy conformations, whence its pivotal role in designing move sets that sample protein loop conformations. In this work, we relax the previous constraints, allowing the last bond ($C_{\alpha;3}C_3$) to move freely in 3D space, or equivalently in a 5D configuration space. We exhibit necessary geometric constraints in this 5D space for TLC to admit solutions. Our analysis provides key insights into the geometry of solutions for TLC. Most importantly, when using TLC to sample loop conformations based on $m$ consecutive tripeptides along a protein backbone, we obtain an exponential gain in the volume of the $5m$-dimensional configuration space to be explored.


Subjects
Algorithms; Models, Molecular; Protein Conformation
3.
J Comput Chem ; 44(11): 1094-1104, 2023 04 30.
Article in English | MEDLINE | ID: mdl-36733189

ABSTRACT

Flexible loops are paramount to protein functions, with action modes ranging from localized dynamics contributing to the free energy of the system, to large-amplitude conformational changes accounting for the repositioning of whole secondary structure elements or protein domains. However, generating diverse and low-energy loops remains a difficult problem. This work introduces a novel paradigm to sample loop conformations, in the spirit of the hit-and-run (HAR) Markov chain Monte Carlo technique. The algorithm uses a decomposition of the loop into tripeptides, and a novel characterization of necessary conditions for Tripeptide Loop Closure to admit solutions. Denoting by m the number of tripeptides, the algorithm works in an angular space of dimension 12m. In this space, the hyper-surfaces associated with the aforementioned necessary conditions are used to run a HAR-like sampling technique. On classical loop cases of up to 15 amino acids, our parameter-free method compares favorably to previous work, generating more diverse conformational ensembles. We also report experiments on a loop 30 amino acids long, a size not processed in any previous work.


Subjects
Amino Acids; Proteins; Models, Molecular; Proteins/chemistry; Protein Structure, Secondary; Amino Acids/chemistry; Algorithms; Protein Conformation
4.
PLoS One ; 17(11): e0268956, 2022.
Article in English | MEDLINE | ID: mdl-36342924

ABSTRACT

Prioritizing genes for their role in drug sensitivity is an important step in understanding drug mechanisms of action and discovering new molecular targets for co-treatment. To formalize this problem, we consider two sets of genes X and P, respectively composing the gene signature of cell sensitivity at the drug IC50 and the genes involved in its mechanism of action, as well as a protein-protein interaction network (PPIN) containing the products of X and P as nodes. We introduce Genetrank, a method to prioritize the genes in X for their likelihood to regulate the genes in P. Genetrank uses asymmetric random walks with restarts, absorbing states, and a suitable renormalization scheme. Using novel so-called saturation indices, we show that the conjunction of absorbing states and renormalization yields an exploration of the PPIN which is much more progressive than that afforded by random walks with restarts only. Using MINT as the underlying network, we apply Genetrank to a predictive gene signature of cancer cell sensitivity to tumor-necrosis-factor-related apoptosis-inducing ligand (TRAIL), obtained in single cells. Our ranking provides biological insights into drug sensitivity and a gene set considerably enriched in genes regulating TRAIL pharmacodynamics when compared to the most significant differentially expressed genes obtained from a statistical analysis framework alone. We also introduce gene expression radars, a visualization tool embedded in MA plots to assess all pairwise interactions at a glance on graphical representations of transcriptomics data. Genetrank is made available in the Structural Bioinformatics Library (https://sbl.inria.fr/doc/Genetrank-user-manual.html). It should prove useful for mining gene sets in conjunction with a signaling pathway, whenever other approaches yield relatively large sets of genes.


Subjects
Gene Regulatory Networks; Single-Cell Analysis; Computational Biology/methods; Protein Interaction Maps; TNF-Related Apoptosis-Inducing Ligand/genetics
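The baseline mentioned in the abstract above, a random walk with restarts on an interaction network, can be sketched as follows. This is only the plain, symmetric variant: the absorbing states, asymmetry, and renormalization that distinguish Genetrank are not reproduced, and the function name, restart probability, and toy graph are assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that, at each step,
    jumps back to a seed with probability `restart` and otherwise moves
    to a uniformly chosen neighbour.
    Assumes every node has at least one neighbour."""
    nodes = sorted(adjacency)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        converged = sum(abs(nxt[v] - p[v]) for v in nodes) < tol
        p = nxt
        if converged:
            break
    return p

# Hypothetical toy network: a path A - B - C - D, with the walk restarted at A.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(graph, {"A"})
ranking = sorted(graph, key=scores.get, reverse=True)
print(ranking)
```

Nodes closer to the seed accumulate more probability mass, which is the property such methods exploit to rank genes by network proximity.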
6.
Proteins ; 90(3): 858-868, 2022 03.
Article in English | MEDLINE | ID: mdl-34783395

ABSTRACT

Tripeptide loop closure (TLC) is a standard procedure to reconstruct protein backbone conformations, by solving a zero-dimensional polynomial system yielding up to 16 solutions. In this work, we first show that multiprecision is required in a TLC solver to guarantee the existence and the accuracy of solutions. We then compare solutions yielded by the TLC solver against tripeptides from the Protein Data Bank. We show that these solutions are geometrically diverse (up to 3 Å root mean square deviation with respect to the data) and sound in terms of potential energy. Finally, we compare Ramachandran distributions of data and reconstructions for the three amino acids. The distribution of reconstructions in the second angular space ϕ2ψ2 stands out, with a rather uniform distribution leaving a central void. We anticipate that these insights, coupled with our robust implementation in the Structural Bioinformatics Library (https://sbl.inria.fr/doc/Tripeptide_loop_closure-user-manual.html), will help in understanding the properties of TLC reconstructions, with potential applications to the generation of conformations of flexible loops in particular.


Subjects
Oligopeptides/chemistry; Algorithms; Amino Acid Sequence; Computational Biology; Databases, Protein; Models, Molecular; Protein Conformation; Structure-Activity Relationship
7.
Proteins ; 90(3): 848-857, 2022 03.
Article in English | MEDLINE | ID: mdl-34779026

ABSTRACT

We introduce multiple interface string alignment (MISA), a visualization tool to display coherently various sequence- and structure-based statistics at protein-protein interfaces (SSE elements, buried surface area, ΔASA, B factor values, etc.). The amino acids supporting these annotations are obtained from Voronoi interface models. The benefit of MISA is to collate annotated sequences of (homologous) chains found in different biological contexts, that is, bound with different partners or unbound. The aggregated views MISA/SSE, MISA/BSA, MISA/ΔASA, and so forth, make it trivial to identify commonalities and differences between chains, to infer key interface residues, and to understand where conformational changes occur upon binding. As such, they should prove of key relevance for knowledge-based annotations of protein databases such as the Protein Data Bank. Illustrations are provided on the receptor binding domain of coronaviruses, in complex with their cognate partner or (neutralizing) antibodies. MISA computed with a minimal number of structures complements and enriches findings previously reported. The corresponding package is available from the Structural Bioinformatics Library (http://sbl.inria.fr and https://sbl.inria.fr/doc/Multiple_interface_string_alignment-user-manual.html).


Subjects
Coronavirus/chemistry; Spike Glycoprotein, Coronavirus/chemistry; Amino Acid Sequence; Computational Biology; Databases, Protein; Models, Molecular; Protein Binding; Protein Conformation; Sequence Analysis, Protein; User-Computer Interface
8.
Proteins ; 89(3): 259-275, 2021 03.
Article in English | MEDLINE | ID: mdl-32960482

ABSTRACT

Resistance-nodulation-cell division family proteins are transmembrane proteins identified as broad-spectrum drug transporters involved in multidrug resistance. A prototypical case in this superfamily, responsible for antibiotic resistance in selected gram-negative bacteria, is AcrB. AcrB forms a trimer using the proton motive force to efflux drugs, implementing a functional rotation mechanism. Unfortunately, the size of the system (1049 amino acids per monomer, plus the membrane) has prevented a systematic dynamical exploration, so that our limited understanding of this coupled transport jeopardizes our ability to counter it. The large number of crystal structures of AcrB prompts studies to further our understanding of the mechanism. To this end, we present a novel strategy based on two key ingredients, which are to study dynamics by exploiting information embodied in the numerous crystal structures obtained to date, and to systematically consider subdomains, their dynamics, and their interactions. Along the way, we identify the subdomains responsible for dynamic events, refine the states (A, B, E) of the functional rotation mechanism, and analyze the evolution of intramonomer and intermonomer interfaces along the functional cycle. Our analysis shows the relevance of AcrB's efflux mechanism as a template within the HAE1 family but not beyond. It also paves the way to targeted simulations exploiting the most relevant degrees of freedom at certain steps, and to a targeting of specific interfaces to block the drug efflux. Our work shows that complex dynamics can be unveiled from static snapshots, a strategy that may be used on a variety of molecular machines of large size.


Subjects
Escherichia coli Proteins; Multidrug Resistance-Associated Proteins; Allosteric Site; Anti-Bacterial Agents/chemistry; Anti-Bacterial Agents/metabolism; Escherichia coli Proteins/chemistry; Escherichia coli Proteins/metabolism; Molecular Dynamics Simulation; Multidrug Resistance-Associated Proteins/chemistry; Multidrug Resistance-Associated Proteins/metabolism; Protein Binding; Protein Conformation
9.
Proteins ; 87(5): 380-389, 2019 05.
Article in English | MEDLINE | ID: mdl-30663799

ABSTRACT

The root mean square deviation (RMSD) and the least RMSD (lRMSD) are two widely used similarity measures in structural bioinformatics. Yet, they stem from global comparisons, possibly obliterating locally conserved motifs. We correct these limitations with the so-called combined RMSD, which mixes independent lRMSD measures, each computed with its own rigid motion. The combined RMSD is relevant in two main scenarios, namely to compare (quaternary) structures based on motifs defined from the sequence (domains and SSE) and to compare structures based on structural motifs yielded by local structural alignment methods. We illustrate the benefits of combined RMSD over the usual RMSD on three problems, namely (a) the assignment of quaternary structures for hemoglobin (scenario #1), (b) the calculation of structural phylogenies (case study: class II fusion proteins; scenario #1), and (c) the analysis of conformational changes based on combined RMSD of rigid structural motifs (case study: one class II fusion protein; scenario #2). Based on these illustrations, we argue that the combined RMSD is a tool of choice to perform positive and negative discrimination of degrees of freedom, with applications to the design of move sets and collective coordinates. Executables to compute combined RMSD are available within the Structural Bioinformatics Library (http://sbl.inria.fr).


Subjects
Computational Biology/statistics & numerical data; Protein Structure, Quaternary; Proteins/chemistry; Algorithms; Amino Acid Motifs/genetics; Amino Acid Sequence; Least-Squares Analysis; Protein Conformation; Sequence Alignment/statistics & numerical data
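The idea of mixing independent lRMSD values, as described in the abstract above, can be sketched as a size-weighted quadratic mean. The abstract does not give the formula, so the weighting by number of atoms per motif is an assumption here, as are the function name and the toy numbers; the per-motif lRMSD values are taken as already computed, each with its own optimal rigid motion.

```python
from math import sqrt

def combined_rmsd(motifs):
    """Combine per-motif lRMSD values into a single measure.

    motifs: list of (n_atoms, lrmsd) pairs, one pair per motif.
    Assumed formula: sqrt( sum_i w_i * lrmsd_i^2 / sum_i w_i ), with w_i = n_atoms.
    """
    total_atoms = sum(n for n, _ in motifs)
    return sqrt(sum(n * d * d for n, d in motifs) / total_atoms)

# Hypothetical example: two rigid domains that each superimpose well on their own
# (small lRMSD), even if a global superposition of the whole chain would not.
print(combined_rmsd([(120, 0.8), (80, 1.1)]))
```

Because each motif is fitted separately, a hinge motion between two rigid domains leaves this value low, whereas a single global lRMSD would report a large deviation.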
10.
Front Immunol ; 9: 2115, 2018.
Article in English | MEDLINE | ID: mdl-30319606

ABSTRACT

Vaccination induces "public" antibody clonotypes common to all individuals of a species, which may mediate universal protection against pathogens. Only a few studies have tried to trace back the origin of these public B-cell clones. Here we used Illumina sequencing and computational modeling to unveil the mechanisms shaping the structure of the fish memory antibody response against an attenuated Viral Hemorrhagic Septicemia rhabdovirus. After vaccination, a persistent memory response with a public VH5JH5 IgM component was composed of dominant antibodies shared among all individuals. The rearrangement model showed that these public junctions occurred with high probability, indicating that they were already favored before vaccination due to the recombination process, as shown in mammals. In addition, these clonotypes were associated in the naive repertoire with larger similarity classes, composed of junctions differing only at one or two positions by amino acids with comparable properties. The model showed that this property was due to selective processes exerted between the recombination and the naive repertoire. Finally, our results showed that public clonotypes greatly expanded after vaccination displayed several VDJ junctions differing only by one or two amino acids with similar properties, highlighting a convergent response. The fish public memory antibody response to a virus is therefore shaped at three levels: by recombination biases, by selection acting on the formation of the pre-vaccination repertoire, and by convergent selection of functionally similar clonotypes during the response. We also show that naive repertoires of IgM and IgT have different structures and sharing between individuals, due to selection biases. In sum, our comparative approach identifies three conserved features of the antibody repertoire associated with public memory responses. These features were already present in the last common ancestors of fish and mammals, while other characteristics may represent species-specific solutions.


Subjects
B-Lymphocytes/immunology; Fishes/immunology; Hemorrhagic Septicemia, Viral/prevention & control; Novirhabdovirus/immunology; Viral Vaccines/immunology; Animals; B-Lymphocytes/metabolism; Clone Cells/immunology; Clone Cells/metabolism; Hemorrhagic Septicemia, Viral/immunology; Hemorrhagic Septicemia, Viral/virology; Immunoglobulin M/genetics; Immunoglobulin M/immunology; Immunoglobulin M/metabolism; Immunologic Memory/immunology; V(D)J Recombination/immunology; Vaccination; Viral Vaccines/administration & dosage
11.
Eur J Immunol ; 48(1): 194-203, 2018 01.
Article in English | MEDLINE | ID: mdl-28850672

ABSTRACT

Rheumatoid arthritis (RA) is associated with abnormal B-cell functions implicating antibody-dependent and -independent mechanisms. B cells have emerged as important cytokine-producing cells, and cytokines are well-known drivers of RA pathogenesis. To identify novel cytokine-mediated B-cell functions in RA, we comprehensively analysed the capacity of B cells from RA patients with an inadequate response to disease-modifying anti-rheumatic drugs to produce cytokines in comparison with healthy donors (HD). RA B cells displayed a constitutively higher production of the pathogenic factors interleukin (IL)-8 and Gro-α, while their production of several cytokines upon activation via the B cell receptor for antigen (BCR) was broadly suppressed, including a loss of the expression of the protective factor TRAIL, compared to HD B cells. These defects were partly erased after treatment with the IL-6-signalling inhibitor tocilizumab, indicating that abnormal IL-6 signalling contributed to these abnormalities. Notably, the clinical response of individual patients to tocilizumab therapy could be predicted using the amounts of MIP-1β and β-NGF produced by these patients' B cells before treatment. Taken together, our study highlights hitherto unknown abnormal B-cell functions in RA patients, which are related to the unbalanced cytokine network, and are potentially relevant for RA pathogenesis and treatment.


Subjects
Antibodies, Monoclonal, Humanized/pharmacology; Arthritis, Rheumatoid/drug therapy; Arthritis, Rheumatoid/pathology; B-Lymphocytes/immunology; Interleukin-6/antagonists & inhibitors; Interleukin-6/metabolism; Arthritis, Rheumatoid/immunology; Chemokine CCL4/biosynthesis; Chemokine CXCL1/biosynthesis; Humans; Interleukin-8/biosynthesis; Nerve Growth Factor/biosynthesis; TNF-Related Apoptosis-Inducing Ligand/biosynthesis
12.
Front Immunol ; 8: 34, 2017.
Article in English | MEDLINE | ID: mdl-28232828

ABSTRACT

Antibody-antigen complexes challenge our understanding, as analyses to date have failed to unveil the key determinants of binding affinity and interaction specificity. We partially fill this gap based on novel quantitative analyses using two standardized databases, the IMGT/3Dstructure-DB and the structure affinity benchmark. First, we introduce a statistical analysis of interfaces which enables the classification of ligand types (protein, peptide, and chemical; cross-validated classification error of 9.6%) and yields binding affinity predictions of unprecedented accuracy (median absolute error of 0.878 kcal/mol). Second, we exploit the contributions made by CDRs in terms of position at the interface and atomic packing properties to show that in general, VH CDR3 and VL CDR3 make dominant contributions to the binding affinity, a fact also shown to be consistent with the enthalpy-entropy compensation associated with preconfiguration of CDR3. Our work suggests that the affinity prediction problem could be partially solved from databases of high-resolution crystal structures of complexes with known affinity.

13.
Bioinformatics ; 33(7): 997-1004, 2017 04 01.
Article in English | MEDLINE | ID: mdl-28062450

ABSTRACT

Motivation: Software in structural bioinformatics has mainly been application driven. To favor practitioners seeking off-the-shelf applications, but also developers seeking advanced building blocks to develop novel applications, we undertook the design of the Structural Bioinformatics Library (SBL, http://sbl.inria.fr), a generic C++/Python cross-platform software library targeting complex problems in structural bioinformatics. Its tenet is based on a modular design offering a rich and versatile framework allowing the development of novel applications requiring well-specified complex operations, without compromising robustness and performance. Results: The SBL involves four software components (1-4 hereafter). For end-users, the SBL provides ready-to-use, state-of-the-art (1) applications to handle molecular models defined by unions of balls, to deal with molecular flexibility, and to model macromolecular assemblies. These applications can also be combined to tackle integrated analysis problems. For developers, the SBL provides a broad C++ toolbox with modular design, involving core (2) algorithms, (3) biophysical models, and (4) modules, the latter being especially suited to develop novel applications. The SBL comes with thorough documentation consisting of user and reference manuals, and a Bugzilla platform to handle community feedback. Availability and Implementation: The SBL is available from http://sbl.inria.fr. Contact: Frederic.Cazals@inria.fr. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods; Gene Library; Models, Molecular; Algorithms; Licensure; Software
14.
J Comput Aided Mol Des ; 30(9): 791-804, 2016 09.
Article in English | MEDLINE | ID: mdl-27718029

ABSTRACT

The 2015 D3R Grand Challenge provided an opportunity to test our new model for the binding free energy of small molecules, as well as to assess our protocol to predict binding poses for protein-ligand complexes. Our pose predictions were ranked 3-9 for the HSP90 dataset, depending on the assessment metric. For the MAP4K dataset, the ranks are very dispersed, ranging from 2 to 35 depending on the assessment metric, which does not provide any insight into the accuracy of the method. The main success of our pose prediction protocol was the re-scoring stage using the recently developed Convex-PL potential. We make a thorough analysis of our docking predictions made with AutoDock Vina and discuss the effect of the choice of rigid receptor templates, the number of flexible residues in the binding pocket, the binding pocket size, and the benefits of re-scoring. However, the main challenge was to predict experimentally determined binding affinities for two blind test sets. Our affinity prediction model consisted of two terms, a pairwise-additive enthalpy and a non-pairwise-additive entropy. We trained the free parameters of the model with a regularized regression using affinity and structural data from the PDBBind database. Our model performed very well on the training set but failed on the two test sets. We explain the drawbacks and pitfalls of our model, in particular in terms of relative coverage of the test set by the training set and missed dynamical properties from crystal structures, and discuss different routes to improve it.


Subjects
HSP90 Heat-Shock Proteins/chemistry; Molecular Docking Simulation/methods; Binding Sites; Databases, Protein; Drug Design; Entropy; Humans; Ligands; Prospective Studies; Protein Binding; Protein Conformation; Regression Analysis; Structure-Activity Relationship; Thermodynamics
15.
J Chem Phys ; 144(5): 054109, 2016 Feb 07.
Article in English | MEDLINE | ID: mdl-26851910

ABSTRACT

We consider a coarse-graining of high-dimensional potential energy landscapes based upon persistences, which correspond to lowest barrier heights to lower-energy minima. Persistences can be calculated efficiently for local minima in kinetic transition networks that are based on stationary points of the prevailing energy landscape. The networks studied here represent peptides, proteins, nucleic acids, an atomic cluster, and a glassy system. Minima with high persistence values are likely to represent some form of alternative structural morphology, which, if appreciably populated at the prevailing temperature, could compete with the global minimum (defined as infinitely persistent). Threshold values on persistences (and in some cases equilibrium occupation probabilities) have therefore been used in this work to select subsets of minima, which were then analysed to see how well they can represent features of the full network. Simplified disconnectivity graphs showing only the selected minima can convey the funnelling (including any multiple-funnel) characteristics of the corresponding full graphs. The effect of the choice of persistence threshold on the reduced disconnectivity graphs was considered for a system with a hierarchical, glassy landscape. Sets of persistent minima were also found to be useful in comparing networks for the same system sampled under different conditions, using minimum oriented spanning forests.
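The notion of persistence used in the abstract above (the lowest barrier height from a minimum to a lower-energy minimum) can be computed on a small kinetic transition network with a union-find structure. The sketch below is a generic illustration, not the authors' implementation; the function name, data layout, and toy energies are assumptions.

```python
def persistences(minima, transition_states):
    """Persistence of each local minimum in a transition network.

    minima: dict mapping a minimum's name to its energy.
    transition_states: list of (energy, minimum_a, minimum_b) triples.
    Returns a dict name -> persistence; minima that never merge with a
    lower-lying one (e.g. the global minimum) get float('inf').
    """
    parent = {m: m for m in minima}   # union-find forest over minima
    lowest = {m: m for m in minima}   # root -> deepest minimum of its component

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    result = {}
    # Sweep the barriers from lowest to highest energy.
    for e_ts, a, b in sorted(transition_states):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # both minima are already connected below this barrier
        la, lb = lowest[ra], lowest[rb]
        # The component whose deepest minimum is higher "dies" at this barrier.
        dying, surviving = (la, lb) if minima[la] > minima[lb] else (lb, la)
        result[dying] = e_ts - minima[dying]
        parent[ra] = rb
        lowest[rb] = surviving
    for m in minima:
        result.setdefault(m, float("inf"))
    return result

# Hypothetical toy landscape with three minima and two transition states.
toy_minima = {"A": 0.0, "B": 1.0, "C": 0.5}
toy_barriers = [(2.0, "A", "B"), (3.0, "B", "C")]
toy_persistence = persistences(toy_minima, toy_barriers)
print(toy_persistence)
```

Selecting the minima whose persistence exceeds a threshold then gives the kind of reduced set of minima that the abstract uses to simplify disconnectivity graphs.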

16.
Proteins ; 84(1): 9-20, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26471944

ABSTRACT

Predicting protein binding affinities from structural data has remained elusive, a difficulty owing to the variety of protein binding modes. Using the structure-affinity benchmark (SAB, 144 cases with bound/unbound crystal structures and experimental affinity measurements), prediction has been undertaken either by fitting a model using a handful of predefined variables, or by training a complex model from a large pool of parameters (typically hundreds). The former route unnecessarily restricts the model space, while the latter is prone to overfitting. We design models in a third tier, using 12 variables describing enthalpic and entropic variations upon binding, and a model selection procedure identifying the best sparse model built from a subset of these variables. Using these models, we report three main results. First, we present models yielding a marked improvement of affinity predictions. For the whole dataset, we present a model predicting Kd within 1 and 2 orders of magnitude for 48% and 79% of cases, respectively. These statistics jump to 62% and 89%, respectively, for the subset of the SAB consisting of high-resolution structures. Second, we show that this performance is due to a new parameter encoding interface morphology and packing properties of interface atoms. Third, we argue that interface flexibility and prediction hardness do not correlate, and that for flexible cases, a performance matching that of the whole SAB can be achieved. Overall, our work suggests that the affinity prediction problem could be partly solved using databases of high-resolution complexes whose affinity is known.


Subjects
Proteins/chemistry; Proteins/metabolism; Animals; Crystallography, X-Ray; Databases, Protein; Humans; Models, Biological; Models, Molecular; Protein Binding; Protein Conformation; Thermodynamics
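The statistic quoted in the abstract above, the fraction of cases whose predicted Kd falls within 1 or 2 orders of magnitude of the measured value, amounts to thresholding the absolute difference of base-10 logarithms. A minimal sketch follows; the function name and the Kd values are invented for illustration and are not taken from the benchmark.

```python
from math import log10

def fraction_within(kd_predicted, kd_measured, orders):
    """Fraction of cases where |log10(Kd_pred) - log10(Kd_exp)| <= orders."""
    hits = sum(
        abs(log10(pred) - log10(meas)) <= orders
        for pred, meas in zip(kd_predicted, kd_measured)
    )
    return hits / len(kd_measured)

# Invented dissociation constants, in mol/L.
kd_measured = [1e-9, 1e-8, 1e-6, 1e-7]
kd_predicted = [5e-9, 3e-7, 2e-6, 1e-10]

print(fraction_within(kd_predicted, kd_measured, 1))
print(fraction_within(kd_predicted, kd_measured, 2))
```

Since the binding free energy scales with the logarithm of Kd, an error of one order of magnitude corresponds to roughly 1.4 kcal/mol at room temperature.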
17.
J Comput Chem ; 37(8): 739-52, 2016 Mar 30.
Article in English | MEDLINE | ID: mdl-26714673

ABSTRACT

The number of local minima of the potential energy landscape (PEL) of molecular systems generally grows exponentially with the number of degrees of freedom, so that a crucial property of PEL exploration algorithms is their ability to identify local minima which are low lying and diverse. In this work, we present a new exploration algorithm, retaining the ability of basin hopping (BH) to identify local minima, and that of transition-based rapidly exploring random trees (T-RRT) to foster the exploration of yet unexplored regions. This ability is obtained by interleaving calls to the extension procedures of BH and T-RRT, and we show that tuning the balance between these two types of calls allows the algorithm to focus on low-lying regions. Computational efficiency is obtained using state-of-the-art data structures, in particular for searching approximate nearest neighbors in metric spaces. We present results for BLN69, a protein model whose conformational space has dimension 207 and whose PEL has been studied exhaustively. On this system, we show that the propensity of our algorithm to explore low-lying regions of the landscape significantly outperforms those of BH and T-RRT.


Subjects
Algorithms; Proteins/chemistry; Artificial Intelligence; Computational Biology; Protein Conformation; Thermodynamics
18.
J Comput Chem ; 36(16): 1213-31, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-25994596

ABSTRACT

We present novel algorithms and software addressing four core problems in computational structural biology, namely analyzing a conformational ensemble, comparing two conformational ensembles, analyzing a sampled energy landscape, and comparing two sampled energy landscapes. Using recent developments in computational topology, graph theory, and combinatorial optimization, we make two notable contributions. First, we present a generic algorithm analyzing height fields. We then use this algorithm to perform density-based clustering of conformations, and to analyze a sampled energy landscape in terms of basins and transitions between them. In both cases, topological persistence is used to manage (geometric) frustration. Second, we introduce two algorithms to compare transition graphs. The first is the classical earth mover distance metric which depends only on local minimum energy configurations along with their statistical weights, while the second incorporates topological constraints inherent to conformational transitions. Illustrations are provided on a simplified protein model (BLN69), whose frustrated potential energy landscape has been thoroughly studied. The software implementing our tools is also made available, and should prove valuable wherever conformational ensembles and energy landscapes are used.


Subjects
Algorithms; Proteins/chemistry; Thermodynamics; Models, Molecular; Molecular Conformation; Protein Conformation; Software
19.
Mol Cell Proteomics ; 14(8): 2274-84, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25850436

ABSTRACT

Consider a set of oligomers listing the subunits involved in subcomplexes of a macromolecular assembly, obtained e.g. using native mass spectrometry or affinity purification. Given these oligomers, connectivity inference (CI) consists of finding the most plausible contacts between these subunits, and minimum connectivity inference (MCI) is the variant consisting of finding a set of contacts of smallest cardinality. MCI problems avoid speculating on the total number of contacts but yield a subset of all contacts and do not allow exploiting a priori information on the likelihood of individual contacts. In this context, we present two novel algorithms, MILP-W and MILP-WB. The former solves the minimum weight connectivity inference (MWCI), an optimization problem whose criterion mixes the number of contacts and their likelihood. The latter uses the former in a bootstrap fashion to improve the sensitivity and the specificity of solution sets. Experiments on three systems (yeast exosome, yeast proteasome lid, human eIF3), for which reference contacts are known (crystal structure, cryo-electron microscopy, cross-linking), show that our algorithms predict contacts with high specificity and sensitivity, yielding a very significant improvement over previous work, typically a twofold increase in sensitivity. The software accompanying this paper is made available and should prove of broad interest whenever connectivity inference from oligomers is needed.


Subjects
Algorithms; Macromolecular Substances/metabolism; Eukaryotic Initiation Factor-3/metabolism; Exosomes/metabolism; Humans; Models, Theoretical; Proteasome Endopeptidase Complex/metabolism; Protein Subunits/metabolism; Saccharomyces cerevisiae/metabolism
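The minimum connectivity inference problem stated in the abstract above (find the fewest contacts such that every observed oligomer is connected) can be made concrete with an exhaustive search on a tiny instance. This brute-force sketch is not the mixed-integer programming approach (MILP-W, MILP-WB) of the paper and only scales to a handful of subunits; the function names and the toy oligomers are assumptions.

```python
from itertools import combinations

def induces_connected(oligomer, edges):
    """True if the contacts restricted to the oligomer's subunits connect them all."""
    members = set(oligomer)
    inner = [(a, b) for a, b in edges if a in members and b in members]
    seen, stack = set(), [next(iter(members))]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        # Push the other endpoint of every inner contact touching u.
        stack.extend(b if a == u else a for a, b in inner if u in (a, b))
    return seen == members

def minimum_connectivity(oligomers):
    """Smallest set of contacts making every oligomer connected (exhaustive search)."""
    candidates = sorted(
        {tuple(sorted(pair)) for o in oligomers for pair in combinations(o, 2)}
    )
    for size in range(len(candidates) + 1):
        for edges in combinations(candidates, size):
            if all(induces_connected(o, edges) for o in oligomers):
                return list(edges)
    return None

# Hypothetical oligomers over three subunits A, B, C.
toy_oligomers = [("A", "B", "C"), ("A", "B"), ("B", "C")]
toy_contacts = minimum_connectivity(toy_oligomers)
print(toy_contacts)
```

The search returns only one optimal solution, whereas several contact sets of the same size may exist; the weighted variant described in the abstract addresses this by also scoring the likelihood of each contact.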
20.
Proteins ; 81(11): 2034-44, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23609891

ABSTRACT

Reconstruction by data integration is an emerging trend to reconstruct large protein assemblies, but uncertainties on the input data yield average models whose quantitative interpretation is challenging. This article presents methods to probe fuzzy models of large assemblies against atomic-resolution models of subsystems. Consider a toleranced model (TOM) of a macromolecular assembly, namely a continuum of nested shapes representing the assembly at multiple scales. Also consider a template, namely an atomic-resolution 3D model of a subsystem (a complex) of this assembly. We present graph-based algorithms performing a multi-scale assessment of the complexes of the TOM, by comparing the pairwise contacts which appear in the TOM against those of the template. We apply this machinery to a TOM derived from an average model of the nuclear pore complex, to explore the connections among members of its well-characterized Y-complex.


Subjects
Proteins/chemistry; Proteins/metabolism; Algorithms; Macromolecular Substances; Models, Theoretical