Results 1 - 15 of 15
1.
Nat Methods ; 21(1): 110-116, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38036854

ABSTRACT

Artificial intelligence-based protein structure prediction methods such as AlphaFold have revolutionized structural biology. The accuracies of these predictions vary, however, and they do not take into account ligands, covalent modifications or other environmental factors. Here, we evaluate how well AlphaFold predictions can be expected to describe the structure of a protein by comparing predictions directly with experimental crystallographic maps. In many cases, AlphaFold predictions matched experimental maps remarkably closely. In other cases, even very high-confidence predictions differed from experimental maps on a global scale through distortion and domain orientation, and on a local scale in backbone and side-chain conformation. We suggest considering AlphaFold predictions as exceptionally useful hypotheses. We further suggest that it is important to consider the confidence in prediction when interpreting AlphaFold predictions and to carry out experimental structure determination to verify structural details, particularly those that involve interactions not included in the prediction.


Subject(s)
Artificial Intelligence , Mental Processes , Crystallography , Protein Conformation
2.
Acta Crystallogr D Struct Biol ; 79(Pt 3): 234-244, 2023 Mar 01.
Article in English | MEDLINE | ID: mdl-36876433

ABSTRACT

Experimental structure determination can be accelerated with artificial intelligence (AI)-based structure-prediction methods such as AlphaFold. Here, an automatic procedure requiring only sequence information and crystallographic data is presented that uses AlphaFold predictions to produce an electron-density map and a structural model. Iterating through cycles of structure prediction is a key element of this procedure: a predicted model rebuilt in one cycle is used as a template for prediction in the next cycle. This procedure was applied to X-ray data for 215 structures released by the Protein Data Bank in a recent six-month period. In 87% of cases our procedure yielded a model with at least 50% of Cα atoms matching those in the deposited models within 2 Å. Predictions from the iterative template-guided prediction procedure were more accurate than those obtained without templates. It is concluded that AlphaFold predictions obtained based on sequence information alone are usually accurate enough to solve the crystallographic phase problem with molecular replacement, and a general strategy for macromolecular structure determination that includes AI-based prediction both as a starting point and as a method of model optimization is suggested.


Subject(s)
Artificial Intelligence , Crystallography , Databases, Protein , Models, Structural
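
The success criterion quoted in the abstract above (at least 50% of Cα atoms matching the deposited model within 2 Å) can be illustrated with a minimal sketch. It assumes the two models are already superposed and that their Cα atoms are paired one-to-one; the function name and the pairing scheme are hypothetical, and the published procedure may match atoms differently.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def fraction_ca_within(model_ca: Sequence[Coord],
                       reference_ca: Sequence[Coord],
                       cutoff: float = 2.0) -> float:
    """Fraction of paired C-alpha atoms lying within `cutoff` angstroms.

    Assumption: both lists are in the same frame (already superposed)
    and element i of one list corresponds to element i of the other.
    """
    if len(model_ca) != len(reference_ca):
        raise ValueError("coordinate lists must be paired one-to-one")
    if not model_ca:
        return 0.0
    close = sum(
        1 for a, b in zip(model_ca, reference_ca) if math.dist(a, b) <= cutoff
    )
    return close / len(model_ca)


# Toy coordinates (not real data): three of the four pairs are within 2 A.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
deposited = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (5.0, 3.0, 0.0), (10.0, 0.0, 0.0)]
fraction = fraction_ca_within(model, deposited)
meets_criterion = fraction >= 0.5
```

Under these assumptions a model would count as a success when `fraction_ca_within` returns 0.5 or more.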
3.
Acta Crystallogr D Struct Biol ; 78(Pt 11): 1303-1314, 2022 Nov 01.
Article in English | MEDLINE | ID: mdl-36322415

ABSTRACT

AlphaFold has recently become an important tool in providing models for experimental structure determination by X-ray crystallography and cryo-EM. Large parts of the predicted models typically approach the accuracy of experimentally determined structures, although there are frequently local errors and errors in the relative orientations of domains. Importantly, residues in the model of a protein predicted by AlphaFold are tagged with a predicted local distance difference test score, informing users about which regions of the structure are predicted with less confidence. AlphaFold also produces a predicted aligned error matrix indicating its confidence in the relative positions of each pair of residues in the predicted model. The phenix.process_predicted_model tool downweights or removes low-confidence residues and can break a model into confidently predicted domains in preparation for molecular replacement or cryo-EM docking. These confidence metrics are further used in ISOLDE to weight torsion and atom-atom distance restraints, allowing the complete AlphaFold model to be interactively rearranged to match the docked fragments and reducing the need for the rebuilding of connecting regions.


Subject(s)
Software , Models, Molecular , Crystallography, X-Ray , Protein Conformation , Cryoelectron Microscopy
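
The idea of removing low-confidence residues, described in the abstract above, can be sketched as a simple filter on per-residue pLDDT values. This is not the phenix.process_predicted_model implementation: the function name, the threshold of 70 and the minimum segment length are illustrative assumptions, and the real tool also uses the predicted aligned error matrix to define domains, which this sketch does not attempt.

```python
from typing import List, Tuple


def confident_segments(plddt: List[float],
                       min_plddt: float = 70.0,
                       min_length: int = 1) -> List[Tuple[int, int]]:
    """Return inclusive 0-based (start, end) index pairs of contiguous
    residue runs whose pLDDT is at least `min_plddt`.

    Assumption: pLDDT is on a 0-100 scale; the default threshold is
    illustrative only.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(plddt):
        if score >= min_plddt:
            if start is None:
                start = i  # open a new confident run
        elif start is not None:
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None  # close the run at a low-confidence residue
    # Close a run that extends to the last residue.
    if start is not None and len(plddt) - start >= min_length:
        segments.append((start, len(plddt) - 1))
    return segments


# Toy per-residue scores (not real data).
scores = [90.0, 85.0, 40.0, 30.0, 75.0, 80.0, 95.0, 50.0]
kept = confident_segments(scores)
kept_long = confident_segments(scores, min_length=3)
```

Here `kept` holds the two confident runs and `kept_long` drops the shorter one, which mirrors in a very reduced form the trimming step that precedes molecular replacement or docking.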
4.
Acta Crystallogr D Struct Biol ; 77(Pt 1): 1-10, 2021 Jan 01.
Article in English | MEDLINE | ID: mdl-33404520

ABSTRACT

Crystallographic phasing strategies increasingly require the exploration and ranking of many hypotheses about the number, types and positions of atoms, molecules and/or molecular fragments in the unit cell, each with only a small chance of being correct. Accelerating this move has been improvements in phasing methods, which are now able to extract phase information from the placement of very small fragments of structure, from weak experimental phasing signal or from combinations of molecular replacement and experimental phasing information. Describing phasing in terms of a directed acyclic graph allows graph-management software to track and manage the path to structure solution. The crystallographic software supporting the graph data structure must be strictly modular so that nodes in the graph are efficiently generated by the encapsulated functionality. To this end, the development of new software, Phasertng, which uses directed acyclic graphs natively for input/output, has been initiated. In Phasertng, the codebase of Phaser has been rebuilt, with an emphasis on modularity, on scripting, on speed and on continuing algorithm development. As a first application of phasertng, its advantages are demonstrated in the context of phasertng.xtricorder, a tool to analyse and triage merged data in preparation for molecular replacement or experimental phasing. The description of the phasing strategy with directed acyclic graphs is a generalization that extends beyond the functionality of Phasertng, as it can incorporate results from bioinformatics and other crystallographic tools, and will facilitate multifaceted search strategies, dynamic ranking of alternative search pathways and the exploitation of machine learning to further improve phasing strategies.


Subject(s)
Crystallography, X-Ray , Software , Algorithms , Machine Learning , Proteins/chemistry
5.
Acta Crystallogr D Struct Biol ; 76(Pt 3): 238-247, 2020 Mar 01.
Article in English | MEDLINE | ID: mdl-32133988

ABSTRACT

The information gained by making a measurement, termed the Kullback-Leibler divergence, assesses how much more precisely the true quantity is known after the measurement was made (the posterior probability distribution) than before (the prior probability distribution). It provides an upper bound for the contribution that an observation can make to the total likelihood score in likelihood-based crystallographic algorithms. This makes information gain a natural criterion for deciding which data can legitimately be omitted from likelihood calculations. Many existing methods use an approximation for the effects of measurement error that breaks down for very weak and poorly measured data. For such methods a different (higher) information threshold is appropriate compared with methods that account well for even large measurement errors. Concerns are raised about a current trend to deposit data that have been corrected for anisotropy, sharpened and pruned without including the original unaltered measurements. If not checked, this trend will have serious consequences for the reuse of deposited data by those who hope to repeat calculations using improved new methods.


Subject(s)
Algorithms , X-Ray Diffraction/methods , Anisotropy , Likelihood Functions
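
The information gain described in the abstract above is the Kullback-Leibler divergence of the posterior from the prior. A minimal sketch for discrete distributions follows; the function name, the discrete setting and the choice of bits as the unit are assumptions made for illustration, whereas crystallographic applications work with continuous distributions over structure-factor quantities.

```python
import math
from typing import Sequence


def information_gain_bits(posterior: Sequence[float],
                          prior: Sequence[float]) -> float:
    """Kullback-Leibler divergence D(posterior || prior) in bits.

    Assumption: both arguments are discrete probability distributions
    over the same outcomes, each summing to 1.
    """
    if len(posterior) != len(prior):
        raise ValueError("distributions must cover the same outcomes")
    total = 0.0
    for p, q in zip(posterior, prior):
        if p == 0.0:
            continue  # a zero-probability outcome contributes nothing
        if q == 0.0:
            raise ValueError("posterior has mass where the prior has none")
        total += p * math.log2(p / q)
    return total


# An uninformative measurement leaves the distribution unchanged.
no_gain = information_gain_bits([0.5, 0.5], [0.5, 0.5])
# A measurement that settles a 50/50 question yields exactly one bit.
one_bit = information_gain_bits([1.0, 0.0], [0.5, 0.5])
```

A data-selection rule of the kind the abstract discusses would then compare such a value against a threshold; the threshold itself is method-dependent and is not specified here.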
6.
Acta Crystallogr D Struct Biol ; 76(Pt 1): 19-27, 2020 Jan 01.
Article in English | MEDLINE | ID: mdl-31909740

ABSTRACT

Good prior estimates of the effective root-mean-square deviation (r.m.s.d.) between the atomic coordinates of the model and the target optimize the signal in molecular replacement, thereby increasing the success rate in difficult cases. Previous studies using protein structures solved by X-ray crystallography as models showed that optimal error estimates (refined after structure solution) were correlated with the sequence identity between the model and target, and with the number of residues in the model. Here, this work has been extended to find additional correlations between parameters of the model and the target and hence improved prior estimates of the coordinate error. Using a graph database, a curated set of 6030 molecular-replacement calculations using models that had been solved by X-ray crystallography was analysed to consider about 120 model and target parameters. Improved estimates were achieved by replacing the sequence identity with the Gonnet score for sequence similarity, as well as by considering the resolution of the target structure and the MolProbity score of the model. This approach was extended by analysing 12 610 additional molecular-replacement calculations where the model was determined by NMR. The median r.m.s.d. between pairs of models in an ensemble was found to be correlated with the estimated r.m.s.d. to the target. For models solved by NMR, the overall coordinate error estimates were larger than for structures determined by X-ray crystallography, and were more highly correlated with the number of residues.


Subject(s)
Crystallography, X-Ray/methods , Models, Molecular , Protein Conformation , Proteins/chemistry , Magnetic Resonance Spectroscopy
7.
Acta Crystallogr D Struct Biol ; 75(Pt 10): 861-877, 2019 Oct 01.
Article in English | MEDLINE | ID: mdl-31588918

ABSTRACT

Diffraction (X-ray, neutron and electron) and electron cryo-microscopy are powerful methods to determine three-dimensional macromolecular structures, which are required to understand biological processes and to develop new therapeutics against diseases. The overall structure-solution workflow is similar for these techniques, but nuances exist because the properties of the reduced experimental data are different. Software tools for structure determination should therefore be tailored for each method. Phenix is a comprehensive software package for macromolecular structure determination that handles data from any of these techniques. Tasks performed with Phenix include data-quality assessment, map improvement, model building, the validation/rebuilding/refinement cycle and deposition. Each tool caters to the type of experimental data. The design of Phenix emphasizes the automation of procedures, where possible, to minimize repetitive and time-consuming manual tasks, while default parameters are chosen to encourage best practice. A graphical user interface provides access to many command-line features of Phenix and streamlines the transition between programs, project tracking and re-running of previous tasks.


Subject(s)
Automation/methods , Macromolecular Substances/chemistry , Software Design , Software Validation , Cryoelectron Microscopy/methods , Crystallography, X-Ray/methods , Models, Molecular , Molecular Conformation
8.
Acta Crystallogr D Struct Biol ; 74(Pt 4): 245-255, 2018 Apr 01.
Article in English | MEDLINE | ID: mdl-29652252

ABSTRACT

Molecular-replacement phasing of macromolecular crystal structures is often fast, but if a molecular-replacement solution is not immediately obtained the crystallographer must judge whether to pursue molecular replacement or to attempt experimental phasing as the quickest path to structure solution. The introduction of the expected log-likelihood gain [eLLG; McCoy et al. (2017), Proc. Natl Acad. Sci. USA, 114, 3637-3641] has given the crystallographer a powerful new tool to aid in making this decision. The eLLG is the log-likelihood gain on intensity [LLGI; Read & McCoy (2016), Acta Cryst. D72, 375-387] expected from a correctly placed model. It is calculated as a sum over the reflections of a function dependent on the fraction of the scattering for which the model accounts, the estimated model coordinate error and the measurement errors in the data. It is shown how the eLLG may be used to answer the question 'can I solve my structure by molecular replacement?'. However, this is only the most obvious of the applications of the eLLG. It is also discussed how the eLLG may be used to determine the search order and minimal data requirements for obtaining a molecular-replacement solution using a given model, and for decision making in fragment-based molecular replacement, single-atom molecular replacement and likelihood-guided model pruning.


Subject(s)
Crystallography, X-Ray/methods , Likelihood Functions , Models, Molecular , Decision Making , Eukaryotic Initiation Factor-2/chemistry , Humans , Protein Conformation , Protein Domains
9.
Acta Crystallogr D Struct Biol ; 74(Pt 4): 279-289, 2018 Apr 01.
Article in English | MEDLINE | ID: mdl-29652255

ABSTRACT

Descriptions are given of the maximum-likelihood gyre method implemented in Phaser for optimizing the orientation and relative position of rigid-body fragments of a model after the orientation of the model has been identified, but before the model has been positioned in the unit cell, and also the related gimble method for the refinement of rigid-body fragments of the model after positioning. Gyre refinement helps to lower the root-mean-square atomic displacements between model and target molecular-replacement solutions for the test case of antibody Fab(26-10) and improves structure solution with ARCIMBOLDO_SHREDDER.


Subject(s)
Crystallography, X-Ray/methods , Immunoglobulin Fab Fragments/chemistry , Likelihood Functions , Models, Molecular , Databases, Protein , Humans , Protein Conformation , Rotation , Software
10.
Acta Crystallogr D Struct Biol ; 74(Pt 4): 290-304, 2018 Apr 01.
Article in English | MEDLINE | ID: mdl-29652256

ABSTRACT

Macromolecular structures can be solved by molecular replacement provided that suitable search models are available. Models from distant homologues may deviate too much from the target structure to succeed, notwithstanding an overall similar fold or even their featuring areas of very close geometry. Successful methods to make the most of such templates usually rely on the degree of conservation to select and improve search models. ARCIMBOLDO_SHREDDER uses fragments derived from distant homologues in a brute-force approach driven by the experimental data, instead of by sequence similarity. The new algorithms implemented in ARCIMBOLDO_SHREDDER are described in detail, illustrating its characteristic aspects in the solution of new and test structures. In an advance from the previously published algorithm, which was based on omitting or extracting contiguous polypeptide spans, model generation now uses three-dimensional volumes respecting structural units. The optimal fragment size is estimated from the expected log-likelihood gain (LLG) values computed assuming that a substructure can be found with a level of accuracy near that required for successful extension of the structure, typically below 0.6 Å root-mean-square deviation (r.m.s.d.) from the target. Better sampling is attempted through model trimming or decomposition into rigid groups and optimization through Phaser's gyre refinement. Also, after model translation, packing filtering and refinement, models are either disassembled into predetermined rigid groups and refined (gimble refinement) or Phaser's LLG-guided pruning is used to trim the model of residues that are not contributing signal to the LLG at the target r.m.s.d. value. Phase combination among consistent partial solutions is performed in reciprocal space with ALIXE. Finally, density modification and main-chain autotracing in SHELXE serve to expand to the full structure and identify successful solutions. The performance on test data and the solution of new structures are described.


Subject(s)
Algorithms , Macromolecular Substances/chemistry , Models, Molecular , Structural Homology, Protein , Bacterial Proteins/chemistry , Computer Simulation , Crystallography, X-Ray
11.
Proc Natl Acad Sci U S A ; 114(14): 3637-3641, 2017 Apr 04.
Article in English | MEDLINE | ID: mdl-28325875

ABSTRACT

The majority of macromolecular crystal structures are determined using the method of molecular replacement, in which known related structures are rotated and translated to provide an initial atomic model for the new structure. A theoretical understanding of the signal-to-noise ratio in likelihood-based molecular replacement searches has been developed to account for the influence of model quality and completeness, as well as the resolution of the diffraction data. Here we show that, contrary to current belief, molecular replacement need not be restricted to the use of models comprising a substantial fraction of the unknown structure. Instead, likelihood-based methods allow a continuum of applications depending predictably on the quality of the model and the resolution of the data. Unexpectedly, our understanding of the signal-to-noise ratio in molecular replacement leads to the finding that, with data to sufficiently high resolution, fragments as small as single atoms of elements usually found in proteins can yield ab initio solutions of macromolecular structures, including some that elude traditional direct methods.


Subject(s)
Crystallography, X-Ray/methods , Proteins/chemistry , Algorithms , Computational Biology/methods , Likelihood Functions , Models, Molecular , Protein Conformation , Signal-To-Noise Ratio
12.
Acta Crystallogr D Biol Crystallogr ; 70(Pt 1): 144-54, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24419387

ABSTRACT

High-throughput drug-discovery and mechanistic studies often require the determination of multiple related crystal structures that only differ in the bound ligands, point mutations in the protein sequence and minor conformational changes. If performed manually, solution and refinement requires extensive repetition of the same tasks for each structure. To accelerate this process and minimize manual effort, a pipeline encompassing all stages of ligand building and refinement, starting from integrated and scaled diffraction intensities, has been implemented in Phenix. The resulting system is able to successfully solve and refine large collections of structures in parallel without extensive user intervention prior to the final stages of model completion and validation.


Subject(s)
Crystallography, X-Ray/methods , Proteins/chemistry , Animals , Drug Design , Factor Xa/chemistry , Factor Xa/metabolism , HIV Protease/chemistry , HIV Protease/metabolism , HIV-1/enzymology , Humans , Ligands , Models, Molecular , Protein Binding , Proteins/metabolism , Thrombin/chemistry , Thrombin/metabolism
13.
Acta Crystallogr D Biol Crystallogr ; 69(Pt 11): 2209-15, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24189232

ABSTRACT

The estimate of the root-mean-square deviation (r.m.s.d.) in coordinates between the model and the target is an essential parameter for calibrating likelihood functions for molecular replacement (MR). Good estimates of the r.m.s.d. lead to good estimates of the variance term in the likelihood functions, which increases signal to noise and hence success rates in the MR search. Phaser has hitherto used an estimate of the r.m.s.d. that only depends on the sequence identity between the model and target and which was not optimized for the MR likelihood functions. Variance-refinement functionality was added to Phaser to enable determination of the effective r.m.s.d. that optimized the log-likelihood gain (LLG) for a correct MR solution. Variance refinement was subsequently performed on a database of over 21,000 MR problems that sampled a range of sequence identities, protein sizes and protein fold classes. Success was monitored using the translation-function Z-score (TFZ), where a TFZ of 8 or over for the top peak was found to be a reliable indicator that MR had succeeded for these cases with one molecule in the asymmetric unit. Good estimates of the r.m.s.d. are correlated with the sequence identity and the protein size. A new estimate of the r.m.s.d. that uses these two parameters in a function optimized to fit the mean of the refined variance is implemented in Phaser and improves MR outcomes. Perturbing the initial estimate of the r.m.s.d. from the mean of the distribution in steps of standard deviations of the distribution further increases MR success rates.


Subject(s)
Amino Acid Sequence , Amino Acid Substitution , Databases, Protein/trends , Signal-To-Noise Ratio , Amino Acid Sequence/genetics , Amino Acid Substitution/genetics , Crystallography, X-Ray/instrumentation , Crystallography, X-Ray/methods , Databases, Protein/classification , Likelihood Functions , Models, Molecular , Mutation , Protein Folding , Sequence Alignment , Software , X-Ray Diffraction
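
The abstract above uses a translation-function Z-score (TFZ) of 8 or more for the top peak as an indicator of success. A Z-score expresses how many standard deviations a value lies above a mean, which can be sketched as follows. This is not Phaser's implementation: the function name is hypothetical, and taking the mean and standard deviation from a supplied list of background scores is a simplifying assumption.

```python
import statistics
from typing import Sequence


def top_peak_z_score(top_score: float, background: Sequence[float]) -> float:
    """Number of standard deviations by which `top_score` exceeds the
    mean of `background`.

    Assumption: `background` is a representative sample of scores for
    incorrect placements; how such a sample is drawn is not modeled.
    """
    mean = statistics.fmean(background)
    spread = statistics.pstdev(background)  # population standard deviation
    if spread == 0.0:
        raise ValueError("background scores have zero spread")
    return (top_score - mean) / spread


# Toy scores (not real data): mean 10, standard deviation sqrt(2).
background_scores = [8.0, 9.0, 10.0, 11.0, 12.0]
strong = top_peak_z_score(24.0, background_scores)
weak = top_peak_z_score(15.0, background_scores)
looks_solved = strong >= 8.0  # threshold quoted in the abstract
```

With these toy numbers the first peak clears the threshold of 8 and the second does not. The abstract reports that threshold for cases with one molecule in the asymmetric unit.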
14.
Acta Crystallogr D Biol Crystallogr ; 69(Pt 11): 2276-86, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24189240

ABSTRACT

Phaser.MRage is a molecular-replacement automation framework that implements a full model-generation workflow and provides several layers of model exploration to the user. It is designed to handle a large number of models and can distribute calculations efficiently onto parallel hardware. In addition, phaser.MRage can identify correct solutions and use this information to accelerate the search. Firstly, it can quickly score all alternative models of a component once a correct solution has been found. Secondly, it can perform extensive analysis of identified solutions to find protein assemblies and can employ assembled models for subsequent searches. Thirdly, it is able to use a priori assembly information (derived from, for example, homologues) to speculatively place and score molecules, thereby customizing the search procedure to a certain class of protein molecule (for example, antibodies) and incorporating additional biological information into molecular replacement.


Subject(s)
Amino Acid Substitution , Computational Biology/methods , Databases, Protein , Software , Artificial Intelligence , Crystallography, X-Ray/methods , Crystallography, X-Ray/trends , Databases, Protein/standards , Models, Molecular , Protein Multimerization , Protein Structure, Tertiary
15.
Methods ; 55(1): 94-106, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21821126

ABSTRACT

X-ray crystallography is a critical tool in the study of biological systems. It is able to provide information that has been a prerequisite to understanding the fundamentals of life. It is also a method that is central to the development of new therapeutics for human disease. Significant time and effort are required to determine and optimize many macromolecular structures because of the need for manual interpretation of complex numerical data, often using many different software packages, and the repeated use of interactive three-dimensional graphics. The Phenix software package has been developed to provide a comprehensive system for macromolecular crystallographic structure solution with an emphasis on automation. This has required the development of new algorithms that minimize or eliminate subjective input in favor of built-in expert-systems knowledge, the automation of procedures that are traditionally performed by hand, and the development of a computational framework that allows a tight integration between the algorithms. The application of automated methods is particularly appropriate in the field of structural proteomics, where high throughput is desired. Features in Phenix for the automation of experimental phasing with subsequent model building, molecular replacement, structure refinement and validation are described and examples given of running Phenix from both the command line and graphical user interface.


Subject(s)
Automation, Laboratory/methods , Crystallography, X-Ray , Data Collection/methods , Proteins/analysis , Proteomics/methods , Software , Algorithms , Automation, Laboratory/instrumentation , Crystallography, X-Ray/instrumentation , Crystallography, X-Ray/methods , High-Throughput Screening Assays , Molecular Structure , Proteins/chemistry