1.
Proteins ; 83(11): 1947-62, 2015 Nov.
Article in English | MEDLINE | ID: mdl-25820805

ABSTRACT

For many membrane proteins, the determination of their topology remains a challenge for methods like X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. Electron paramagnetic resonance (EPR) spectroscopy has evolved as an alternative technique to study structure and dynamics of membrane proteins. The present study demonstrates the feasibility of membrane protein topology determination using limited EPR distance and accessibility measurements. The BCL::MP-Fold (BioChemical Library membrane protein fold) algorithm assembles secondary structure elements (SSEs) in the membrane using a Monte Carlo Metropolis (MCM) approach. Sampled models are evaluated using knowledge-based potential functions and agreement with the EPR data. Twenty-nine membrane proteins of up to 696 residues are used to test the algorithm. The RMSD100 value of the most accurate model is better than 8 Å for 27, better than 6 Å for 22, and better than 4 Å for 15 of the 29 proteins, demonstrating the algorithm's ability to sample the native topology. The use of EPR data improved the average enrichment from 1.3 to 2.5, showing improved discrimination power.
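The RMSD100 values quoted in this abstract are length-normalized RMSDs. The abstract does not state the formula; the sketch below assumes the widely used normalization RMSD100 = RMSD / (1 + ln sqrt(N/100)), where N is the number of compared residues. The function name `rmsd100` is illustrative and not part of BCL.

```python
import math

def rmsd100(rmsd: float, n_residues: int) -> float:
    """Normalize an RMSD (in Å) to a reference length of 100 residues.

    Assumed formula: RMSD100 = RMSD / (1 + ln(sqrt(N / 100))).
    """
    denominator = 1.0 + math.log(math.sqrt(n_residues / 100.0))
    if denominator <= 0.0:
        # The normalization breaks down for very short chains (N below about 14).
        raise ValueError("n_residues is too small for RMSD100 normalization")
    return rmsd / denominator
```

Under this assumed formula, a protein of exactly 100 residues keeps its RMSD unchanged, longer proteins are scaled down, and shorter ones are scaled up, which is what makes values comparable across a benchmark with very different chain lengths.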


Subject(s)
Membrane Proteins/chemistry , Membrane Proteins/metabolism , Protein Folding , Electron Spin Resonance Spectroscopy , Magnetic Resonance Spectroscopy , Models, Molecular , Protein Conformation
2.
Proteins ; 82(4): 587-95, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24123100

ABSTRACT

When experimental protein NMR data are too sparse to apply traditional structure determination techniques, de novo protein structure prediction methods can be leveraged. Here, we describe the incorporation of NMR restraints into the protein structure prediction algorithm BCL::Fold. The method assembles discrete secondary structure elements using a Monte Carlo sampling algorithm with a consensus knowledge-based energy function. New components were introduced into the energy function to accommodate chemical shift, nuclear Overhauser effect, and residual dipolar coupling data. In particular, since side chains are not explicitly modeled during the minimization process, a knowledge-based potential was created to relate experimental side chain proton-proton distances to Cβ-Cβ distances. In a benchmark test of 67 proteins of known structure with the incorporation of sparse NMR restraints, the correct topology was sampled in 65 cases, with an average best model RMSD100 of 3.4 ± 1.3 Å versus 6.0 ± 2.0 Å produced with the de novo method. Additionally, the correct topology is present in the best scoring 1% of models in 61 cases. The benchmark set includes both soluble and membrane proteins with up to 565 residues, indicating the method is robust and applicable to large and membrane proteins that are less likely to produce rich NMR datasets.


Subject(s)
Nuclear Magnetic Resonance, Biomolecular/methods , Proteins/chemistry , Proteins/ultrastructure , Algorithms , Models, Chemical , Models, Molecular , Monte Carlo Method , Protein Conformation , Protein Folding , Protein Structure, Secondary , Proteins/metabolism
3.
Structure ; 21(7): 1107-17, 2013 Jul 02.
Article in English | MEDLINE | ID: mdl-23727232

ABSTRACT

Membrane protein structure determination remains a challenging endeavor. Computational methods that predict membrane protein structure from sequence can potentially aid structure determination for such difficult target proteins. The de novo protein structure prediction method BCL::Fold rapidly assembles secondary structure elements into three-dimensional models. Here, we describe modifications to the algorithm, named BCL::MP-Fold, in order to simulate membrane protein folding. Models are built into a static membrane object and are evaluated using a knowledge-based energy potential, which has been modified to account for the membrane environment. Additionally, a symmetry folding mode allows for the prediction of obligate homomultimers, a common property among membrane proteins. In a benchmark test of 40 proteins of known structure, the method sampled the correct topology in 34 cases. This demonstrates that the algorithm can accurately predict protein topology without the need for large multiple sequence alignments, homologous template structures, or experimental restraints.


Subject(s)
Membrane Proteins/chemistry , Algorithms , Databases, Protein , Humans , Hydrophobic and Hydrophilic Interactions , Models, Molecular , Monte Carlo Method , Protein Folding , Protein Structure, Secondary , Protein Structure, Tertiary , Protein Subunits/chemistry , Sequence Analysis, Protein , Solubility
4.
Proteins ; 81(7): 1127-40, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23349002

ABSTRACT

Prediction of transmembrane spans and secondary structure from the protein sequence is generally the first step in the structural characterization of (membrane) proteins. The propensity of a stretch of amino acids to form secondary structure and its propensity to reside in the membrane are correlated. Nevertheless, current methods predict either secondary structure or individual transmembrane states. We introduce a method that simultaneously predicts the secondary structure and transmembrane spans from the protein sequence. This approach not only eliminates the necessity to create a consensus prediction from possibly contradicting outputs of several predictors but also bears the potential to predict conformational switches, i.e., sequence regions that have a high probability to change, for example, from a coil conformation in solution to an α-helical transmembrane state. An artificial neural network was trained on databases of 177 membrane proteins and 6048 soluble proteins. The output is a 3 × 3 probability matrix for each residue in the sequence that combines three secondary structure types (helix, strand, coil) and three environment types (membrane core, interface, solution). The prediction accuracies are 70.3% for nine possible states, 73.2% for three-state secondary structure prediction, and 94.8% for three-state transmembrane span prediction. These accuracies are comparable to state-of-the-art predictors of secondary structure (e.g., Psipred) or transmembrane placement (e.g., OCTOPUS). The method is available as a web server and for download at www.meilerlab.org.
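A minimal sketch of how a per-residue 3 × 3 probability matrix of this kind could be reduced to the two three-state predictions and to a single nine-state call. The row/column orientation, the function names, and the example values are assumptions made for illustration; they are not taken from the published method.

```python
SECONDARY_STRUCTURE = ("helix", "strand", "coil")
ENVIRONMENT = ("membrane core", "interface", "solution")

def marginals(matrix):
    """Collapse a 3 x 3 joint probability matrix into two three-state predictions.

    Rows are assumed to index secondary structure, columns the environment.
    """
    ss = [sum(row) for row in matrix]
    env = [sum(matrix[i][j] for i in range(3)) for j in range(3)]
    return ss, env

def most_likely(matrix):
    """Return the labels of the most probable of the nine joint states."""
    i, j = max(((i, j) for i in range(3) for j in range(3)),
               key=lambda ij: matrix[ij[0]][ij[1]])
    return SECONDARY_STRUCTURE[i], ENVIRONMENT[j]

# Hypothetical output for one residue (values chosen for illustration only).
example = [
    [0.5, 0.125, 0.0],    # helix
    [0.0, 0.125, 0.0],    # strand
    [0.0, 0.125, 0.125],  # coil
]
ss_probs, env_probs = marginals(example)
```

Summing over rows or columns recovers the three-state secondary structure and three-state environment probabilities from the joint matrix, which is why a single network output can serve all three accuracy measures quoted in the abstract.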


Subject(s)
Membrane Proteins/chemistry , Neural Networks, Computer , Protein Structure, Secondary , Proteins/chemistry , Algorithms , Amino Acid Sequence , Databases, Protein , Membrane Proteins/classification , Sequence Alignment , Software
5.
PLoS One ; 7(11): e49240, 2012.
Article in English | MEDLINE | ID: mdl-23173050

ABSTRACT

Computational de novo protein structure prediction is limited to small proteins of simple topology. The present work explores an approach to extend beyond the current limitations through assembling protein topologies from idealized α-helices and β-strands. The algorithm performs a Monte Carlo Metropolis simulated annealing folding simulation. It optimizes a knowledge-based potential that analyzes radius of gyration, β-strand pairing, secondary structure element (SSE) packing, amino acid pair distance, amino acid environment, contact order, secondary structure prediction agreement and loop closure. Discontinuation of the protein chain favors sampling of non-local contacts and thereby creation of complex protein topologies. The folding simulation is accelerated through exclusion of flexible loop regions further reducing the size of the conformational search space. The algorithm is benchmarked on 66 proteins with lengths between 83 and 293 amino acids. For 61 out of these proteins, the best SSE-only models obtained have an RMSD100 below 8.0 Å and recover more than 20% of the native contacts. The algorithm assembles protein topologies with up to 215 residues and a relative contact order of 0.46. The method is tailored to be used in conjunction with low-resolution or sparse experimental data sets which often provide restraints for regions of defined secondary structure.
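The Monte Carlo Metropolis simulated annealing scheme named in this abstract can be illustrated on a toy problem. The sketch below is a generic textbook version, not the BCL::Fold implementation: the energy function, move set, cooling schedule, and all parameter values are hypothetical stand-ins for the knowledge-based potential and SSE moves described above.

```python
import math
import random

def metropolis_accept(delta_e: float, temperature: float, rng: random.Random) -> bool:
    """Metropolis criterion: always accept downhill moves, accept uphill
    moves with probability exp(-delta_e / temperature)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def anneal(energy, propose, start, steps=2000, t_start=5.0, t_end=0.01, seed=0):
    """Toy simulated annealing loop with a geometric cooling schedule.

    energy, propose, and all default parameters are hypothetical stand-ins.
    """
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    cooling = (t_end / t_start) ** (1.0 / max(steps - 1, 1))
    temperature = t_start
    for _ in range(steps):
        candidate = propose(current, rng)
        e_candidate = energy(candidate)
        if metropolis_accept(e_candidate - e_current, temperature, rng):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
        temperature *= cooling
    return best, e_best

# Toy usage: minimize a one-dimensional double-well "energy".
def toy_energy(x):
    return (x * x - 1.0) ** 2 + 0.3 * x

def toy_move(x, rng):
    return x + rng.uniform(-0.5, 0.5)

best_x, best_e = anneal(toy_energy, toy_move, start=3.0)
```

At high temperature the loop accepts many uphill moves and explores broadly; as the temperature falls it behaves more and more like a greedy minimizer. Tracking the best state separately from the current one guarantees the returned energy is never worse than the starting energy.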


Subject(s)
Algorithms , Computational Biology/methods , Proteins/chemistry , Benchmarking , Humans , Models, Molecular , Monte Carlo Method , Protein Structure, Secondary , Quality Control
6.
PLoS One ; 7(11): e49242, 2012.
Article in English | MEDLINE | ID: mdl-23173051

ABSTRACT

The topology of most experimentally determined protein domains is defined by the relative arrangement of secondary structure elements, i.e. α-helices and β-strands, which make up 50-70% of the sequence. Pairing of β-strands defines the topology of β-sheets. The packing of side chains between α-helices and β-sheets defines the majority of the protein core. Often, limited experimental datasets restrain the position of secondary structure elements while lacking detail with respect to loop or side chain conformation. At the same time the regular structure and reduced flexibility of secondary structure elements make these interactions more predictable when compared to flexible loops and side chains. To determine the topology of the protein in such settings, we introduce a tailored knowledge-based energy function that evaluates arrangement of secondary structure elements only. Based on the amino acid Cβ atom coordinates within secondary structure elements, potentials for amino acid pair distance, amino acid environment, secondary structure element packing, β-strand pairing, loop length, radius of gyration, contact order and secondary structure prediction agreement are defined. Separate penalty functions exclude conformations with clashes between amino acids or secondary structure elements and loops that cannot be closed. Each individual term discriminates for native-like protein structures. The composite potential significantly enriches for native-like models in three different databases of 10,000-12,000 protein models in 80-94% of the cases. The corresponding application, "BCL::ScoreProtein," is available at www.meilerlab.org.


Subject(s)
Computational Biology/methods , Models, Molecular , Proteins/chemistry , Algorithms , Bayes Theorem , Protein Structure, Secondary , Protein Structure, Tertiary , Rotation , Thermodynamics
7.
Structure ; 20(3): 464-78, 2012 Mar 07.
Article in English | MEDLINE | ID: mdl-22405005

ABSTRACT

Electron density maps of membrane proteins or large macromolecular complexes are frequently only determined at medium resolution between 4 Å and 10 Å, either by cryo-electron microscopy or X-ray crystallography. In these density maps, the general arrangement of secondary structure elements (SSEs) is revealed, whereas their directionality and connectivity remain elusive. We demonstrate that the topology of proteins with up to 250 amino acids can be determined from such density maps when combined with a computational protein folding protocol. Furthermore, we accurately reconstruct atomic detail in loop regions and amino acid side chains not visible in the experimental data. The EM-Fold algorithm assembles the SSEs de novo before atomic detail is added using Rosetta. In a benchmark of 27 proteins, the protocol consistently and reproducibly achieves models with root mean square deviation values <3 Å.


Subject(s)
Algorithms , Macromolecular Substances/chemistry , Membrane Proteins/chemistry , Models, Molecular , Molecular Biology/methods , Protein Conformation , Software , Cryoelectron Microscopy/methods , Crystallography, X-Ray/methods , Protein Folding
8.
Biopolymers ; 97(9): 669-77, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22302372

ABSTRACT

EM-Fold was used to build models for nine proteins in the maps of GroEL (7.7 Å resolution) and ribosome (6.4 Å resolution) in the ab initio modeling category of the 2010 cryo-electron microscopy modeling challenge. EM-Fold assembles predicted secondary structure elements (SSEs) into regions of the density map that were identified to correspond to either α-helices or β-strands. The assembly uses a Monte Carlo algorithm where loop closure, density-SSE length agreement, and strength of connecting density between SSEs are evaluated. Top-scoring models are refined by translating, rotating, and bending SSEs to yield better agreement with the density map. EM-Fold produces models that contain backbone atoms within SSEs only. The RMSD values of the models with respect to native range from 2.4 to 3.5 Å for six of the nine proteins. EM-Fold failed to predict the correct topology in three cases. Subsequently, Rosetta was used to build loops and side chains for the very best scoring models after EM-Fold refinement. The refinement within Rosetta's force field is driven by a density agreement score that calculates a cross-correlation between a density map simulated from the model and the experimental density map. All-atom RMSDs as low as 3.4 Å are achieved in favorable cases. Values above 10.0 Å are observed for two proteins with low overall content of secondary structure and hence particularly complex loop modeling problems. RMSDs over residues in secondary structure elements range from 2.5 to 4.8 Å.
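The density agreement score mentioned in this abstract is described only as a cross-correlation between a simulated and an experimental map. As a rough illustration, the sketch below computes a plain Pearson cross-correlation over voxel values; the exact form used in Rosetta is not given here, and the function name and the flat-list map representation are assumptions.

```python
import math

def cross_correlation(simulated, experimental):
    """Pearson cross-correlation between two density maps given as
    flat, equal-length sequences of voxel values."""
    if len(simulated) != len(experimental) or not simulated:
        raise ValueError("maps must be non-empty and of equal size")
    n = len(simulated)
    mean_s = sum(simulated) / n
    mean_e = sum(experimental) / n
    cov = sum((s - mean_s) * (e - mean_e) for s, e in zip(simulated, experimental))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_s == 0.0 or var_e == 0.0:
        raise ValueError("a constant map has no defined correlation")
    return cov / math.sqrt(var_s * var_e)
```

A value near 1 means the model's simulated density rises and falls with the experimental density. Because the measure is invariant to overall scaling and offset, the two maps do not need to share the same absolute density units.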


Subject(s)
Computational Biology/methods , Cryoelectron Microscopy/methods , Protein Folding , Proteins/chemistry , Models, Molecular , Protein Structure, Secondary
9.
J Comput Biol ; 17(2): 153-68, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19772383

ABSTRACT

Knowledge of all residue-residue contacts within a protein allows determination of the protein fold. Accurate prediction of even a subset of long-range contacts (contacts between amino acids far apart in sequence) can be instrumental for determining tertiary structure. Here we present BCL::Contact, a novel contact prediction method that utilizes artificial neural networks (ANNs) and specializes in the prediction of medium to long-range contacts. BCL::Contact comes in two modes: sequence-based and structure-based. The sequence-based mode uses only sequence information and has individual ANNs specialized for helix-helix, helix-strand, strand-helix, strand-strand, and sheet-sheet contacts. The structure-based mode combines results from 32 fold recognition methods with sequence information to a consensus prediction. The two methods were presented in the 6th and 7th Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments. The present work focuses on elucidating the impact of fold recognition results on contact prediction via a direct comparison of both methods on a joint benchmark set of proteins. The sequence-based mode predicted contacts with 42% accuracy (7% false positive rate), while the structure-based mode achieved 45% accuracy (2% false positive rate). Predictions by both modes of BCL::Contact were supplied as input to the protein tertiary structure prediction program Rosetta for a benchmark of 17 proteins with no close sequence homologs in the Protein Data Bank (PDB). Rosetta created higher accuracy models, signified by an improvement of 1.3 Å on average root mean square deviation (RMSD), when driven by the predicted contacts. Further, filtering Rosetta models by agreement with the predicted contacts enriches for native-like fold topologies.
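To make the notion of a long-range contact concrete, the sketch below lists residue pairs that are close in space but far apart in sequence. The 8 Å distance cutoff and the minimum sequence separation of 12 are illustrative assumptions, as is the use of one representative atom per residue; the abstract does not state the thresholds used by BCL::Contact.

```python
import math

def long_range_contacts(coords, cutoff=8.0, min_separation=12):
    """List residue pairs (i, j) whose representative atoms lie within
    `cutoff` Å and that are at least `min_separation` apart in sequence.
    Both thresholds are illustrative assumptions."""
    contacts = []
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts.append((i, j))
    return contacts

# Hypothetical 20-residue hairpin: two antiparallel 10-residue segments 5 Å apart.
hairpin = [(3.8 * i, 0.0, 0.0) for i in range(10)] + \
          [(3.8 * (19 - i), 5.0, 0.0) for i in range(10, 20)]
contacts = long_range_contacts(hairpin)
```

In the toy hairpin, the first and last residues sit opposite each other and are reported as a contact even though they are 19 positions apart in sequence, which is exactly the kind of pair that constrains the overall fold.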


Subject(s)
Caspase 7/chemistry , Models, Molecular , Protein Folding , Sequence Analysis, Protein , Algorithms , Computer Simulation , Humans , Neural Networks, Computer , Protein Conformation
10.
Structure ; 17(7): 990-1003, 2009 Jul 15.
Article in English | MEDLINE | ID: mdl-19604479

ABSTRACT

In medium-resolution (7-10 Å) cryo-electron microscopy (cryo-EM) density maps, alpha helices can be identified as density rods whereas beta-strand or loop regions are not as easily discerned. We propose a computational protein structure prediction algorithm, "EM-Fold," that resolves the density rod connectivity ambiguity by placing predicted alpha helices into the density rods and adding missing backbone coordinates in loop regions. In a benchmark of 11 mainly alpha-helical proteins of known structure, a native-like model is identified in eight cases (RMSD 3.9-7.9 Å). The three failures can be attributed to inaccuracies in the secondary structure prediction step that precedes EM-Fold. EM-Fold has been applied to the approximately 6 Å resolution cryo-EM density map of protein IIIa from human adenovirus. We report the first topological model for the alpha-helical 400 residue N-terminal region of protein IIIa. EM-Fold also has the potential to interpret medium-resolution density maps in X-ray crystallography.


Subject(s)
Microscopy, Electron , Protein Folding , Protein Structure, Secondary , Proteins/chemistry , Software , Adenoviruses, Human/chemistry , Adenoviruses, Human/ultrastructure , Algorithms , Amino Acid Sequence , Animals , Cattle , Computational Biology , Computer Simulation , Cryoelectron Microscopy , Crystallography, X-Ray , Databases, Protein , Humans , Models, Chemical , Models, Molecular , Molecular Sequence Data , Monte Carlo Method , Protein Conformation , ROC Curve , Rhodopsin/chemistry , Rhodopsin/ultrastructure , Sequence Homology, Amino Acid