Search | VHL Regional Portal

1.

The evolutionary history of topological variations in the CPA/AT transporters.

Sudha, Govindarajan; Bassot, Claudio; Lamb, John; Shu, Nanjiang; Huang, Yan; Elofsson, Arne.

PLoS Comput Biol ; 17(8): e1009278, 2021 08.

Article in English | MEDLINE | ID: mdl-34403419

ABSTRACT

CPA/AT transporters are made up of a scaffold and a core domain. The core domain contains two non-canonical helices (broken or reentrant) that mediate the transport of ions, amino acids or other charged compounds. During evolution, these transporters have undergone substantial changes in structure, topology and function. To shed light on these structural transitions, we create models for all families using an integrated topology annotation method. We find that the CPA/AT transporters can be classified into four fold-types based on their structure: (1) the CPA-broken fold-type, (2) the CPA-reentrant fold-type, (3) the BART fold-type, and (4) a previously undescribed fold-type, the Reentrant-Helix-Reentrant fold-type. Several topological transitions are identified, including the transition between a broken and reentrant helix, one transition between a loop and a reentrant helix, complete changes of orientation, and changes in the number of scaffold helices. These transitions are mainly caused by gene duplication and shuffling events. Structural models, topology information and other details are presented in a searchable database, CPAfold (cpafold.bioinfo.se).

Subject(s)

Evolution, Molecular , Membrane Transport Proteins/chemistry , Animals , Humans , Models, Molecular , Protein Conformation

2.

Improved protein model quality assessments by changing the target function.

Uziela, Karolis; Menéndez Hurtado, David; Shu, Nanjiang; Wallner, Björn; Elofsson, Arne.

Proteins ; 86(6): 654-663, 2018 06.

Article in English | MEDLINE | ID: mdl-29524250

ABSTRACT

Protein modeling quality is an important part of protein structure prediction. We have for more than a decade developed a set of methods for this problem. We have used various types of description of the protein and different machine learning methodologies. However, common to all these methods has been the target function used for training. The target function in ProQ describes the local quality of a residue in a protein model. In all versions of ProQ the target function has been the S-score. However, other quality estimation functions also exist, which can be divided into superposition- and contact-based methods. The superposition-based methods, such as S-score, are based on a rigid body superposition of a protein model and the native structure, while the contact-based methods compare the local environment of each residue. Here, we examine the effects of retraining our latest predictor, ProQ3D, using identical inputs but different target functions. We find that the contact-based methods are easier to predict and that predictors trained on these measures provide some advantages when it comes to identifying the best model. One possible reason for this is that contact-based methods are better at estimating the quality of multi-domain targets. However, training on the S-score gives the best correlation with the GDT_TS score, which is commonly used in CASP to score the global model quality. To take advantage of both of these features we provide an updated version of ProQ3D that predicts local and global model quality estimates based on different quality estimates.

Subject(s)

Models, Molecular , Proteins/chemistry , Algorithms , Databases, Protein , Machine Learning , Protein Conformation , Software , Structure-Activity Relationship
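
Note on record 2: the abstract names the S-score as a superposition-based local target function but does not define it. A minimal sketch follows, assuming the commonly cited form S_i = 1 / (1 + (d_i / d0)^2), where d_i is the distance between a residue in the superimposed model and the same residue in the native structure. The cutoff d0 = 3.0 Å, the function names, and the use of a plain mean for the global score are illustrative assumptions, not details taken from the record.

```python
from typing import Sequence


def s_score(distance: float, d0: float = 3.0) -> float:
    """Local S-score for one residue (assumed form; d0 in angstroms is an assumption)."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + (distance / d0) ** 2)


def global_s_score(distances: Sequence[float], d0: float = 3.0) -> float:
    """Global score taken here as the mean of the local S-scores (an assumption)."""
    if not distances:
        raise ValueError("at least one residue distance is required")
    return sum(s_score(d, d0) for d in distances) / len(distances)
```

Under this form, a residue placed exactly right scores 1.0 and a residue displaced by d0 scores 0.5, so the score decays smoothly rather than cutting off at a fixed threshold.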

3.

Topology of membrane proteins-predictions, limitations and variations.

Tsirigos, Konstantinos D; Govindarajan, Sudha; Bassot, Claudio; Västermark, Åke; Lamb, John; Shu, Nanjiang; Elofsson, Arne.

Curr Opin Struct Biol ; 50: 9-17, 2018 06.

Article in English | MEDLINE | ID: mdl-29100082

ABSTRACT

Transmembrane proteins perform a variety of important biological functions necessary for the survival and growth of the cells. Membrane proteins are built up by transmembrane segments that span the lipid bilayer. The segments can either be in the form of hydrophobic alpha-helices or beta-sheets which create a barrel. A fundamental aspect of the structure of transmembrane proteins is the membrane topology, that is, the number of transmembrane segments, their position in the protein sequence and their orientation in the membrane. Along these lines, many predictive algorithms for the prediction of the topology of alpha-helical and beta-barrel transmembrane proteins exist. The newest algorithms obtain an accuracy close to 80% both for alpha-helical and beta-barrel transmembrane proteins. However, lately it has been shown that the simplified picture presented when describing a protein family by its topology is limited. To demonstrate this, we highlight examples where the topology is either not conserved in a protein superfamily or where the structure cannot be described solely by the topology of a protein. The prediction of these non-standard features from sequence alone was not successful until the recent revolutionary progress in 3D-structure prediction of proteins.

Subject(s)

Membrane Proteins/chemistry , Models, Molecular , Quantitative Structure-Activity Relationship , Computational Biology/methods , Computer Simulation , Databases, Protein , Protein Conformation , Software

4.

ProQ3D: improved model quality assessments using deep learning.

Uziela, Karolis; Menéndez Hurtado, David; Shu, Nanjiang; Wallner, Björn; Elofsson, Arne.

Bioinformatics ; 33(10): 1578-1580, 2017 May 15.

Article in English | MEDLINE | ID: mdl-28052925

ABSTRACT

SUMMARY: Protein quality assessment is a long-standing problem in bioinformatics. For more than a decade we have developed state-of-the-art predictors by carefully selecting and optimising inputs to a machine learning method. The correlation has increased from 0.60 in ProQ to 0.81 in ProQ2 and 0.85 in ProQ3 mainly by adding a large set of carefully tuned descriptions of a protein. Here, we show that a substantial improvement can be obtained using exactly the same inputs as in ProQ2 or ProQ3 but replacing the support vector machine by a deep neural network. This improves the Pearson correlation to 0.90 (0.85 using ProQ2 input features). AVAILABILITY AND IMPLEMENTATION: ProQ3D is freely available both as a webserver and a stand-alone program at http://proq3.bioinfo.se/. CONTACT: arne@bioinfo.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Computational Biology/methods , Neural Networks, Computer , Protein Conformation , Software , Support Vector Machine , Models, Molecular

5.

ProQ3: Improved model quality assessments using Rosetta energy terms.

Uziela, Karolis; Shu, Nanjiang; Wallner, Björn; Elofsson, Arne.

Sci Rep ; 6: 33509, 2016 10 04.

Article in English | MEDLINE | ID: mdl-27698390

ABSTRACT

Quality assessment of protein models using no other information than the structure of the model itself has been shown to be useful for structure prediction. Here, we introduce two novel methods, ProQRosFA and ProQRosCen, inspired by the state-of-art method ProQ2, but using a completely different description of a protein model. ProQ2 uses contacts and other features calculated from a model, while the new predictors are based on Rosetta energies: ProQRosFA uses the full-atom energy function that takes into account all atoms, while ProQRosCen uses the coarse-grained centroid energy function. The two new predictors also include residue conservation and terms corresponding to the agreement of a model with predicted secondary structure and surface area, as in ProQ2. We show that the performance of these predictors is on par with ProQ2 and significantly better than all other model quality assessment programs. Furthermore, we show that combining the input features from all three predictors, the resulting predictor ProQ3 performs better than any of the individual methods. ProQ3, ProQRosFA and ProQRosCen are freely available both as a webserver and stand-alone programs at http://proq3.bioinfo.se/.

Subject(s)

Algorithms , Models, Molecular , Proteins/chemistry , Databases, Protein , Statistics, Nonparametric , Support Vector Machine , Thermodynamics , Time Factors

6.

Inclusion of dyad-repeat pattern improves topology prediction of transmembrane β-barrel proteins.

Hayat, Sikander; Peters, Christoph; Shu, Nanjiang; Tsirigos, Konstantinos D; Elofsson, Arne.

Bioinformatics ; 32(10): 1571-3, 2016 05 15.

Article in English | MEDLINE | ID: mdl-26794316

ABSTRACT

Accurate topology prediction of transmembrane β-barrels is still an open question. Here, we present BOCTOPUS2, an improved topology prediction method for transmembrane β-barrels that can also identify the barrel domain, predict the topology and identify the orientation of residues in transmembrane β-strands. The major novelty of BOCTOPUS2 is the use of the dyad-repeat pattern of lipid and pore facing residues observed in transmembrane β-barrels. In a cross-validation test on a benchmark set of 42 proteins, BOCTOPUS2 predicts the correct topology in 69% of the proteins, an improvement of more than 10% over the best earlier method (BOCTOPUS) and in addition, it produces significantly fewer erroneous predictions on non-transmembrane β-barrel proteins. AVAILABILITY AND IMPLEMENTATION: BOCTOPUS2 webserver along with full dataset and source code is available at http://boctopus.bioinfo.se/. CONTACT: arne@bioinfo.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Membrane Proteins/chemistry , Computational Biology , Models, Molecular , Programming Languages , Protein Structure, Secondary

7.

Improved topology prediction using the terminal hydrophobic helices rule.

Peters, Christoph; Tsirigos, Konstantinos D; Shu, Nanjiang; Elofsson, Arne.

Bioinformatics ; 32(8): 1158-62, 2016 04 15.

Article in English | MEDLINE | ID: mdl-26644416

ABSTRACT

MOTIVATION: The translocon recognizes sufficiently hydrophobic regions of a protein and inserts them into the membrane. Computational methods try to determine what hydrophobic regions are recognized by the translocon. Although these predictions are quite accurate, many methods still fail to distinguish marginally hydrophobic transmembrane (TM) helices and equally hydrophobic regions in soluble protein domains. In vivo, this problem is most likely avoided by targeting of the TM-proteins, so that non-TM proteins never see the translocon. Proteins are targeted to the translocon by an N-terminal signal peptide. The targeting is also aided by the fact that the N-terminal helix is more hydrophobic than other TM-helices. In addition, we also recently found that the C-terminal helix is more hydrophobic than central helices. This information has not been used in earlier topology predictors. RESULTS: Here, we use the fact that the N- and C-terminal helices are more hydrophobic to develop a new version of the first-principle-based topology predictor, SCAMPI. The new predictor has two main advantages; first, it can be used to efficiently separate membrane and non-membrane proteins directly without the use of an extra prefilter, and second it shows improved performance for predicting the topology of membrane proteins that contain large non-membrane domains. AVAILABILITY AND IMPLEMENTATION: The predictor, a web server and all datasets are available at http://scampi.bioinfo.se/ CONTACT: arne@bioinfo.se SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Hydrophobic and Hydrophilic Interactions , Protein Structure, Secondary , Computational Biology , Forecasting , Membrane Proteins , Protein Sorting Signals

8.

The TOPCONS web server for consensus prediction of membrane protein topology and signal peptides.

Tsirigos, Konstantinos D; Peters, Christoph; Shu, Nanjiang; Käll, Lukas; Elofsson, Arne.

Nucleic Acids Res ; 43(W1): W401-7, 2015 Jul 01.

Article in English | MEDLINE | ID: mdl-25969446

ABSTRACT

TOPCONS (http://topcons.net/) is a widely used web server for consensus prediction of membrane protein topology. We hereby present a major update to the server, with some substantial improvements, including the following: (i) TOPCONS can now efficiently separate signal peptides from transmembrane regions. (ii) The server can now differentiate more successfully between globular and membrane proteins. (iii) The server now is even slightly faster, although a much larger database is used to generate the multiple sequence alignments. For most proteins, the final prediction is produced in a matter of seconds. (iv) The user-friendly interface is retained, with the additional feature of submitting batch files and accessing the server programmatically using standard interfaces, making it thus ideal for proteome-wide analyses. Indicatively, the user can now scan the entire human proteome in a few days. (v) For proteins with homology to a known 3D structure, the homology-inferred topology is also displayed. (vi) Finally, the combination of methods currently implemented achieves an overall increase in performance by 4% as compared to the currently available best-scoring methods and TOPCONS is the only method that can identify signal peptides and still maintain a state-of-the-art performance in topology predictions.

Subject(s)

Membrane Proteins/chemistry , Protein Sorting Signals , Software , Algorithms , Humans , Internet , Protein Conformation , Structural Homology, Protein

9.

Large tilts in transmembrane helices can be induced during tertiary structure formation.

Virkki, Minttu; Boekel, Carolina; Illergård, Kristoffer; Peters, Christoph; Shu, Nanjiang; Tsirigos, Konstantinos D; Elofsson, Arne; von Heijne, Gunnar; Nilsson, IngMarie.

J Mol Biol ; 426(13): 2529-38, 2014 Jun 26.

Article in English | MEDLINE | ID: mdl-24793448

ABSTRACT

While early structural models of helix-bundle integral membrane proteins posited that the transmembrane α-helices [transmembrane helices (TMHs)] were orientated more or less perpendicular to the membrane plane, there is now ample evidence from high-resolution structures that many TMHs have significant tilt angles relative to the membrane. Here, we address the question whether the tilt is an intrinsic property of the TMH in question or if it is imparted on the TMH during folding of the protein. Using a glycosylation mapping technique, we show that four highly tilted helices found in multi-spanning membrane proteins all have much shorter membrane-embedded segments when inserted by themselves into the membrane than seen in the high-resolution structures. This suggests that tilting can be induced by tertiary packing interactions within the protein, subsequent to the initial membrane-insertion step.

Subject(s)

Membrane Proteins/chemistry , Amino Acid Sequence , Databases, Protein , Glycosylation , Membrane Proteins/genetics , Models, Molecular , Molecular Sequence Data , Peptide Mapping , Protein Folding , Protein Structure, Secondary , Protein Structure, Tertiary

10.

KalignP: improved multiple sequence alignments using position specific gap penalties in Kalign2.

Shu, Nanjiang; Elofsson, Arne.

Bioinformatics ; 27(12): 1702-3, 2011 Jun 15.

Article in English | MEDLINE | ID: mdl-21505030

ABSTRACT

SUMMARY: Kalign2 is one of the fastest and most accurate methods for multiple alignments. However, in contrast to other methods Kalign2 does not allow externally supplied position specific gap penalties. Here, we present a modification to Kalign2, KalignP, so that it accepts such penalties. Further, we show that KalignP using position specific gap penalties obtained from predicted secondary structures makes steady improvement over Kalign2 when tested on Balibase 3.0 as well as on a dataset derived from Pfam-A seed alignments. AVAILABILITY AND IMPLEMENTATION: KalignP is freely available at http://kalignp.cbr.su.se. The source code of KalignP is available under the GNU General Public License, Version 2 or later from the same website.

Subject(s)

Sequence Alignment/methods , Sequence Analysis, Protein , Software , Protein Structure, Secondary

11.

A novel method for accurate one-dimensional protein structure prediction based on fragment matching.

Zhou, Tuping; Shu, Nanjiang; Hovmöller, Sven.

Bioinformatics ; 26(4): 470-7, 2010 Feb 15.

Article in English | MEDLINE | ID: mdl-20007252

ABSTRACT

MOTIVATION: The precise prediction of one-dimensional (1D) protein structure as represented by the protein secondary structure and 1D string of discrete states of dihedral angles (i.e. Shape Strings) is a prerequisite for the successful prediction of three-dimensional (3D) structure as well as protein-protein interaction. We have developed a novel 1D structure prediction method, called Frag1D, based on a straightforward fragment matching algorithm and demonstrated its success in the prediction of three sets of 1D structural alphabets, i.e. the classical three-state secondary structure, three- and eight-state Shape Strings. RESULTS: By exploiting the vast protein sequence and protein structure data available, we have brought secondary-structure prediction closer to the expected theoretical limit. When tested by a leave-one-out cross validation on a non-redundant set of PDB cut at 30% sequence identity containing 5860 protein chains, the overall per-residue accuracy for secondary-structure prediction, i.e. Q3, is 82.9%. The overall per-residue accuracies for three- and eight-state Shape Strings are 85.1 and 71.5%, respectively. We have also benchmarked our program against the latest version of PSIPRED for secondary structure prediction and our program predicted 0.3% better in Q3 when tested on 2241 chains with the same training set. For Shape Strings, we compared our method with a recently published method with the same dataset and definition as used by that method. Our program predicted 2.2% better in accuracy for three-state Shape Strings. By quantitatively investigating the effect of database size on 1D structure prediction we show that the accuracy increases by approximately 1% with every doubling of the database size.

Subject(s)

Computational Biology/methods , Protein Structure, Secondary , Proteins/chemistry , Databases, Protein , Models, Molecular , Sequence Analysis, Protein/methods
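
Note on record 11: the abstract reports Q3, the overall per-residue accuracy for three-state secondary structure. A minimal sketch of that measure follows, assuming predicted and observed structures are given as equal-length strings over the states H (helix), E (strand) and C (coil). The function name and the example strings are hypothetical, and the same calculation would apply to Shape Strings with a different alphabet.

```python
def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("strings must not be empty")
    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical six-residue example: five of six states agree.
example_q3 = per_residue_accuracy("HHHEEC", "HHHECC")
```

An overall figure such as the one quoted in the abstract would pool residues across all chains in the test set; averaging per-chain values instead gives a different number when chain lengths vary.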

12.

Describing and comparing protein structures using shape strings.

Shu, Nanjiang; Hovmöller, Sven; Zhou, Tuping.

Curr Protein Pept Sci ; 9(4): 310-24, 2008 Aug.

Article in English | MEDLINE | ID: mdl-18691122

ABSTRACT

Different methods for describing and comparing the structures of the tens of thousands of proteins that have been determined by X-ray crystallography are reviewed. Such comparisons are important for understanding the structures and functions of proteins and facilitating structure prediction, as well as assessing structure prediction methods. We summarize methods in this field emphasizing ways of representing protein structures as one-dimensional geometrical strings. Such strings are based on the shape symbols of clustered regions of phi/psi dihedral angle pairs of the polypeptide backbones as described by the Ramachandran plot. These one-dimensional expressions are as compact as secondary structure descriptions but contain more information in loop regions. They can be used for fast searching for similar structures in databases and for comparing similarities between proteins and between the predicted and native structures.

Subject(s)

Models, Molecular , Protein Conformation , Proteins/chemistry , Amino Acid Sequence , Crystallography, X-Ray , Molecular Sequence Data , Sequence Alignment

13.

Prediction of zinc-binding sites in proteins from sequence.

Shu, Nanjiang; Zhou, Tuping; Hovmöller, Sven.

Bioinformatics ; 24(6): 775-82, 2008 Mar 15.

Article in English | MEDLINE | ID: mdl-18245129

ABSTRACT

MOTIVATION: Motivated by the abundance, importance and unique functionality of zinc, both biologically and physiologically, we have developed an improved method for the prediction of zinc-binding sites in proteins from their amino acid sequences. RESULTS: By combining support vector machine (SVM) and homology-based predictions, our method predicts zinc-binding Cys, His, Asp and Glu with 75% precision (86% for Cys and His only) at 50% recall according to a 5-fold cross-validation on a non-redundant set of protein chains from the Protein Data Bank (PDB) (2727 chains, 235 of which bind zinc). Consequently, our method predicts zinc-binding Cys and His with 10% higher precision at different recall levels compared to a recently published method when tested on the same dataset. AVAILABILITY: The program is available for download at www.fos.su.se/~nanjiang/zincpred/download/

Subject(s)

Algorithms , Metalloproteins/chemistry , Models, Chemical , Models, Molecular , Protein Interaction Mapping/methods , Sequence Analysis, Protein/methods , Zinc/chemistry , Amino Acid Sequence , Binding Sites , Computer Simulation , Molecular Sequence Data , Protein Binding
