Search | VHL Regional Portal

DAMA: a method for computing multiple alignments of protein structures using local structure descriptors.

Daniluk, Pawel; Oleniecki, Tymoteusz; Lesyng, Bogdan.

Bioinformatics ; 38(1): 80-85, 2021 Dec 22.

Article in English | MEDLINE | ID: mdl-34396393

ABSTRACT

MOTIVATION: The well-known fact that protein structures are more conserved than their sequences forms the basis of several areas of computational structural biology. Methods based on structure analysis provide more complete information on residue conservation in evolutionary processes. This is crucial for the determination of evolutionary relationships between proteins and for the identification of recurrent structural patterns present in biomolecules involved in similar functions. However, algorithmic structural alignment is much more difficult than multiple sequence alignment. This study is devoted to the development and applications of DAMA, a novel and effective environment capable of computing and analyzing multiple structure alignments. RESULTS: DAMA is based on local structural similarities, using local 3D structure descriptors, and thus accounts for nearest-neighbor molecular environments of aligned residues. It is constrained neither by protein topology nor by its global structure. DAMA is an extension of our previous study (DEDAL), which demonstrated the applicability of local descriptors to pairwise alignment problems. Since the multiple alignment problem is NP-complete, an effective heuristic approach has been developed without imposing any artificial constraints. The alignment algorithm searches for the largest consistent ensemble of similar descriptors. The new method is capable of capturing most of the biologically significant similarities present in canonical test sets and is discriminatory enough to prevent the emergence of larger, but meaningless, solutions. Tests performed on the test sets, including protein kinases, demonstrate DAMA's capability of identifying equivalent residues, which should be very useful in discovering the biological nature of protein similarity. Performance profiles show the advantage of DAMA over other methods, in particular when using a strict similarity measure QC, which is the ratio of correctly aligned columns, and when applying the methods to more difficult cases. AVAILABILITY AND IMPLEMENTATION: DAMA is available online at http://dworkowa.imdik.pan.pl/EP/DAMA. Linux binaries of the software are available upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
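As an illustration of the QC measure mentioned in this abstract, the following minimal Python sketch computes a ratio of correctly aligned columns. It rests on assumptions not stated in the abstract: an alignment is represented as a list of columns, each column is a tuple of residue indices (None for a gap), a column counts as correct only if it matches a reference column exactly, and the ratio is taken over the reference columns. The name `column_score` is hypothetical and is not part of DAMA.

```python
from typing import Optional, Sequence, Tuple

# One alignment column: one residue index per structure, None marks a gap.
Column = Tuple[Optional[int], ...]


def column_score(reference: Sequence[Column], predicted: Sequence[Column]) -> float:
    """Fraction of reference columns reproduced exactly in the predicted alignment.

    Assumption: the denominator is the number of reference columns.
    """
    if not reference:
        raise ValueError("reference alignment must contain at least one column")
    predicted_columns = set(predicted)
    correct = sum(1 for column in reference if column in predicted_columns)
    return correct / len(reference)


# Three structures, four reference columns; the predicted alignment
# misplaces one residue of the third structure in the third column.
reference_alignment = [(1, 1, 1), (2, 2, 2), (3, 3, 4), (4, 4, 5)]
predicted_alignment = [(1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 5)]
qc = column_score(reference_alignment, predicted_alignment)
```

Because a single misplaced residue invalidates the whole column, a score of this kind is stricter than pairwise measures, which is consistent with the abstract describing QC as a strict similarity measure.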

Subject(s)

Proteins , Software , Proteins/chemistry , Algorithms , Sequence Alignment , Computational Biology/methods

ResiCon: a method for the identification of dynamic domains, hinges and interfacial regions in proteins.

Dziubinski, Maciej; Daniluk, Pawel; Lesyng, Bogdan.

Bioinformatics ; 32(1): 25-34, 2016 Jan 01.

Article in English | MEDLINE | ID: mdl-26342233

ABSTRACT

MOTIVATION: The structure of most proteins is flexible. Identification and analysis of intramolecular motions is a complex problem. Breaking a structure into relatively rigid parts, the so-called dynamic domains, may help comprehend the complexity of a protein's mobility. We propose a new approach called ResiCon (Residue Contacts analysis), which performs this task by applying a data-mining analysis of an ensemble of protein configurations and recognizes dynamic domains, hinges and interfacial regions by considering contacts between residues. RESULTS: Dynamic domains found by ResiCon are more compact than those identified by two other popular methods: PiSQRD and GeoStaS. The current analysis was carried out using a known reference set of 30 NMR protein structures, as well as molecular dynamics simulation data of flap opening events in HIV-1 protease. The more detailed analysis of the HIV-1 protease dataset shows that ResiCon identified dynamic domains involved in structural changes of functional importance. AVAILABILITY AND IMPLEMENTATION: The ResiCon server is available at URL: http://dworkowa.imdik.pan.pl/EP/ResiCon. CONTACT: pawel@bioexploratorium.pl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Computational Biology/methods , HIV Protease/chemistry , Software , Algorithms , Cluster Analysis , HIV-1/enzymology , Magnetic Resonance Spectroscopy , Molecular Dynamics Simulation , Protein Structure, Tertiary , Stochastic Processes

WeBIAS: a web server for publishing bioinformatics applications.

Daniluk, Pawel; Wilczynski, Bartek; Lesyng, Bogdan.

BMC Res Notes ; 8: 628, 2015 Nov 02.

Article in English | MEDLINE | ID: mdl-26526344

ABSTRACT

BACKGROUND: One of the requirements for a successful scientific tool is its availability. Developing a functional web service, however, is usually considered a mundane and ungratifying task, and is quite often neglected. When publishing bioinformatic applications, such an attitude puts an additional burden on the reviewers, who have to cope with poorly designed interfaces in order to assess the quality of the presented methods, and it impairs actual usefulness to the scientific community at large. RESULTS: In this note we present WeBIAS, a simple, self-contained solution to make command-line programs accessible through web forms. It comprises a web portal capable of serving several applications and backend schedulers which carry out computations. The server handles user registration and authentication, stores queries and results, and provides a convenient administrator interface. WeBIAS is implemented in Python and available under the GNU Affero General Public License. It has been developed and tested on GNU/Linux compatible platforms covering a vast majority of operational WWW servers. Since it is written in pure Python, it should also be easy to deploy on all other platforms supporting Python (e.g. Windows, Mac OS X). Documentation and source code, as well as a demonstration site, are available at http://bioinfo.imdik.pan.pl/webias. CONCLUSIONS: WeBIAS has been designed specifically with ease of installation and deployment of services in mind. Setting up a simple application requires minimal effort, yet it is possible to create visually appealing, feature-rich interfaces for query submission and presentation of results.

Subject(s)

Computational Biology/methods , Internet , Publishing , Software , Databases, Factual , Documentation/methods , Information Storage and Retrieval/methods , Reproducibility of Results , User-Computer Interface

A novel method to compare protein structures using local descriptors.

Daniluk, Pawel; Lesyng, Bogdan.

BMC Bioinformatics ; 12: 344, 2011 Aug 17.

Article in English | MEDLINE | ID: mdl-21849047

ABSTRACT

BACKGROUND: Protein structure comparison is one of the most widely performed tasks in bioinformatics. However, currently used methods have problems with the so-called "difficult similarities", including considerable shifts and distortions of structure, sequential swaps and circular permutations. There is a demand for efficient and automated systems capable of overcoming these difficulties, which may lead to the discovery of previously unknown structural relationships. RESULTS: We present a novel method for protein structure comparison based on the formalism of local descriptors of protein structure: DEscriptor Defined Alignment (DEDAL). Local similarities identified by pairs of similar descriptors are extended into global structural alignments. We demonstrate the method's capability by aligning structures in difficult benchmark sets: curated alignments in the SISYPHUS database, as well as SISY and RIPC sets, including non-sequential and non-rigid-body alignments. On the most difficult RIPC set of sequence alignment pairs the method achieves an accuracy of 77% (the second best method tested achieves 60% accuracy). CONCLUSIONS: DEDAL is fast enough to be used in whole proteome applications, and by lowering the threshold of detectable structure similarity it may shed additional light on molecular evolution processes. It is well suited to improving automatic classification of structure domains, helping analyze protein fold space, or to improving protein classification schemes. DEDAL is available online at http://bioexploratorium.pl/EP/DEDAL.

Subject(s)

Algorithms , Computational Biology/methods , Proteins/chemistry , Structural Homology, Protein , Animals , Bacterial Proteins/chemistry , Carrier Proteins/chemistry , GTP Phosphohydrolases/chemistry , Humans , Models, Molecular , Saposins/chemistry

Protein structure prediction center in CASP8.

Kryshtafovych, Andriy; Krysko, Oleh; Daniluk, Pawel; Dmytriv, Zinovii; Fidelis, Krzysztof.

Proteins ; 77 Suppl 9: 5-9, 2009.

Article in English | MEDLINE | ID: mdl-19722263

ABSTRACT

We present an outline of the Critical Assessment of Protein Structure Prediction (CASP) infrastructure implemented at the University of California, Davis, Protein Structure Prediction Center. The infrastructure supports selection and validation of prediction targets, collection of predictions, standard evaluation of submitted predictions, and presentation of results. The Center also supports information exchange relating to CASP experiments and structure prediction in general. Technical aspects of conducting the CASP8 experiment and relevant statistics are also provided.

Subject(s)

Computational Biology/methods , Proteins/chemistry , Databases, Protein , Models, Molecular , Protein Conformation , Software

Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue-residue contacts.

Björkholm, Patrik; Daniluk, Pawel; Kryshtafovych, Andriy; Fidelis, Krzysztof; Andersson, Robin; Hvidsten, Torgeir R.

Bioinformatics ; 25(10): 1264-70, 2009 May 15.

Article in English | MEDLINE | ID: mdl-19289446

ABSTRACT

MOTIVATION: Correct prediction of residue-residue contacts in proteins that lack good templates with known structure would take ab initio protein structure prediction a large step forward. The lack of correct contacts, and in particular long-range contacts, is considered the main reason why these methods often fail. RESULTS: We propose a novel hidden Markov model (HMM)-based method for predicting residue-residue contacts from protein sequences using as training data homologous sequences, predicted secondary structure and a library of local neighborhoods (local descriptors of protein structure). The library consists of recurring structural entities incorporating short-, medium- and long-range interactions and is general enough to reassemble the cores of nearly all proteins in the PDB. The method is tested on an external test set of 606 domains with no significant sequence similarity to the training set as well as 151 domains with SCOP folds not present in the training set. Considering the top 0.2 × L predictions (L = sequence length), our HMMs obtained an accuracy of 22.8% for long-range interactions in new fold targets, and an average accuracy of 28.6% for long-, medium- and short-range contacts. This is a significant performance increase over currently available methods when comparing against results published in the literature. AVAILABILITY: http://predictioncenter.org/Services/FragHMMent/.
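The "top 0.2 × L" evaluation quoted in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: predictions are given as ((i, j), score) pairs with i < j, higher scores mean higher confidence, accuracy is the fraction of the selected predictions that are true contacts, and the function name `top_fraction_accuracy` is hypothetical.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[int, int]  # residue index pair (i, j), assumed to satisfy i < j


def top_fraction_accuracy(
    scored_pairs: Iterable[Tuple[Pair, float]],
    true_contacts: Set[Pair],
    seq_len: int,
    fraction: float = 0.2,
) -> float:
    """Accuracy among the highest-scoring fraction * seq_len predicted contacts."""
    n_top = max(1, int(fraction * seq_len))  # assumption: evaluate at least one prediction
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:n_top]
    if not ranked:
        raise ValueError("no predictions to evaluate")
    hits = sum(1 for pair, _score in ranked if pair in true_contacts)
    return hits / len(ranked)


# Toy example: L = 10, so the two best-scoring predictions are evaluated.
predictions = [((1, 9), 0.9), ((2, 8), 0.8), ((3, 7), 0.1)]
contacts = {(1, 9), (3, 7)}
accuracy = top_fraction_accuracy(predictions, contacts, seq_len=10)
```

Scaling the number of evaluated predictions with sequence length keeps the measure comparable between short and long domains.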

Subject(s)

Computational Biology/methods , Markov Chains , Proteins/chemistry , Databases, Protein , Models, Molecular , Protein Folding , Protein Structure, Secondary

Interaction model based on local protein substructures generalizes to the entire structural enzyme-ligand space.

Strömbergsson, Helena; Daniluk, Pawel; Kryshtafovych, Andriy; Fidelis, Krzysztof; Wikberg, Jarl E S; Kleywegt, Gerard J; Hvidsten, Torgeir R.

J Chem Inf Model ; 48(11): 2278-88, 2008 Nov.

Article in English | MEDLINE | ID: mdl-18937438

ABSTRACT

Chemogenomics is a new strategy in in silico drug discovery, where the ultimate goal is to understand molecular recognition for all molecules interacting with all proteins in the proteome. To study such cross interactions, methods that can generalize over proteins that vary greatly in sequence, structure, and function are needed. We present a general quantitative approach to protein-ligand binding affinity prediction that spans the entire structural enzyme-ligand space. The model was trained on a data set composed of all available enzymes cocrystallized with druglike ligands, taken from four publicly available interaction databases, for which a crystal structure is available. Each enzyme was characterized by a set of local descriptors of protein structure that describe the binding site of the cocrystallized ligand. The ligands in the training set were described by traditional QSAR descriptors. To evaluate the model, a comprehensive test set consisting of enzyme structures and ligands was manually curated. The test set contained enzyme-ligand complexes for which no crystal structures were available, and thus the binding modes were unknown. The test set enzymes were therefore characterized by matching their entire structures to the local descriptor library constructed from the training set. Both the training and the test set contained enzyme-ligand complexes from all major enzyme classes, and the enzymes spanned a large range of sequences and folds. The experimental binding affinities (pKi) ranged from 0.5 to 11.9 (0.7-11.0 in the test set). The induced model predicted the binding affinities of the external test set enzyme-ligand complexes with an r² of 0.53 and an RMSEP of 1.5. This demonstrates that the use of local descriptors makes it possible to create rough predictive models that can generalize over a wide range of protein targets.
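For readers unfamiliar with the two statistics quoted in this abstract, the following minimal Python sketch shows how they are commonly computed on an external test set. It assumes that RMSEP is the root mean square error of prediction and that the reported r² is the squared Pearson correlation between observed and predicted affinities; the abstract does not say which definition of r² was used, and the function names and toy values are hypothetical.

```python
import math
from typing import Sequence


def rmsep(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error of prediction over paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def squared_pearson(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation (one common definition of r^2)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov ** 2 / (var_o * var_p)


# Toy pKi values: every prediction is off by exactly one unit.
observed_pki = [1.0, 2.0, 3.0, 4.0]
predicted_pki = [2.0, 3.0, 4.0, 5.0]
error = rmsep(observed_pki, predicted_pki)
correlation = squared_pearson(observed_pki, predicted_pki)
```

The toy data show why both numbers are usually reported together: a constant offset leaves the correlation perfect while the prediction error remains one pKi unit.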

Subject(s)

Enzymes/chemistry , Models, Molecular , Animals , Artificial Intelligence , Cluster Analysis , Computer Simulation , Databases, Protein , Dihydroorotate Dehydrogenase , Drug Discovery , Enzymes/metabolism , Informatics , Kinetics , Ligands , Molecular Structure , Oxidoreductases Acting on CH-CH Group Donors/chemistry , Oxidoreductases Acting on CH-CH Group Donors/metabolism , Oxidoreductases Acting on CH-NH Group Donors/chemistry , Oxidoreductases Acting on CH-NH Group Donors/metabolism , Plasmodium falciparum/enzymology , Protein Conformation , Zea mays/enzymology , Polyamine Oxidase

New tools and expanded data analysis capabilities at the Protein Structure Prediction Center.

Kryshtafovych, Andriy; Prlic, Andreas; Dmytriv, Zinoviy; Daniluk, Pawel; Milostan, Maciej; Eyrich, Volker; Hubbard, Tim; Fidelis, Krzysztof.

Proteins ; 69 Suppl 8: 19-26, 2007.

Article in English | MEDLINE | ID: mdl-17705273

ABSTRACT

We outline the main tasks performed by the Protein Structure Prediction Center in support of the CASP7 experiment and provide a brief review of the major measures used in the automatic evaluation of predictions. We describe in more detail the software developed to facilitate analysis of modeling success over and beyond the available templates and the adopted Java-based tool enabling visualization of multiple structural superpositions between target and several models/templates. We also give an overview of the CASP infrastructure provided by the Center and discuss the organization of the results web pages available through http://predictioncenter.org.

Subject(s)

Computational Biology/methods , Protein Conformation , Software , Internet , Models, Molecular , Protein Folding , Proteins/chemistry , Structure-Activity Relationship

CASP6 data processing and automatic evaluation at the protein structure prediction center.

Kryshtafovych, Andriy; Milostan, Maciej; Szajkowski, Lukasz; Daniluk, Pawel; Fidelis, Krzysztof.

Proteins ; 61 Suppl 7: 19-23, 2005.

Article in English | MEDLINE | ID: mdl-16187343

ABSTRACT

We present a short overview of the system governing data processing and automatic evaluation of predictions in CASP6, implemented at the Livermore Protein Structure Prediction Center. The system incorporates interrelated facilities for registering participants, collecting prediction targets from crystallographers and NMR spectroscopists and making them available to the CASP6 participants, accepting predictions and providing their preliminary evaluation, and finally, storing and visualizing results. We have automatically evaluated predictions submitted to CASP6 using criteria and methods developed over the successive CASP experiments. Also, we have tested a new evaluation technique based on non-rigid-body type superpositions. Approximately the same number of predictions has been submitted to CASP6 as to all previous CASPs combined, making navigation through and understanding of the data particularly challenging. To facilitate this, we have substantially modernized all data handling procedures, including implementation of a dedicated relational database. An overview of our redesigned website is also presented (http://predictioncenter.org/casp6/).

Subject(s)

Computational Biology/methods , Proteins/chemistry , Proteomics/methods , Algorithms , Automation , Crystallography, X-Ray , Internet , Magnetic Resonance Spectroscopy , Models, Molecular , Protein Conformation , Protein Folding , Protein Structure, Secondary , Protein Structure, Tertiary , Software
