Search | VHL Regional Portal

1.

Cancer3D 2.0: interactive analysis of 3D patterns of cancer mutations in cancer subsets.

Sedova, Mayya; Iyer, Mallika; Li, Zhanwen; Jaroszewski, Lukasz; Post, Kai W; Hrabe, Thomas; Porta-Pardo, Eduard; Godzik, Adam.

Nucleic Acids Res ; 47(D1): D895-D899, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30407596

ABSTRACT

Our knowledge of cancer genomics has exploded in the last several years, providing us with detailed knowledge of genetic alterations in almost all cancer types. Analysis of these data has given us new insights into molecular aspects of cancer, the most important being the amazing diversity of molecular abnormalities in individual cancers. The most important question in cancer research today is how to classify this diversity to identify subtypes that are most relevant for treatment and outcome prediction for individual patients. The Cancer3D database at http://www.cancer3d.org gives an open and user-friendly way to analyze cancer missense mutations in the context of structures of proteins they are found in and in relation to patients' clinical data. This approach allows users to find novel candidate driver regions for specific subgroups that often cannot be found when similar analyses are done at the whole-gene level and for large, diverse cohorts. The interactive interface allows users to visualize the distribution of mutations in subgroups defined by cancer type and stage, gender and age brackets, or patient ethnicity, or, vice versa, to find the dominant cancer type, gender or age groups for specific three-dimensional mutation patterns.

Subject(s)

Databases, Protein , Mutation, Missense , Neoplasms/genetics , Protein Conformation , Proteins/genetics , Humans , Protein Domains

2.

Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework.

Glusman, Gustavo; Rose, Peter W; Prlic, Andreas; Dougherty, Jennifer; Duarte, José M; Hoffman, Andrew S; Barton, Geoffrey J; Bendixen, Emøke; Bergquist, Timothy; Bock, Christian; Brunk, Elizabeth; Buljan, Marija; Burley, Stephen K; Cai, Binghuang; Carter, Hannah; Gao, JianJiong; Godzik, Adam; Heuer, Michael; Hicks, Michael; Hrabe, Thomas; Karchin, Rachel; Leman, Julia Koehler; Lane, Lydie; Masica, David L; Mooney, Sean D; Moult, John; Omenn, Gilbert S; Pearl, Frances; Pejaver, Vikas; Reynolds, Sheila M; Rokem, Ariel; Schwede, Torsten; Song, Sicheng; Tilgner, Hagen; Valasatava, Yana; Zhang, Yang; Deutsch, Eric W.

Genome Med ; 9(1): 113, 2017 Dec 18.

Article in English | MEDLINE | ID: mdl-29254494

ABSTRACT

The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods.

Subject(s)

Genome-Wide Association Study/methods , Polymorphism, Genetic , Protein Conformation , Sequence Analysis, Protein/methods , Algorithms , Congresses as Topic , Genome-Wide Association Study/standards , Humans , Sequence Analysis, Protein/standards

3.

Revealing aperiodic aspects of solenoid proteins from sequence information.

Hrabe, Thomas; Jaroszewski, Lukasz; Godzik, Adam.

Bioinformatics ; 32(18): 2776-82, 2016 09 15.

Article in English | MEDLINE | ID: mdl-27334472

ABSTRACT

MOTIVATION: Repeat proteins, which contain multiple repeats of short sequence motifs, form a large but seldom-studied group of proteins. Methods focusing on the analysis of 3D structures of such proteins identified many subtle effects in length distribution of individual motifs that are important for their functions. However, similar analysis has not yet been applied to the vast majority of repeat proteins with unknown 3D structures, mostly because of the extreme diversity of the underlying motifs and the resulting difficulty of detecting them. RESULTS: We developed FAIT, a sequence-based algorithm for the precise assignment of individual repeats in repeat proteins and introduced a framework to classify and compare aperiodicity patterns for large protein families. FAIT extracts repeat positions by post-processing FFAS alignment matrices with image processing methods. On examples of proteins with Leucine Rich Repeat (LRR) domains and other solenoid-like proteins, we show that the automated analysis with FAIT correctly identifies exact lengths of individual repeats based entirely on sequence information. AVAILABILITY AND IMPLEMENTATION: https://github.com/GodzikLab/FAIT CONTACT: adam@godziklab.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms , Proteins , Amino Acid Motifs , Repetitive Sequences, Amino Acid , Sequence Analysis, Protein

4.

PDBFlex: exploring flexibility in protein structures.

Hrabe, Thomas; Li, Zhanwen; Sedova, Mayya; Rotkiewicz, Piotr; Jaroszewski, Lukasz; Godzik, Adam.

Nucleic Acids Res ; 44(D1): D423-8, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26615193

ABSTRACT

The PDBFlex database, available freely and with no login requirements at http://pdbflex.org, provides information on flexibility of protein structures as revealed by the analysis of variations between depositions of different structural models of the same protein in the Protein Data Bank (PDB). PDBFlex collects information on all instances of such depositions, identifying them by a 95% sequence identity threshold, performs analysis of their structural differences and clusters them according to their structural similarities for easy analysis. PDBFlex contains tools and viewers enabling in-depth examination of structural variability, including: 2D-scaling visualization of RMSD distances between structures of the same protein, graphs of average local RMSD in the aligned structures of protein chains, graphical presentation of differences in secondary structure and observed structural disorder (unresolved residues), difference distance maps between all sets of coordinates, and 3D views of individual structures and simulated transitions between different conformations, the latter displayed using JSMol visualization software.

Subject(s)

Databases, Protein , Protein Conformation , Ligands , Models, Molecular
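
The RMSD distances mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the structural models are already superposed and residue-aligned, and it is not taken from the PDBFlex code base; the function names `rmsd` and `pairwise_rmsd` are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumption: both structures are already superposed and residue-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


def pairwise_rmsd(models):
    """Symmetric matrix of RMSD values between all pairs of models."""
    n = len(models)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(models[i], models[j])
    return matrix
```

A matrix like this is the kind of input that a 2D-scaling plot or a clustering step would consume; the superposition step itself is deliberately left out of the sketch.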

5.

A Pan-Cancer Catalogue of Cancer Driver Protein Interaction Interfaces.

Porta-Pardo, Eduard; Garcia-Alonso, Luz; Hrabe, Thomas; Dopazo, Joaquin; Godzik, Adam.

PLoS Comput Biol ; 11(10): e1004518, 2015 Oct.

Article in English | MEDLINE | ID: mdl-26485003

ABSTRACT

Despite their importance in maintaining the integrity of all cellular pathways, the role of mutations on protein-protein interaction (PPI) interfaces as cancer drivers has not been systematically studied. Here we analyzed the mutation patterns of the PPI interfaces from 10,028 proteins in a pan-cancer cohort of 5,989 tumors from 23 projects of The Cancer Genome Atlas (TCGA) to find interfaces enriched in somatic missense mutations. To that end we use e-Driver, an algorithm to analyze the mutation distribution of specific protein functional regions. We identified 103 PPI interfaces enriched in somatic cancer mutations. 32 of these interfaces are found in proteins coded by known cancer driver genes. The remaining 71 interfaces are found in proteins that have not been previously identified as cancer drivers, even though, in most cases, there is an extensive literature suggesting they play an important role in cancer. Finally, we integrate these findings with clinical information to show how tumors apparently driven by the same gene have different behaviors, including patient outcomes, depending on which specific interfaces are mutated.

Subject(s)

DNA Mutational Analysis/methods , Neoplasm Proteins/genetics , Neoplasms/genetics , Polymorphism, Single Nucleotide/genetics , Protein Interaction Mapping/methods , Signal Transduction/genetics , Animals , Base Sequence , Biomarkers, Tumor/genetics , Catalogs as Topic , Chromosome Mapping , Computer Simulation , Genetic Predisposition to Disease/genetics , Humans , Models, Genetic , Molecular Sequence Data , Mutation/genetics
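
The idea of testing whether a protein region is enriched in mutations can be sketched as follows. This is an illustrative sketch only: it assumes a one-sided binomial test in which the chance of a mutation landing in a region is proportional to the region's share of the protein length. The abstract does not specify the statistic, so this should not be read as the e-Driver implementation, and the function names are hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def region_enrichment_pvalue(mutations_in_region, total_mutations,
                             region_length, protein_length):
    """One-sided p-value that a region holds more mutations than its length predicts.

    Assumption: under the null model each mutation falls in the region with
    probability region_length / protein_length.
    """
    p_region = region_length / protein_length
    return binomial_tail(mutations_in_region, total_mutations, p_region)
```

When many regions are screened at once, as in a pan-cancer analysis, the resulting p-values would additionally need a multiple-testing correction, which this sketch omits.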

6.

Localize.pytom: a modern webserver for cryo-electron tomography.

Hrabe, Thomas.

Nucleic Acids Res ; 43(W1): W231-6, 2015 Jul 01.

Article in English | MEDLINE | ID: mdl-25934806

ABSTRACT

Localize.pytom, available through http://localize.pytom.org, is a webserver for the localize module in the PyTom package. It is a free website and open to all users and there is no login requirement. The server accepts tomograms as they are imaged and reconstructed by Cryo-Electron Tomography (CET) and returns densities and coordinates of candidate macromolecules in the tomogram. Localization of macromolecules in cryo-electron tomograms is one of the key procedures to unravel structural features of imaged macromolecules. Positions of localized molecules are further used for structural analysis by single particle procedures such as fine alignment, averaging and classification. Accurate localization can be furthermore used to generate molecular atlases of whole cells. Localization uses a cross-correlation-based score and requires a reference volume as input. A reference can either be a previously detected macromolecular structure or extrapolated on the server from a specific PDB chain. Users have the option to use either coarse or fine angular sampling strategies based on uniformly distributed rotations and to accurately compensate for the CET common 'Missing Wedge' artefact during sampling. After completion, all candidate macromolecules cut out from the tomogram are available for download. Their coordinates are stored and available in XML format, which can be easily integrated into successive analysis steps in other software. A pre-computed average of the first one hundred macromolecules is also available for immediate download, and the user has the option to further analyse the average, based on the detected score distribution in a novel web-density viewer.

Subject(s)

Electron Microscope Tomography/methods , Macromolecular Substances/chemistry , Software , Cryoelectron Microscopy , Imaging, Three-Dimensional , Internet , Macromolecular Substances/ultrastructure

7.

Cancer3D: understanding cancer mutations through protein structures.

Porta-Pardo, Eduard; Hrabe, Thomas; Godzik, Adam.

Nucleic Acids Res ; 43(Database issue): D968-73, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25392415

ABSTRACT

The new era of cancer genomics is providing us with extensive knowledge of mutations and other alterations in cancer. The Cancer3D database at http://www.cancer3d.org gives an open and user-friendly way to analyze cancer missense mutations in the context of structures of proteins in which they are found. The database also helps users analyze the distribution patterns of the mutations as well as their relationship to changes in drug activity through two algorithms: e-Driver and e-Drug. These algorithms use knowledge of modular structure of genes and proteins to separately study each region. This approach allows users to find novel candidate driver regions or drug biomarkers that cannot be found when similar analyses are done on the whole-gene level. The Cancer3D database provides access to the results of such analyses based on data from The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia (CCLE). In addition, it displays mutations from over 14,700 proteins mapped to more than 24,300 structures from PDB. This helps users visualize the distribution of mutations and identify novel three-dimensional patterns in their distribution.

Subject(s)

Databases, Protein , Mutation, Missense , Neoplasm Proteins/chemistry , Neoplasm Proteins/genetics , Antineoplastic Agents/pharmacology , Biomarkers, Tumor/analysis , Internet , Protein Conformation , Protein Isoforms/genetics , Protein Isoforms/metabolism

8.

POSA: a user-driven, interactive multiple protein structure alignment server.

Li, Zhanwen; Natarajan, Padmaja; Ye, Yuzhen; Hrabe, Thomas; Godzik, Adam.

Nucleic Acids Res ; 42(Web Server issue): W240-5, 2014 Jul.

Article in English | MEDLINE | ID: mdl-24838569

ABSTRACT

POSA (Partial Order Structure Alignment), available at http://posa.godziklab.org, is a server for multiple protein structure alignment introduced in 2005 (Ye,Y. and Godzik,A. (2005) Multiple flexible structure alignment using partial order graphs. Bioinformatics, 21, 2362-2369). It is free and open to all users, and there is no login requirement, albeit there is an option to register and store results in individual, password-protected directories. In the updated POSA server described here, we introduce two significant improvements. First is an interface allowing the user to provide additional information by defining segments that anchor the alignment in one or more input structures. This interface allows users to take advantage of their intuition and biological insights to improve the alignment and guide it toward a biologically relevant solution. The second improvement is an interactive visualization with options that allow the user to view all superposed structures in one window (a typical solution for visualizing results of multiple structure alignments) or view them individually in a series of synchronized windows with extensive, user-controlled visualization options. The user can rotate structure(s) in any of the windows and study similarities or differences between structures clearly visible in individual windows.

Subject(s)

Software , Structural Homology, Protein , Algorithms , Internet , User-Computer Interface

9.

ConSole: using modularity of contact maps to locate solenoid domains in protein structures.

Hrabe, Thomas; Godzik, Adam.

BMC Bioinformatics ; 15: 119, 2014 Apr 27.

Article in English | MEDLINE | ID: mdl-24766872

ABSTRACT

BACKGROUND: Periodic proteins, characterized by the presence of multiple repeats of short motifs, form an interesting and seldom-studied group. Due to often extreme divergence in sequence, detection and analysis of such motifs is performed more reliably on the structural level. Yet, few algorithms have been developed for the detection and analysis of structures of periodic proteins. RESULTS: ConSole recognizes modularity in protein contact maps, allowing for precise identification of repeats in solenoid protein structures, an important subgroup of periodic proteins. Tests on benchmarks show that ConSole has higher recognition accuracy as compared to Raphael, the only other publicly available solenoid structure detection tool. As a next step of ConSole analysis, we show how detection of solenoid repeats in structures can be used to improve sequence recognition of these motifs and to detect subtle irregularities of repeat lengths in three solenoid protein families. CONCLUSIONS: The ConSole algorithm provides a fast and accurate tool to recognize solenoid protein structures as a whole and to identify individual solenoid repeat units from a structure. ConSole is available as a web-based, interactive server and is available for download at http://console.sanfordburnham.org.

Subject(s)

Protein Structure, Tertiary , Software , Algorithms , Leucine-Rich Repeat Proteins , Protein Conformation , Proteins/chemistry , Repetitive Sequences, Amino Acid

10.

Fast and accurate reference-free alignment of subtomograms.

Chen, Yuxiang; Pfeffer, Stefan; Hrabe, Thomas; Schuller, Jan Michael; Förster, Friedrich.

J Struct Biol ; 182(3): 235-45, 2013 Jun.

Article in English | MEDLINE | ID: mdl-23523719

ABSTRACT

In cryoelectron tomography, alignment and averaging of subtomograms, each depicting the same macromolecule, improves the resolution compared to the individual subtomogram. Major challenges of subtomogram alignment are noise enhancement due to overfitting, the bias of an initial reference in the iterative alignment process, and the computational cost of processing increasingly large amounts of data. Here, we propose an efficient and accurate alignment algorithm via a generalized convolution theorem, which allows computation of a constrained correlation function using spherical harmonics. This formulation increases computational speed of rotational matching dramatically compared to rotation search in Cartesian space without sacrificing accuracy in contrast to other spherical harmonic based approaches. Using this sampling method, a reference-free alignment procedure is proposed to tackle reference bias and overfitting, which also includes contrast transfer function correction by Wiener filtering. Application of the method to simulated data allowed us to obtain resolutions near the ground truth. For two experimental datasets, ribosomes from yeast lysate and purified 20S proteasomes, we achieved reconstructions of approximately 20Å and 16Å, respectively. The software is ready-to-use and made public to the community.

Subject(s)

Cryoelectron Microscopy/methods , Electron Microscope Tomography , Image Processing, Computer-Assisted , Algorithms , Imaging, Three-Dimensional , Proteasome Endopeptidase Complex/ultrastructure , Ribosomes/ultrastructure , Software , Yeasts/ultrastructure

11.

Structure and 3D arrangement of endoplasmic reticulum membrane-associated ribosomes.

Pfeffer, Stefan; Brandt, Florian; Hrabe, Thomas; Lang, Sven; Eibauer, Matthias; Zimmermann, Richard; Förster, Friedrich.

Structure ; 20(9): 1508-18, 2012 Sep 05.

Article in English | MEDLINE | ID: mdl-22819217

ABSTRACT

In eukaryotic cells, cotranslational protein translocation across the endoplasmic reticulum (ER) membrane requires an elaborate macromolecular machinery. While structural details of ribosomes bound to purified and solubilized constituents of the translocon have been elucidated in recent years, little structural knowledge of ribosomes bound to the complete ER protein translocation machinery in a native membrane environment exists. Here, we used cryoelectron tomography to provide a three-dimensional reconstruction of 80S ribosomes attached to functional canine pancreatic ER microsomes in situ. In the resulting subtomogram average at 31 Å resolution, we observe direct contact of ribosomal expansion segment ES27L and the membrane and distinguish several membrane-embedded and lumenal complexes, including Sec61, the TRAP complex and another large complex protruding 90 Å into the lumen. Membrane-associated ribosomes adopt a preferred three-dimensional arrangement that is likely specific for ER-associated polyribosomes and may explain the high translation efficiency of ER-associated ribosomes compared to their cytosolic counterparts.

Subject(s)

Endoplasmic Reticulum, Rough/ultrastructure , Intracellular Membranes/ultrastructure , Ribosomes/ultrastructure , Animals , Cryoelectron Microscopy , Dogs , Electron Microscope Tomography , Microsomes/ultrastructure , Models, Molecular , Pancreas/cytology

12.

PyTom: a python-based toolbox for localization of macromolecules in cryo-electron tomograms and subtomogram analysis.

Hrabe, Thomas; Chen, Yuxiang; Pfeffer, Stefan; Cuellar, Luis Kuhn; Mangold, Ann-Victoria; Förster, Friedrich.

J Struct Biol ; 178(2): 177-88, 2012 May.

Article in English | MEDLINE | ID: mdl-22193517

ABSTRACT

Cryo-electron tomography (CET) is a three-dimensional imaging technique for structural studies of macromolecules under close-to-native conditions. In-depth analysis of macromolecule populations depicted in tomograms requires identification of subtomograms corresponding to putative particles, averaging of subtomograms to enhance their signal, and classification to capture the structural variations among them. Here, we introduce the open-source platform PyTom that unifies standard tomogram processing steps in a python toolbox. For subtomogram averaging, we implemented an adaptive adjustment of scoring and sampling that clearly improves the resolution of averages compared to static strategies. Furthermore, we present a novel stochastic classification method that yields significantly more accurate classification results than two deterministic approaches in simulations. We demonstrate that the PyTom workflow yields faithful results for alignment and classification of simulated and experimental subtomograms of ribosomes and GroEL(14)/GroEL(14)GroES(7), respectively, as well as for the analysis of ribosomal 60S subunits in yeast cell lysate. PyTom enables parallelized processing of large numbers of tomograms, but also provides a convenient, sustainable environment for algorithmic development.

Subject(s)

Cryoelectron Microscopy/methods , Electron Microscope Tomography/methods , Image Processing, Computer-Assisted/methods , Macromolecular Substances , Animals , Imaging, Three-Dimensional/methods

13.

Size distribution of native cytosolic proteins of Thermoplasma acidophilum.

Sun, Na; Tamura, Noriko; Tamura, Tomohiro; Knispel, Roland Wilhelm; Hrabe, Thomas; Kofler, Christine; Nickell, Stephan; Nagy, István.

Proteomics ; 9(14): 3783-6, 2009 Jul.

Article in English | MEDLINE | ID: mdl-19639595

ABSTRACT

We used molecular sieve chromatography in combination with LC-MS/MS to identify protein complexes that can serve as templates in the template matching procedures of visual proteomics approaches. By this method the sample complexity was lowered sufficiently to identify 464 proteins and - on the basis of size distribution and bioinformatics analysis - 189 of them could be assigned as subunits of macromolecular complexes over the size of 300 kDa. From these we purified six stable complexes of Thermoplasma acidophilum whose size and subunit composition - analyzed by electron microscopy and MALDI-TOF-MS, respectively - verified the accuracy of our method.

Subject(s)

Archaeal Proteins/metabolism , Cytosol/metabolism , Thermoplasma/metabolism , Chromatography, Gel , Chromatography, Liquid , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Tandem Mass Spectrometry
