Results 1 - 12 of 12
1.
Cancer Cell ; 41(8): 1397-1406, 2023 08 14.
Article in English | MEDLINE | ID: mdl-37582339

ABSTRACT

The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) investigates tumors from a proteogenomic perspective, creating rich multi-omics datasets connecting genomic aberrations to cancer phenotypes. To facilitate pan-cancer investigations, we have generated harmonized genomic, transcriptomic, proteomic, and clinical data for >1000 tumors in 10 cohorts to create a cohesive and powerful dataset for scientific discovery. We outline efforts by the CPTAC pan-cancer working group in data harmonization, data dissemination, and computational resources for aiding biological discoveries. We also discuss challenges for multi-omics data integration and analysis, specifically the unique challenges of working with both nucleotide sequencing and mass spectrometry proteomics data.


Subject(s)
Neoplasms , Proteogenomics , Humans , Proteomics , Genomics , Neoplasms/genetics , Gene Expression Profiling
2.
Methods Mol Biol ; 2199: 209-236, 2021.
Article in English | MEDLINE | ID: mdl-33125653

ABSTRACT

Efficient and comprehensive data management is an indispensable component of modern scientific research and requires effective tools for all but the most trivial experiments. The LabDB system developed and used in our laboratory was originally designed to track the progress of a structure determination pipeline in several large National Institutes of Health (NIH) projects. While initially designed for structural biology experiments, its modular nature makes it easily applied in laboratories of various sizes in many experimental fields. Over many years, LabDB has transformed into a sophisticated system integrating a range of biochemical, biophysical, and crystallographic experimental data, which harvests data both directly from laboratory instruments and through human input via a web interface. The core module of the system handles many types of universal laboratory management data, such as laboratory personnel, chemical inventories, storage locations, and custom stock solutions. LabDB also tracks various biochemical experiments, including spectrophotometric and fluorescent assays, thermal shift assays, isothermal titration calorimetry experiments, and more. LabDB has been used to manage data for experiments that resulted in over 1200 deposits to the Protein Data Bank (PDB); the system is currently used by the Center for Structural Genomics of Infectious Diseases (CSGID) and several large laboratories. This chapter also provides examples of data mining analyses and warnings about incomplete and inconsistent experimental data. These features, together with its capabilities for detailed tracking, analysis, and auditing of experimental data, make the described system uniquely suited to inspect potential sources of irreproducibility in life sciences research.


Subject(s)
Computational Biology , Database Management Systems , Databases, Protein , Humans , Reproducibility of Results
3.
BMC Psychiatry ; 19(1): 221, 2019 Jul 16.
Article in English | MEDLINE | ID: mdl-31311510

ABSTRACT

Following publication of the original article [1], we have been notified that some important information was omitted by the authors from the Competing interests section. The declaration should read as below.

4.
BMC Evol Biol ; 18(1): 199, 2018 12 22.
Article in English | MEDLINE | ID: mdl-30577795

ABSTRACT

BACKGROUND: The family of D-isomer specific 2-hydroxyacid dehydrogenases (2HADHs) contains a wide range of oxidoreductases with various metabolic roles as well as biotechnological applications. Despite a vast amount of biochemical and structural data for various representatives of the family, the long and complex evolution and broad sequence diversity hinder functional annotations for uncharacterized members. RESULTS: We report an in-depth phylogenetic analysis, followed by mapping of available biochemical and structural data on the reconstructed phylogenetic tree. The analysis suggests that some subfamilies comprising enzymes with similar yet broad substrate specificity profiles diverged early in the evolution of 2HADHs. Based on the phylogenetic tree, we present a revised classification of the family that comprises 22 subfamilies, including 13 new subfamilies not studied biochemically. We summarize characteristics of the nine biochemically studied subfamilies by aggregating all available sequence, biochemical, and structural data, providing comprehensive descriptions of the active site, cofactor-binding residues, and potential roles of specific structural regions in substrate recognition. In addition, we concisely present our analysis as an online 2HADH enzymes knowledgebase. CONCLUSIONS: The knowledgebase enables navigation over the 2HADHs classification, search through collected data, and functional predictions of uncharacterized 2HADHs. Future characterization of the new subfamilies may result in discoveries of enzymes with novel metabolic roles and with properties beneficial for biotechnological applications.


Subject(s)
Alcohol Oxidoreductases/chemistry , Alcohol Oxidoreductases/classification , Knowledge Bases , Alcohol Oxidoreductases/metabolism , Amino Acid Sequence , Catalytic Domain , Coenzymes/metabolism , Likelihood Functions , Phylogeny , Substrate Specificity
5.
Proteomics Clin Appl ; 12(5): e1700069, 2018 09.
Article in English | MEDLINE | ID: mdl-28975713

ABSTRACT

PURPOSE: PepSweetener is a web-based visualization tool designed to facilitate the manual annotation of intact glycopeptides from MS data regardless of the instrument that produced these data. EXPERIMENTAL DESIGN: This exploratory tool uses a theoretical glycopeptide dataset to visualize all peptide-glycan combinations that fall within the error range of the query precursor ion. PepSweetener simplifies the determination of the correct peptide and glycan composition of a glycopeptide based on its precursor mass. The theoretical glycopeptide search space can be customized in an advanced query mode that specifies potential proteins/peptides, glycan compositions, and several experimental parameters. RESULTS: PepSweetener displays the results on an interactive heat-map chart where theoretical glycopeptide tile colors correspond to ppm deviations from the query precursor mass. Additionally, a visualization chart incorporates glycan composition filtering, sorting by mass and tolerance, and an in silico peptide fragmentation diagram is provided to further support the correct glycopeptide identification. CONCLUSIONS AND CLINICAL RELEVANCE: PepSweetener efficiently allows the selection of the most probable intact glycopeptide mass matches and speeds up the verification process. It is validated on serum protein samples and immunoglobulins. The tool is publicly hosted on ExPASy, the SIB Swiss Institute of Bioinformatics resource portal (http://glycoproteome.expasy.org/pepsweetener/app/).
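The precursor-mass matching described in this abstract can be illustrated with a minimal sketch. This is not PepSweetener's code: the function names, the 10 ppm default tolerance, the toy masses, and the choice to express the deviation relative to the query mass are all assumptions made for illustration. The sketch also assumes each glycan mass is already given as a residue mass, so that peptide mass plus glycan mass equals the glycopeptide mass.

```python
from itertools import product


def ppm_deviation(theoretical_mass: float, query_mass: float) -> float:
    """Signed deviation of a theoretical mass from the query mass, in ppm.

    Assumption: the deviation is expressed relative to the query mass.
    """
    return (theoretical_mass - query_mass) / query_mass * 1e6


def candidate_matches(query_mass, peptides, glycans, tolerance_ppm=10.0):
    """Enumerate peptide-glycan combinations within the ppm tolerance.

    peptides and glycans map a label to a mass (hypothetical values below).
    Returns (peptide, glycan, ppm) tuples sorted by absolute deviation.
    """
    hits = []
    for (pep, pep_mass), (gly, gly_mass) in product(peptides.items(), glycans.items()):
        ppm = ppm_deviation(pep_mass + gly_mass, query_mass)
        if abs(ppm) <= tolerance_ppm:
            hits.append((pep, gly, ppm))
    return sorted(hits, key=lambda hit: abs(hit[2]))


# Toy masses, NOT real peptide or glycan masses.
peptides = {"PEP_A": 1000.0, "PEP_B": 1200.0}
glycans = {"GLY_X": 500.0, "GLY_Y": 700.005}

hits = candidate_matches(1700.0, peptides, glycans)
for pep, gly, ppm in hits:
    print(f"{pep} + {gly}: {ppm:+.3f} ppm")
```

Each returned tuple corresponds to one tile of a heat map of the kind the abstract describes, with the ppm value determining the tile color.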


Subject(s)
Glycopeptides/genetics , Molecular Sequence Annotation , Polysaccharides/genetics , Proteomics , Amino Acid Sequence/genetics , Glycopeptides/chemistry , Glycosylation , Humans , Internet , Polysaccharides/chemistry , Software , Tandem Mass Spectrometry
6.
Nucleic Acids Res ; 45(D1): D177-D182, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899619

ABSTRACT

The neXtProt human protein knowledgebase (https://www.nextprot.org) continues to add new content and tools, with a focus on proteomics and genetic variation data. neXtProt now has proteomics data for over 85% of the human proteins, as well as new tools tailored to the proteomics community. Moreover, the neXtProt release 2016-08-25 includes over 8000 phenotypic observations for over 4000 variations in a number of genes involved in hereditary cancers and channelopathies. These changes are presented in the current neXtProt update. All of the neXtProt data are available via our user interface and FTP site. We also provide API access and a SPARQL endpoint for more technical applications.


Subject(s)
Databases, Protein , Proteomics , Genetic Association Studies , Genetic Variation , Humans , Internet , Phenotype , Proteomics/methods , Software , Web Browser
7.
Methods Mol Biol ; 1140: 1-25, 2014.
Article in English | MEDLINE | ID: mdl-24590705

ABSTRACT

Modern high-throughput structural biology laboratories produce vast amounts of raw experimental data. The traditional method of data reduction is very simple: results are summarized in peer-reviewed publications, which are hopefully published in high-impact journals. By their nature, publications include only the most important results derived from experiments that may have been performed over the course of many years. The main content of the published paper is a concise compilation of these data, an interpretation of the experimental results, and a comparison of these results with those obtained by other scientists. Due to an avalanche of structural biology manuscripts submitted to scientific journals, in many recent cases descriptions of experimental methodology (and sometimes even experimental results) are pushed to supplementary materials that are only published online and sometimes may not be reviewed as thoroughly as the main body of a manuscript. Trouble may arise when experimental results contradict the results obtained by other scientists, which requires (in the best case) the reexamination of the original raw data or independent repetition of the experiment according to the published description of the experiment. There are reports that a significant fraction of experiments performed in academic laboratories cannot be repeated in an industrial environment (Begley CG & Ellis LM, Nature 483(7391):531-3, 2012). This is not an indication of scientific fraud but rather reflects the inadequate description of experiments performed on different equipment and on biological samples that were produced with disparate methods.
For that reason the goal of a modern data management system is not only the simple replacement of the laboratory notebook by an electronic one but also the creation of a sophisticated, internally consistent, scalable data management system that will combine data obtained by a variety of experiments performed by various individuals on diverse equipment. All data should be stored in a core database that can be used by custom applications to prepare internal reports, statistics, and perform other functions that are specific to the research that is pursued in a particular laboratory. This chapter presents a general overview of the methods of data management and analysis used by structural genomics (SG) programs. In addition to a review of the existing literature on the subject, also presented is experience in the development of two SG data management systems, UniTrack and LabDB. The description is targeted to a general audience, as some technical details have been (or will be) published elsewhere. The focus is on "data management," meaning the process of gathering, organizing, and storing data, but also briefly discussed is "data mining," the process of analysis ideally leading to an understanding of the data. In other words, data mining is the conversion of data into information. Clearly, effective data management is a precondition for any useful data mining. If done properly, gathering details on millions of experiments on thousands of proteins and making them publicly available for analysis, even after the projects themselves have ended, may turn out to be one of the most important benefits of SG programs.


Subject(s)
Biomedical Research/methods , High-Throughput Screening Assays/methods , Molecular Biology/methods , Computational Biology , Humans , Knowledge Management , Peer Review, Research
8.
Acta Crystallogr D Biol Crystallogr ; 70(Pt 2): 481-91, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24531482

ABSTRACT

Validation of general ideas about the origins of conformational differences in proteins is critical in order to arrive at meaningful functional insights. Here, principal component analysis (PCA) and distance difference matrices are used to validate some such ideas about the conformational differences between 291 myoglobin structures from sperm whale, horse and pig. Almost all of the horse and pig structures form compact PCA clusters with only minor coordinate differences and outliers that are easily explained. The 222 whale structures form a few dense clusters with multiple outliers. A few whale outliers with a prominent distortion of the GH loop are very similar to the cluster of horse structures, which all have a similar GH-loop distortion apparently owing to intermolecular crystal lattice hydrogen bonds to the GH loop from residues near the distal histidine His64. The variations of the GH-loop coordinates in the whale structures are likely to be owing to the observed alternative intermolecular crystal lattice bond, with the change to the GH loop distorting bonds correlated with the binding of specific 'unusual' ligands. Such an alternative intermolecular bond is not observed in horse myoglobins, obliterating any correlation with the ligands. Intermolecular bonds do not usually cause significant coordinate differences and cannot be validated as their universal cause. Most of the native-like whale myoglobin structure outliers can be correlated with a few specific factors. However, these factors do not always lead to coordinate differences beyond the previously determined uncertainty thresholds. The binding of unusual ligands by myoglobin, leading to crystal-induced distortions, suggests that some of the conformational differences between the apo and holo structures might not be 'functionally important' but rather artifacts caused by the binding of 'unusual' substrate analogs.
The causes of P6 symmetry in myoglobin crystals and the relationship between crystal and solution structures are also discussed.
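The distance difference matrices mentioned in this abstract can be sketched generically. The code below is not the authors' implementation; the function names and the three-point toy coordinates are assumptions for illustration. For two structures with the same atoms in the same order, each entry is the distance between atoms i and j in the first structure minus the same distance in the second.

```python
import math


def distance_matrix(coords):
    """All pairwise Euclidean distances within one structure."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def distance_difference_matrix(coords_a, coords_b):
    """Entry [i][j] is d_A(i, j) - d_B(i, j) for equivalent atoms i and j."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must contain the same number of atoms")
    da = distance_matrix(coords_a)
    db = distance_matrix(coords_b)
    n = len(coords_a)
    return [[da[i][j] - db[i][j] for j in range(n)] for i in range(n)]


# Toy coordinates (hypothetical, three points per "structure").
structure_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 8.0, 0.0)]

ddm = distance_difference_matrix(structure_a, structure_b)

# Translating one structure leaves the matrix unchanged (up to rounding),
# because only internal distances enter the calculation.
shifted_b = [(x + 1.0, y + 1.0, z + 1.0) for x, y, z in structure_b]
ddm_shifted = distance_difference_matrix(structure_a, shifted_b)
```

Because only internal distances are compared, the result does not depend on how the two structures are superposed, which makes it a useful complement to coordinate-based comparisons such as PCA.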


Subject(s)
Apoproteins/chemistry , Myoglobin/chemistry , Principal Component Analysis , Spermatozoa/chemistry , Animals , Apoproteins/genetics , Crystallography, X-Ray , Horses , Hydrogen Bonding , Ligands , Male , Mutation , Myoglobin/genetics , Protein Binding , Protein Conformation , Species Specificity , Swine , Whales
9.
Methods Mol Biol ; 1091: 297-314, 2014.
Article in English | MEDLINE | ID: mdl-24203341

ABSTRACT

Quality control of three-dimensional structures of macromolecules is a critical step to ensure the integrity of structural biology data, especially those produced by structural genomics centers. Whereas the Protein Data Bank (PDB) has proven to be a remarkable success overall, the inconsistent quality of structures reveals a lack of universal standards for structure/deposit validation. Here, we review the state-of-the-art methods used in macromolecular structure validation, focusing on validation of structures determined by X-ray crystallography. We describe some general protocols used in the rebuilding and re-refinement of problematic structural models. We also briefly discuss some frontier areas of structure validation, including refinement of protein-ligand complexes, automation of structure redetermination, and the use of NMR structures and computational models to solve X-ray crystal structures by molecular replacement.


Subject(s)
Protein Conformation , Proteins/chemistry , Proteomics/methods , Proteomics/standards , Data Mining , Databases, Protein , Ligands , Macromolecular Substances/chemistry , Models, Molecular , Nuclear Magnetic Resonance, Biomolecular , Protein Binding , Proteins/metabolism , Quality Control , Reproducibility of Results
10.
Acta Crystallogr D Biol Crystallogr ; 69(Pt 3): 464-70, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23519421

ABSTRACT

While small organic molecules generally crystallize forming tightly packed lattices with little solvent content, proteins form air-sensitive high-solvent-content crystals. Here, the crystallization and full structure analysis of a novel recombinant 10 kDa protein corresponding to the C-terminal domain of a putative U32 peptidase are reported. The orthorhombic crystal contained only 24.5% solvent and is therefore among the most tightly packed protein lattices ever reported.


Subject(s)
Geobacillus/enzymology , Peptide Hydrolases/chemistry , Crystallization , Crystallography, X-Ray , Molecular Weight , Peptide Fragments/chemistry , Proteolysis , Selenomethionine/metabolism , Solvents
11.
Curr Opin Struct Biol ; 20(5): 587-97, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20810277

ABSTRACT

During the last decade, structural genomics (SG) programs have developed many novel methodologies for faster and more accurate structure determination. These new tools and approaches led to the determination of thousands of protein structures. The generation of enormous amounts of experimental data resulted in significant improvements in the understanding of many biological processes at molecular levels. However, the amount of data collected so far is so large that traditional analysis methods are limiting the rate of extraction of biological and biochemical information from 3D models. This situation has prompted us to review the challenges that remain unmet by SG, as well as the areas in which the potential impact of SG could exceed what has been achieved so far.


Subject(s)
Genomics/methods , Animals , Crystallization , Drug Discovery , Humans , Models, Molecular , Proteins/chemistry , Proteins/genetics , Proteins/isolation & purification , Proteins/metabolism , Sequence Homology, Amino Acid
12.
Article in English | MEDLINE | ID: mdl-20663480

ABSTRACT

Recent years have brought not only an avalanche of new macromolecular structures, but also significant advances in the protein structure determination methodology only now making its way into structure-based drug discovery. In this chapter, we review recent methodology developments in X-ray diffraction experiments that led to fast and very accurate elucidation of three-dimensional structures of macromolecules. We will discuss the role of data collection as the last experiment performed in the crystal structure determination process. A statistical analysis of diffraction experiments that are reported in the Protein Data Bank (PDB) is also presented.


Subject(s)
Proteins/chemistry , Protein Conformation , X-Ray Diffraction