Results 1 - 20 of 90
1.
Methods Mol Biol ; 2836: 77-96, 2024.
Article in English | MEDLINE | ID: mdl-38995537

ABSTRACT

Glycosylation is a unique posttranslational modification that dynamically shapes the surface of cells. Glycans attached to proteins or lipids in a cell or tissue are studied as a whole and collectively designated as a glycome. UniCarb-DB is a glycomic spectral library of tandem mass spectrometry (MS/MS) fragment data. The current version of the database consists of over 1500 entries and over 1000 unique structures. Each entry contains parent ion information with associated MS/MS spectra, metadata about the original publication, experimental conditions, and biological origin. Each structure is also associated with the GlyTouCan glycan structure repository, allowing easy access to other glycomic resources. The database can be directly utilized by mass spectrometry (MS) experimentalists through the conversion of data generated by MS into structural information. Flexible online search tools along with a downloadable version of the database are easily incorporated in either commercial or open-access MS software. This chapter highlights the UniCarb-DB online search tool for browsing differences between the spectra of isomeric structures, a peak-matching search between user-generated MS/MS spectra and spectra stored in UniCarb-DB, and more advanced MS tools for combined quantitative and qualitative glycomics.
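The peak-matching search mentioned in this abstract can be illustrated with a minimal sketch. It is not UniCarb-DB's actual algorithm: the function names, the fixed m/z tolerance, and the "fraction of matched query peaks" score are all assumptions made for illustration.

```python
from bisect import bisect_left


def count_matched_peaks(query_mz, library_mz, tol=0.5):
    """Count query peaks that have a library peak within +/- tol (m/z units).

    Hypothetical helper: the tolerance value and matching rule are assumptions.
    """
    lib = sorted(library_mz)
    matched = 0
    for mz in query_mz:
        # First library peak that is not below the lower edge of the window.
        i = bisect_left(lib, mz - tol)
        if i < len(lib) and lib[i] <= mz + tol:
            matched += 1
    return matched


def match_score(query_mz, library_mz, tol=0.5):
    """Fraction of query peaks matched by the library spectrum (0.0 to 1.0)."""
    if not query_mz:
        return 0.0
    return count_matched_peaks(query_mz, library_mz, tol) / len(query_mz)


# Made-up m/z values, used only to exercise the functions.
query = [100.0, 200.2, 300.0]
library = [100.3, 200.0, 350.0]
print(match_score(query, library))
```

A real search would typically also weigh peak intensities and the precursor mass; this sketch compares fragment m/z values only.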


Subject(s)
Glycomics , Polysaccharides , Software , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Glycomics/methods , Polysaccharides/chemistry , Polysaccharides/analysis , Databases, Factual , Glycosylation , Humans
2.
Anal Bioanal Chem ; 416(16): 3687-3696, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38748247

ABSTRACT

Glycans participate in a vast number of recognition systems in diverse organisms in health and in disease. However, glycans cannot be sequenced because there is no sequencer technology that can fully characterize them. There is no "template" for replicating glycans as there is for amino acids and nucleic acids. Instead, glycans are synthesized by a complicated orchestration of multitudes of glycosyltransferases and glycosidases. Thus glycans can vary greatly in structure, but they are not genetically reproducible and are usually isolated in minute amounts. To characterize (sequence) the glycome (defined as the glycans in a particular organism, tissue, cell, or protein), glycosylation pathway prediction using in silico methods based on glycogene expression data, and glycosylation simulations have been attempted. Since many of the mammalian glycogenes have been identified and cloned, it has become possible to predict the glycan biosynthesis pathway in these systems. By then incorporating systems biology and bioprocessing technologies into these pathway models, given the right enzymatic parameters including enzyme and substrate concentrations and kinetic reaction parameters, it is possible to predict the potentially synthesized glycans in the pathway. This review presents information on the data resources that are currently available to enable in silico simulations of glycosylation and related pathways. Then some of the software tools that have been developed in the past to simulate and analyze glycosylation pathways will be described, followed by a summary and vision for the future developments and research directions in this area.


Subject(s)
Computer Simulation , Polysaccharides , Glycosylation , Polysaccharides/metabolism , Polysaccharides/chemistry , Animals , Humans , Software , Glycosyltransferases/metabolism
3.
J Biol Chem ; 300(2): 105624, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38176651

ABSTRACT

The glycosylation of proteins and lipids is known to be closely related to the mechanisms of various diseases such as influenza, cancer, and muscular dystrophy. Therefore, it has become clear that the analysis of post-translational modifications of proteins, including glycosylation, is important to accurately understand the functions of each protein molecule and the interactions among them. In order to conduct large-scale analyses more efficiently, it is essential to promote the accumulation, sharing, and reuse of experimental and analytical data in accordance with the FAIR (Findability, Accessibility, Interoperability, and Re-usability) data principles. However, a FAIR data repository for storing and sharing glycoconjugate information, including glycopeptides and glycoproteins, in a standardized format did not exist. Therefore, we have developed GlyComb (https://glycomb.glycosmos.org) as a new standardized data repository for glycoconjugate data. Currently, GlyComb can assign a unique identifier to a set of glycosylation information associated with a specific peptide sequence or UniProt ID. By standardizing glycoconjugate data via GlyComb identifiers and coordinating with existing web resources such as GlyTouCan and GlycoPOST, a comprehensive system for data submission and data sharing among researchers can be established. Here we introduce how GlyComb is able to integrate the variety of glycoconjugate data already registered in existing data repositories to obtain a better understanding of the available glycopeptides and glycoproteins, and their glycosylation patterns. We also explain how this system can serve as a foundation for a better understanding of glycan function.


Subject(s)
Databases, Chemical , Glycomics , Proteomics , Glycopeptides/metabolism , Glycoproteins/metabolism , Glycosylation , Polysaccharides/metabolism , Databases, Genetic
4.
Sci Rep ; 14(1): 489, 2024 01 04.
Article in English | MEDLINE | ID: mdl-38177192

ABSTRACT

N-glycosylation is an abundant post-translational modification of most cell-surface proteins. N-glycans play a crucial role in cellular functions like protein folding, protein localization, cell-cell signaling, and immune detection. As different tissue types display different N-glycan profiles, changes in N-glycan compositions occur in tissue-specific ways with development of disease, like cancer. However, no comparative atlas resource exists for documenting N-glycome alterations across various human tissue types, particularly comparing normal and cancerous tissues. In order to study a broad range of human tissue N-glycomes, N-glycan targeted MALDI imaging mass spectrometry was applied to custom formalin-fixed paraffin-embedded tissue microarrays. These encompassed fifteen human tissue types including bladder, breast, cervix, colon, esophagus, gastric, kidney, liver, lung, pancreas, prostate, sarcoma, skin, thyroid, and uterus. Each array contained both normal and tumor cores from the same pathology block, selected by a pathologist, allowing more in-depth comparisons of the N-glycome differences between tumor and normal and across tissue types. Using established MALDI-IMS workflows and existing N-glycan databases, the N-glycans present in each tissue core were spatially profiled and peak intensity data compiled for comparative analyses. Further structural information was determined for core fucosylation using endoglycosidase F3, and differentiation of sialic acid linkages through stabilization chemistry. Glycan structural differences across the tissue types were compared for oligomannose levels, branching complexity, presence of bisecting N-acetylglucosamine, fucosylation, and sialylation. Collectively, our research identified the N-glycans that were significantly increased and/or decreased in relative abundance in cancer for each tissue type. 
This study offers valuable information on a wide scale for both normal and cancerous tissues, serving as a reference for future studies and potential diagnostic applications of MALDI-IMS.


Subject(s)
Protein Processing, Post-Translational , Sarcoma , Male , Female , Humans , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Glycosylation , Polysaccharides/metabolism
5.
Mol Genet Metab Rep ; 37: 101016, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38053926

ABSTRACT

Rare diseases are estimated to affect 3.5%-5.9% of the population worldwide and are difficult to diagnose. Genome analysis is useful for diagnosis. However, because some variants, especially missense variants, are difficult to interpret, tools that accurately predict the effect of missense variants are needed. Here we developed a method, "VarMeter", to predict whether a missense variant is damaging based on Gibbs free energy and solvent-accessible surface area calculated from the AlphaFold 3D protein model. We applied this method to the whole-exome sequencing data of 900 individuals with rare or undiagnosed disease in our in-house database, and identified four who were hemizygous for missense variants of arylsulfatase L (ARSL; known as the genetic cause of chondrodysplasia punctata 1, CDPX1). Two individuals had a novel Ser89 to Asn (Ser89Asn) or Arg469 to Trp (Arg469Trp) substitution, respectively predicted as "damaging" or "benign"; the other two had an Arg111 to His (Arg111His) or Gly117 to Arg (Gly117Arg) substitution, respectively predicted as "damaging" or "possibly damaging" and previously reported in patients showing clinical manifestations of CDPX1. Expression and analysis of the missense variant proteins showed that the predicted pathogenic variants (Ser89Asn, Arg111His, and Gly117Arg) had complete loss of sulfatase activity and reduced protease resistance due to destabilization of protein structure, while the predicted benign variant (Arg469Trp) had activity and protease resistance comparable to those of wild-type ARSL. The individual with the novel pathogenic Ser89Asn variant exhibited characteristics of CDPX1, while the individual with the benign Arg469Trp variant exhibited no such characteristics. These findings demonstrate that VarMeter may be used to predict the deleteriousness of variants found in genome sequencing data and thereby support disease diagnosis.

6.
Sci Data ; 10(1): 582, 2023 09 06.
Article in English | MEDLINE | ID: mdl-37673902

ABSTRACT

Glycans are known to play extremely important roles in infections by viruses and pathogens. In fact, the SARS-CoV-2 virus has been shown to have evolved due to a single change in glycosylation. However, data resources on glycans, pathogens and diseases are not well organized. To accurately obtain such information from these various resources, we have constructed a foundation for discovering glycan and virus interaction data using Semantic Web technologies to be able to semantically integrate such heterogeneous data. Here, we created an ontology to encapsulate the semantics of virus-glycan interactions, and used Resource Description Framework (RDF) to represent the data we obtained from non-RDF related databases and data associated with literature. These databases include PubChem, SugarBind, and PSICQUIC, which made it possible to refer to other RDF resources such as UniProt and GlyTouCan. We made these data publicly available as open data and provided a service that allows anyone to freely perform searches using SPARQL. In addition, the RDF resources created in this study are available at the GlyCosmos Portal.
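The idea of expressing virus-glycan interactions as RDF statements and then searching them can be sketched in plain Python. This is only an analogy: it uses neither an RDF library nor SPARQL, and every identifier below (`ex:VirusA`, `ex:bindsGlycan`, and so on) is a hypothetical placeholder, not a term from the ontology described in the abstract.

```python
# Each statement is a (subject, predicate, object) triple, as in RDF.
# All identifiers are hypothetical placeholders.
TRIPLES = [
    ("ex:VirusA", "ex:bindsGlycan", "ex:GlycanX"),
    ("ex:VirusA", "ex:hasTaxon", "ex:Taxon1"),
    ("ex:VirusB", "ex:bindsGlycan", "ex:GlycanX"),
    ("ex:VirusB", "ex:bindsGlycan", "ex:GlycanY"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard,
    loosely mirroring an unbound variable in a SPARQL triple pattern."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Which subjects bind GlycanX?" is roughly the SPARQL pattern
#   ?virus ex:bindsGlycan ex:GlycanX .
viruses = sorted(t[0] for t in match(TRIPLES, p="ex:bindsGlycan", o="ex:GlycanX"))
print(viruses)
```

In a real setting the triples would live in a triple store and the pattern would be sent as a SPARQL query to its endpoint.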


Subject(s)
COVID-19 , Humans , Databases, Factual , Glycosylation , Polysaccharides , SARS-CoV-2
7.
Glycobiology ; 33(6): 454-463, 2023 06 21.
Article in English | MEDLINE | ID: mdl-37129482

ABSTRACT

The GlyCosmos Glycoscience Portal (https://glycosmos.org) and PubChem (https://pubchem.ncbi.nlm.nih.gov/) are major portals for glycoscience and chemistry, respectively. GlyCosmos is a portal for glycan-related repositories, including GlyTouCan, GlycoPOST, and UniCarb-DR, as well as for glycan-related data resources that have been integrated from a variety of 'omics databases. Glycogenes, glycoproteins, lectins, pathways, and disease information related to glycans are accessible from GlyCosmos. PubChem, on the other hand, is a chemistry-based portal at the National Center for Biotechnology Information. PubChem provides information not only on chemicals, but also genes, proteins, pathways, as well as patents, bioassays, and more, from hundreds of data resources from around the world. In this work, these 2 portals have made substantial efforts to integrate their complementary data to allow users to cross between these 2 domains. In addition to glycan structures, key information, such as glycan-related genes, relevant diseases, glycoproteins, and pathways, was integrated and cross-linked with one another. The interfaces were designed to enable users to easily find, access, download, and reuse data of interest across these resources. Use cases are described illustrating and highlighting the type of content that can be investigated. In total, these integrations provide life science researchers improved awareness and enhanced access to glycan-related information.


Subject(s)
Databases, Chemical , Polysaccharides , Glycosylation , Workflow , Informatics , Polysaccharides/chemistry , Glycoconjugates/chemistry
8.
Glycobiology ; 33(5): 411-422, 2023 06 03.
Article in English | MEDLINE | ID: mdl-37067908

ABSTRACT

Protein N-linked glycosylation is an important post-translational mechanism in Homo sapiens, playing essential roles in many vital biological processes. It occurs at the N-X-[S/T] sequon in amino acid sequences, where X can be any amino acid except proline. However, not all N-X-[S/T] sequons are glycosylated; thus, the N-X-[S/T] sequon is a necessary but not sufficient determinant for protein glycosylation. In this regard, computational prediction of N-linked glycosylation sites confined to N-X-[S/T] sequons is an important problem that has not been extensively addressed by the existing methods, especially in regard to the creation of negative sets and leveraging the distilled information from protein language models (pLMs). Here, we developed LMNglyPred, a deep learning-based approach, to predict N-linked glycosylated sites in human proteins using embeddings from a pre-trained pLM. LMNglyPred produces sensitivity, specificity, Matthews Correlation Coefficient, precision, and accuracy of 76.50%, 75.36%, 0.49, 60.99%, and 75.74%, respectively, on a benchmark-independent test set. These results demonstrate that LMNglyPred is a robust computational tool to predict N-linked glycosylation sites confined to the N-X-[S/T] sequon.
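The N-X-[S/T] rule described in this abstract can be written as a short regular expression. The sketch below only locates candidate sequons; it says nothing about whether a site is actually glycosylated, which is the prediction problem the abstract addresses. The function name and the example sequence are made up for illustration.

```python
import re

# N, then any residue except proline, then S or T.
# The lookahead keeps overlapping candidates (e.g. "NNTS") from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of the asparagine in each N-X-[S/T] sequon.

    Assumes an uppercase one-letter amino acid sequence.
    """
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]


# Made-up peptide: N2 (NGS) and N9/N10 (NNT, NTS) are candidates; N6 (NPT) is not.
print(find_sequons("MNGSANPTNNTS"))  # → [2, 9, 10]
```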


Subject(s)
Amino Acids , Glycoproteins , Humans , Glycosylation , Glycoproteins/metabolism , Amino Acids/chemistry , Protein Processing, Post-Translational , Amino Acid Sequence
9.
JACS Au ; 3(1): 4-12, 2023 Jan 23.
Article in English | MEDLINE | ID: mdl-36711080

ABSTRACT

The GlySpace Alliance was formed in 2018 among the principal investigators of three major glycoscience portals: Glyco@Expasy, GlyCosmos, and GlyGen, representing Europe, Asia, and the United States, respectively. While each of these portals has its unique user interface, the aim is to provide the same basic data set of glycan-related omics data. These portals will be introduced with the aim to enable users to find their target information in the most efficient manner, in particular, in terms of the chemical structures of glycans and their functions.

11.
Methods Mol Biol ; 2499: 135-144, 2022.
Article in English | MEDLINE | ID: mdl-35696078

ABSTRACT

Glycosylation involves the attachment of carbohydrate sugar chains, or glycans, onto an amino acid residue of a protein. These glycans are often branched structures and serve to modulate the function of proteins. Glycans are synthesized through a complex process of enzymatic reactions that occur in the Golgi apparatus in mammalian systems. Because there is currently no sequencer for glycans, technologies such as mass spectrometry are used to characterize glycans in a biological sample to ascertain its glycome. This is a tedious process that requires high levels of expertise and equipment. Thus, the enzymes that work on glycans, called glycogenes or glycoenzymes, have been studied to better understand glycan function. With the development of glycan-related databases and a glycan repository, bioinformatics approaches have attempted to predict the glycosylation pathway and the glycosylation sites on proteins. This chapter introduces these methods and related Web resources for understanding glycan function.


Subject(s)
Amino Acids/metabolism , Golgi Apparatus/metabolism , Mammals/metabolism , Polysaccharides/metabolism , Amino Acids/chemistry , Animals , Computational Biology , Glycogen/metabolism , Glycosylation , Mass Spectrometry , Polysaccharides/chemistry , Proteins/physiology
12.
Glycobiology ; 32(8): 646-650, 2022 07 13.
Article in English | MEDLINE | ID: mdl-35452093

ABSTRACT

High-performance liquid chromatography (HPLC) elution data provide a useful tool for quantitative glycosylation profiling, discriminating isomeric oligosaccharides. The web application Glycoanalysis by the Three Axes of MS and Chromatography (GALAXY), which is based on the three-dimensional HPLC map of N-linked oligosaccharides with pyridyl-2-amination developed by Dr. Noriko Takahashi, has been extensively used for N-glycosylation profiling at molecular, cellular, and tissue levels. Herein, we describe the updated GALAXY as version 3, which includes new HPLC data including those of glucuronylated and sulfated glycans, an improved graphical user interface using modern technologies, and links to glycan information in GlyTouCan and the GlyCosmos Portal. This liaison will facilitate glycomic analyses of human and other organisms in conjunction with multiomics data.


Subject(s)
Oligosaccharides , Polysaccharides , Chromatography, High Pressure Liquid/methods , Glycosylation , Humans , Oligosaccharides/chemistry , Polysaccharides/chemistry
13.
Glycobiology ; 32(7): 552-555, 2022 06 13.
Article in English | MEDLINE | ID: mdl-35352122

ABSTRACT

Glycan microarrays are essential tools in glycobiology and are being widely used for assignment of glycan ligands in diverse glycan recognition systems. We have developed new software, called Carbohydrate microArray Analysis and Reporting Tool (CarbArrayART), to address the need for a distributable application for glycan microarray data management. The main features of CarbArrayART include: (i) Storage of quantified array data from different array layouts with scan data and array-specific metadata, such as lists of arrayed glycans, array geometry, information on glycan-binding samples, and experimental protocols. (ii) Presentation of microarray data as charts, tables, and heatmaps derived from the average fluorescence intensity values that are calculated based on the imaging scan data and array geometry, as well as filtering and sorting functions according to monosaccharide content and glycan sequences. (iii) Data export for reporting in Word, PDF, and Excel formats, together with metadata that are compliant with the guidelines of MIRAGE (Minimum Information Required for A Glycomics Experiment). CarbArrayART is designed for routine use in recording, storage, and management of any slide-based glycan microarray experiment. In conjunction with the MIRAGE guidelines, CarbArrayART addresses issues that are critical for glycobiology, namely, clarity of data for evaluation of reproducibility and validity.


Subject(s)
Glycomics , Polysaccharides , Glycomics/methods , Information Storage and Retrieval , Microarray Analysis/methods , Polysaccharides/chemistry , Reproducibility of Results , Software
14.
Molecules ; 27(6)2022 Mar 08.
Article in English | MEDLINE | ID: mdl-35335136

ABSTRACT

Glycan biosynthesis simulation research has progressed remarkably since 1997, when the first mathematical model for N-glycan biosynthesis was proposed. An O-glycan model has also been developed to predict O-glycan biosynthesis pathways in both forward and reverse directions. In this work, we started with a set of O-glycan profiles of CHO cells transiently transfected with various combinations of glycosyltransferases. The aim was to develop a model that encapsulated all the enzymes in the CHO transfected cell lines. Due to computational power restrictions, we were forced to focus on a smaller set of glycan profiles, where we were able to propose an optimized set of kinetics parameters for each enzyme in the model. Using this optimized model we showed that the abundance of more processed glycans could be simulated compared to observed abundance, while predicting the abundance of glycans earlier in the pathway was less accurate. The data generated show that for the accurate prediction of O-linked glycosylation, additional factors need to be incorporated into the model to better reflect the experimental conditions.
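To make "kinetics parameters for each enzyme" concrete, the following is a minimal sketch of a linear two-enzyme pathway simulated with forward Euler steps. The abstract does not state the rate law or the solver used in the study, so the Michaelis-Menten form, the parameter values, and the function name here are assumptions for illustration only.

```python
def simulate_chain(initial, vmax, km, dt=0.01, steps=1000):
    """Simulate a linear pathway S0 -> S1 -> ... with one enzyme per step.

    Assumed rate law (not taken from the source): Michaelis-Menten,
        rate_i = vmax[i] * [S_i] / (km[i] + [S_i])
    integrated with fixed-step forward Euler. km values must be positive.
    """
    conc = list(initial)
    for _ in range(steps):
        fluxes = []
        for i in range(len(vmax)):
            rate = vmax[i] * conc[i] / (km[i] + conc[i])
            # Never convert more substrate than is currently available.
            fluxes.append(min(rate * dt, conc[i]))
        for i, flux in enumerate(fluxes):
            conc[i] -= flux
            conc[i + 1] += flux
    return conc


# Hypothetical parameters: one precursor glycan, two processing enzymes.
final = simulate_chain(initial=[1.0, 0.0, 0.0], vmax=[1.0, 0.5], km=[0.5, 0.5])
print(final)
```

Fitting such a model means adjusting `vmax` and `km` until the simulated abundances approach the measured glycan profile, which is the optimization step the abstract refers to.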


Subject(s)
Polysaccharides , Animals , CHO Cells , Computer Simulation , Cricetinae , Cricetulus , Glycosylation , Polysaccharides/metabolism
15.
Carbohydr Res ; 511: 108496, 2022 Jan.
Article in English | MEDLINE | ID: mdl-35030433

ABSTRACT

Unlike DNA and proteins, there is a limit to inferring the structure and function of glycans only by analyzing their sequence. Due to their structural flexibility, it can be said that an understanding of the 3D structural conformations of glycans is important to better understand their functions. While there are several tools now available that aid in analyzing the 3D structures of glycans, they are very computationally intensive and not easily useable by non-experts. Thus, as a first step, we decided to investigate the monosaccharides that make up the building blocks of glycans and their similarities. We developed a method and software that takes the three-dimensional structures of monosaccharides and finds their commonalities through an efficient algorithm, which we call TouCom (tou = "sugar" in Japanese). We then created a similarity matrix to represent the degree of similarity of pairs of monosaccharides based on this information and the properties of their functional groups. We performed an analysis of pairwise glycan alignment using this similarity matrix, confirming that the scores of pairwise-alignments obtained were improved compared to alignments without using this matrix. As a result, we propose the first monosaccharide substitution matrix that has been developed based on 3D atomic structure. In the future, we will apply this matrix to other glycan alignment tools so that glycan sequence analysis can better utilize this information. We expect that this monosaccharide substitution matrix can improve the analysis of glycan function based on glycan structural information.


Subject(s)
Monosaccharides , Polysaccharides , Algorithms , Polysaccharides/chemistry , Proteins , Software
18.
Molecules ; 26(23)2021 Nov 25.
Article in English | MEDLINE | ID: mdl-34885724

ABSTRACT

In life science fields, database integration is progressing and contributing to collaboration between different research fields, including the glycosciences. The integration of glycan databases has greatly progressed collaboration worldwide with the development of the international glycan structure repository, GlyTouCan. This trend has increased the need for a tool by which researchers in various fields can easily search glycan structures from integrated databases. We have developed a web-based glycan structure search tool, SugarDrawer, which supports the depiction of glycans including ambiguity, such as glycan fragments which contain underdetermined linkages, and a database search for glycans drawn on the canvas. This tool provides an easy editing feature for various glycan structures in just a few steps using template structures and pop-up windows which allow users to select specific information for each structure element. This tool has a unique feature for selecting possible attachment sites, which are defined in the Symbol Nomenclature for Glycans (SNFG). In addition, this tool can input and output glycans in WURCS and GlycoCT formats, which are the most commonly used text formats for glycan structures.


Subject(s)
Databases, Factual , Internet , Polysaccharides/genetics , Software , Biological Science Disciplines , Humans , Polysaccharides/chemistry , Polysaccharides/classification , Polysaccharides/ultrastructure
19.
Molecules ; 26(23)2021 Dec 02.
Article in English | MEDLINE | ID: mdl-34885895

ABSTRACT

Protein N-linked glycosylation is a post-translational modification that plays an important role in a myriad of biological processes. Computational prediction approaches serve as complementary methods for the characterization of glycosylation sites. Most of the existing predictors for N-linked glycosylation utilize the information that the glycosylation site occurs at the N-X-[S/T] sequon, where X is any amino acid except proline. Not all N-X-[S/T] sequons are glycosylated; thus, the N-X-[S/T] sequon is a necessary but not sufficient determinant for protein glycosylation. In that regard, computational prediction of N-linked glycosylation sites confined to N-X-[S/T] sequons is an important problem. Here, we report DeepNGlyPred, a deep learning-based approach that encodes the positive and negative sequences in the human proteome dataset (extracted from N-GlycositeAtlas) using sequence-based features (gapped-dipeptide), predicted structural features, and evolutionary information. DeepNGlyPred produces SN, SP, MCC, and ACC of 88.62%, 73.92%, 0.60, and 79.41%, respectively, on the N-GlyDE independent test set, which is better than the compared approaches. These results demonstrate that DeepNGlyPred is a robust computational technique to predict N-linked glycosylation sites confined to the N-X-[S/T] sequon. DeepNGlyPred will be a useful resource for the glycobiology community.


Subject(s)
Proteome/chemistry , Deep Learning , Glycosylation , Humans , Models, Biological , Neural Networks, Computer , Polysaccharides/analysis , Protein Processing, Post-Translational
20.
BMC Microbiol ; 21(1): 325, 2021 11 22.
Article in English | MEDLINE | ID: mdl-34809564

ABSTRACT

BACKGROUND: The abundance of glycomics data that have accumulated has led to the development of many useful databases to aid in the understanding of the function of the glycans and their impact on cellular activity. At the same time, the endeavor to share data between glycomics databases and other biological databases has contributed to the creation of new knowledgebases. However, different data types in data description have impeded the data sharing for knowledge integration. To solve this matter, Semantic Web techniques including Resource Description Framework (RDF) and ontology development have been adopted by various groups to standardize the format for data exchange. These semantic data have contributed to the expansion of knowledgebases and hold promise of providing data that can be intelligently processed. On the other hand, bench biologists who are experts in experimental findings are end users and data producers. Therefore, it is indispensable to reduce the technical barrier required for bench biologists to manipulate their experimental data to be compatible with standard formats for data sharing. RESULTS: There are many essential concepts and practical techniques for data integration but there is no method to enable researchers to easily apply Semantic Web techniques to their experimental data. We implemented our procedure on unformatted information of E. coli O-antigen structures collected from the web and show how this information can be expressed as formatted data applicable to Semantic Web standards. In particular, we described the E. coli O-antigen biosynthesis pathway using the BioPAX ontology developed to support data exchange between pathway databases. CONCLUSIONS: The method we implemented to semantically describe O-antigen biosynthesis should be helpful for biologists to understand how glycan information, including relevant pathway reaction data, can be easily shared. 
We hope this method can help lower the technical barrier that is required when experimental findings are formulated into formal representations and can lead bench scientists to readily participate in the construction of new knowledgebases that are integrated with existing ones. Such integration over the Semantic Web will enable future work in artificial intelligence and machine learning to enable computers to infer new relationships and hypotheses in the life sciences.


Subject(s)
Escherichia coli/metabolism , Information Dissemination , O Antigens/biosynthesis , Biosynthetic Pathways , Escherichia coli/chemistry , Escherichia coli/genetics , O Antigens/chemistry , Semantics