Results 1 - 20 of 28
1.
bioRxiv ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38798633

ABSTRACT

Glycosylation is described as a non-templated biosynthesis. Yet, the template-free premise is antithetical to the observation that different N-glycans are consistently placed at specific sites. It has been proposed that glycosite-proximal protein structures could constrain glycosylation and explain the observed microheterogeneity. Using site-specific glycosylation data, we trained a hybrid neural network to parse glycosites (recurrent neural network) and match them to feasible N-glycosylation events (graph neural network). From glycosite-flanking sequences, the algorithm predicts most human N-glycosylation events documented in the GlyConnect database and proposes structures corresponding to the observed monosaccharide composition of the glycans at these sites. The algorithm also recapitulated glycosylation in Enhanced Aromatic Sequons, SARS-CoV-2 spike, and IgG3 variants, demonstrating its ability to predict both glycan structure and abundance. Thus, protein structure constrains glycosylation, and the neural network enables predictive in silico glycosylation of uncharacterized or novel protein sequences and genetic variants.

2.
Bioinformatics ; 38(Suppl_2): ii162-ii167, 2022 09 16.
Article in English | MEDLINE | ID: mdl-36124803

ABSTRACT

MOTIVATION: We have previously designed and implemented a tree-based ontology to represent glycan structures with the aim of searching these structures with a glyco-driven syntax. This resulted in creating the GlySTreeM knowledge-base as a linchpin of the structural matching procedure and we now introduce a query language, called GlycoQL, for the actual implementation of a glycan structure search. RESULTS: The methodology is described and illustrated with a use-case focused on Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) spike protein glycosylation. We show how to enhance site annotation with federated queries involving UniProt and GlyConnect, our glycoprotein database. AVAILABILITY AND IMPLEMENTATION: https://glyconnect.expasy.org/glycoql/.


Subject(s)
COVID-19 , SARS-CoV-2 , Glycoproteins , Glycosylation , Humans , Polysaccharides/chemistry
3.
Sci Rep ; 12(1): 10846, 2022 06 27.
Article in English | MEDLINE | ID: mdl-35760821

ABSTRACT

Human milk oligosaccharides (HMOs) form the third most abundant component of human milk and are known to convey several benefits to the neonate, including protection from viral and bacterial pathogens, training of the immune system, and influencing the gut microbiome. As HMO production during lactation is driven by enzymes that are common to other glycosylation processes, we adapted a model of mucin-type GalNAc-linked glycosylation enzymes to act on free lactose. We identified a subset of 11 enzyme activities that can account for 206 of 226 distinct HMOs isolated from human milk and constructed a biosynthetic reaction network that identifies 5 new core HMO structures. A comparison of monosaccharide compositions demonstrated that the model was able to discriminate between two possible groups of intermediates between major subnetworks, and to assign possible structures to several previously uncharacterised HMOs. The effect of enzyme knockouts is presented, identifying β-1,4-galactosyltransferase and β-1,3-N-acetylglucosaminyltransferase as key enzyme activities involved in the generation of the observed HMO glycosylation patterns. The model also provides a synthesis chassis for the most common HMOs found in lactating mothers.


Subject(s)
Gastrointestinal Microbiome , Milk, Human , Bacteria , Female , Humans , Infant, Newborn , Lactation , Milk, Human/chemistry , Oligosaccharides/chemistry
4.
Methods Mol Biol ; 2370: 41-65, 2022.
Article in English | MEDLINE | ID: mdl-34611864

ABSTRACT

The present chapter focuses on the interactive and explorative aspects of bioinformatics resources that have been recently released in glycobiology. The comparative analysis of data in a field where knowledge is scattered, incomplete, and disconnected from main biology requires efficient visualization, integration, and interactive tools that are currently only partially implemented. This overview highlights converging efforts toward building a consistent picture of protein glycosylation.


Subject(s)
Glycomics , Computational Biology , Glycosylation , Polysaccharides
6.
Methods Mol Biol ; 2361: 109-127, 2021.
Article in English | MEDLINE | ID: mdl-34236658

ABSTRACT

Glycoproteomics is unquestionably on the rise and its current development benefits from past experience in proteomics, in particular when attending to bioinformatics needs. An extensive range of software solutions is available, but the reproducibility of mass spectrometry data processing remains challenging. One of the key issues in running automated glycopeptide identification software is the selection of a reference glycan composition file. The default choices are often too broad, and a tedious literature search to properly target this selection can be avoided. This chapter suggests the use of GlyConnect Compozitor to collect relevant information on glycosylation in a given tissue or cell line and shape an appropriate glycan composition set that can be input in the majority of search engines accommodating user-defined compositions.


Subject(s)
Polysaccharides/analysis , Glycopeptides , Proteomics , Reproducibility of Results , Software , Tandem Mass Spectrometry
7.
Glycobiology ; 31(7): 741-750, 2021 08 07.
Article in English | MEDLINE | ID: mdl-33677548

ABSTRACT

Recent years have seen great advances in the development of glycoproteomics protocols and methods, resulting in a sustained increase in the reporting of proteins, their attached glycans and glycosylation sites. However, only very few of these reports find their way into databases or data repositories. One of the major reasons is the absence of a digital standard to represent glycoproteins and the challenging annotations with glycans. Depending on the experimental method, such a standard must be able to represent glycans as complete structures or as compositions, store not just single glycans but also represent glycoforms on a specific glycosylation site, deal with partially missing site information if no site mapping was performed, and store abundances or ratios of glycans within a glycoform of a specific site. To support the above, we have developed the GlycoConjugate Ontology (GlycoCoO) as a standard semantic framework to describe and represent glycoproteomics data. GlycoCoO can be used to represent glycoproteomics data in triplestores and can serve as a basis for data exchange formats. The ontology, database providers and supporting documentation are available online (https://github.com/glycoinfo/GlycoCoO).


Subject(s)
Glycoproteins , Polysaccharides , Glycoproteins/metabolism , Glycosylation , Polysaccharides/metabolism
8.
Molecules ; 27(1), 2021 Dec 23.
Article in English | MEDLINE | ID: mdl-35011294

ABSTRACT

The level of ambiguity in describing glycan structure has significantly increased with the upsurge of large-scale glycomics and glycoproteomics experiments. Consequently, an ontology-based model appears as an appropriate solution for navigating these data. However, navigation is not sufficient and the model should also enable advanced search and comparison. A new ontology with a tree logical structure is introduced to represent glycan structures irrespective of the precision of molecular details. The model heavily relies on the GlycoCT encoding of glycan structures. Its implementation in the GlySTreeM knowledge base was validated with GlyConnect data and benchmarked with the Glycowork library. GlySTreeM is shown to be fast, consistent, reliable and more flexible than existing solutions for matching parts of or whole glycan structures. The model is also well suited for painless future expansion.


Subject(s)
Glycomics/methods , Polysaccharides/chemistry , Databases, Factual , Molecular Structure , Structure-Activity Relationship , Web Browser
9.
Nucleic Acids Res ; 49(D1): D1548-D1554, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33174598

ABSTRACT

Lectins are non-covalent glycan-binding proteins mediating cellular interactions but their annotation in newly sequenced organisms is lacking. The limited size of functional domains and the low level of sequence similarity challenge usual bioinformatics tools. The identification of lectin domains in proteomes requires the manual curation of sequence alignments based on structural folds. A new lectin classification is proposed. It is built on three levels: (i) 35 lectin domain folds, (ii) 109 classes of lectins sharing at least 20% sequence similarity and (iii) 350 families of lectins sharing at least 70% sequence similarity. This information is compiled in the UniLectin platform that includes the previously described UniLectin3D database of curated lectin 3D structures. Since its first release, UniLectin3D has been updated with 485 additional 3D structures. The database is now complemented by two additional modules: PropLec containing predicted β-propeller lectins and LectomeXplore including predicted lectins from sequences of the NCBI-nr and UniProt for every curated lectin class. UniLectin is accessible at https://www.unilectin.eu/.


Subject(s)
Databases, Protein , Genome , Lectins/chemistry , Proteome/chemistry , Receptors, Cell Surface/chemistry , Amino Acid Sequence , Animals , Anthozoa/genetics , Anthozoa/metabolism , Computational Biology/methods , Humans , Internet , Lectins/classification , Lectins/genetics , Lectins/metabolism , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Proteome/classification , Proteome/genetics , Proteome/metabolism , Receptors, Cell Surface/classification , Receptors, Cell Surface/genetics , Receptors, Cell Surface/metabolism , Sequence Alignment , Sequence Homology, Amino Acid , Software , Terminology as Topic
10.
Mol Cell Proteomics ; 19(10): 1602-1618, 2020 10.
Article in English | MEDLINE | ID: mdl-32636234

ABSTRACT

A key point in achieving accurate intact glycopeptide identification is the definition of the glycan composition file that is used to match experimental with theoretical masses by a glycoproteomics search engine. At present, these files are mainly built from searching the literature and/or querying data sources focused on posttranslational modifications. Most glycoproteomics search engines include a default composition file that is readily used when processing MS data. We introduce here a glycan composition visualizing and comparative tool associated with the GlyConnect database and called GlyConnect Compozitor. It offers a web interface through which the database can be queried to bring out contextual information relative to a set of glycan compositions. The tool takes advantage of compositions being related to one another through shared monosaccharide counts and outputs interactive graphs summarizing information searched in the database. These results provide a guide for selecting or deselecting compositions in a file in order to reflect the context of a study as closely as possible. They also confirm the consistency of a set of compositions based on the content of the GlyConnect database. As part of the tool collection of the Glycomics@ExPASy initiative, Compozitor is hosted at https://glyconnect.expasy.org/compozitor/ where it can be run as a web application. It is also directly accessible from the GlyConnect database.


Subject(s)
Glycomics , Polysaccharides/metabolism , Animals , CHO Cells , Cricetulus , Databases, Factual , Humans , Immunoglobulin G/metabolism , Integrins/metabolism , Mucins/metabolism , Polysaccharides/chemistry
11.
Nat Commun ; 10(1): 3275, 2019 07 22.
Article in English | MEDLINE | ID: mdl-31332201

ABSTRACT

The mass spectrometry (MS)-based analysis of free polysaccharides and glycans released from proteins, lipids and proteoglycans increasingly relies on databases and software. Here, we review progress in the bioinformatics analysis of protein-released N- and O-linked glycans (N- and O-glycomics) and propose an e-infrastructure to overcome current deficits in data and experimental transparency. This workflow enables the standardized submission of MS-based glycomics information into the public repository UniCarb-DR. It implements the MIRAGE (Minimum Requirement for A Glycomics Experiment) reporting guidelines, storage of unprocessed MS data in the GlycoPOST repository and glycan structure registration using the GlyTouCan registry, thereby supporting the development and extension of a glycan structure knowledgebase.


Subject(s)
Computational Biology/methods , Glycomics/methods , Glycoproteins/metabolism , Polysaccharides/metabolism , Animals , Computational Biology/standards , Databases, Factual/standards , Databases, Factual/statistics & numerical data , Humans , Mass Spectrometry/methods , Reference Standards
12.
Viruses ; 11(4), 2019 04 23.
Article in English | MEDLINE | ID: mdl-31018588

ABSTRACT

Evidence of the mediation of glycan molecules in the interaction between viruses and their hosts is accumulating and is now partially reflected in several online databases. Bioinformatics provides convenient and efficient means of searching, visualizing, comparing, and sometimes predicting, interactions in numerous and diverse molecular biology applications related to the -omics fields. As viromics is gaining momentum, bioinformatics support is increasingly needed. We propose a survey of the current resources for searching, visualizing, comparing, and possibly predicting host-virus interactions that integrate the presence and role of glycans. To the best of our knowledge, we have mapped the specialized and general-purpose databases with the appropriate focus. With an illustration of their potential usage, we also discuss the strong and weak points of the current bioinformatics landscape in the context of understanding viral infection and the immune response to it.


Subject(s)
Computational Biology , Host Microbial Interactions , Polysaccharides/chemistry , Viruses , Databases, Factual , Humans , Receptors, Virus/chemistry , Viral Proteins/chemistry , Virus Diseases/virology
13.
Nucleic Acids Res ; 47(D1): D1236-D1244, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30239928

ABSTRACT

Lectins, and related receptors such as adhesins and toxins, are glycan-binding proteins from all origins that decipher the glycocode, i.e. the structural information encoded in the conformation of complex carbohydrates present on the surface of all cells. Lectins are still poorly classified and annotated, but since their functions are based on ligand recognition, their 3D-structures provide a solid foundation for characterization. UniLectin3D is a curated database that classifies lectins on origin and fold, with cross-links to literature, other databases in glycosciences and functional data such as known specificity. The database provides detailed information on lectins, their bound glycan ligands, and features their interactions using the Protein-Ligand Interaction Profiler (PLIP) server. Special care was devoted to the description of the bound glycan ligands with the use of simple graphical representation and numerical format for cross-linking to other databases in glycoscience. We conceived the design of the database architecture and the navigation tools to account for all organisms, as well as to search for oligosaccharide epitopes complexed within specified binding sites. UniLectin3D is accessible at https://www.unilectin.eu/unilectin3D.


Subject(s)
Computational Biology/methods , Databases, Protein , Protein Conformation , Receptors, Cell Surface/chemistry , Binding Sites , Humans , Internet , Lectins/chemistry , Lectins/metabolism , Ligands , Models, Molecular , Polysaccharides/chemistry , Polysaccharides/metabolism , Protein Binding , Receptors, Cell Surface/metabolism
14.
Glycobiology ; 29(1): 36-44, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30239692

ABSTRACT

Mammalian glycosaminoglycans are linear complex polysaccharides comprising heparan sulfate, heparin, dermatan sulfate, chondroitin sulfate, keratan sulfate and hyaluronic acid. They bind to numerous proteins and these interactions mediate their biological activities. GAG-protein interaction data reported in the literature are curated mostly in the MatrixDB database (http://matrixdb.univ-lyon1.fr/). However, a standard nomenclature and a machine-readable format of GAGs together with bioinformatics tools for mining these interaction data are lacking. We report here the building of an automated pipeline to (i) standardize the format of GAG sequences interacting with proteins manually curated from the literature, (ii) translate them into the machine-readable GlycoCT format and into SNFG (Symbol Nomenclature For Glycan) images and (iii) convert their sequences into a format processed by a builder generating three-dimensional structures of polysaccharides based on a repertoire of conformations experimentally validated by data extracted from crystallized GAG-protein complexes. We have developed for this purpose a converter (the CT23D converter) to automatically translate the GlycoCT code of a GAG sequence into the input file required to construct a three-dimensional model.


Subject(s)
Glycosaminoglycans/chemistry , Models, Molecular , Software , Animals , Carbohydrate Conformation , Glycosaminoglycans/genetics , Humans
15.
J Proteome Res ; 18(2): 664-677, 2019 02 01.
Article in English | MEDLINE | ID: mdl-30574787

ABSTRACT

Knowledge of glycoproteins, their site-specific glycosylation patterns, and the glycan structures that they present to their recognition partners in health and disease is gradually being built on using a range of experimental approaches. The data from these analyses are increasingly being standardized and presented in various sources, from supplemental tables in publications to localized servers in investigator laboratories. Bioinformatics tools are now needed to collect these data and enable the user to search, display, and connect glycomics and glycoproteomics to other sources of related proteomics, genomics, and interactomics information. We here introduce GlyConnect (https://glyconnect.expasy.org/), the central platform of the Glycomics@ExPASy portal for glycoinformatics. GlyConnect has been developed to gather, monitor, integrate, and visualize data in a user-friendly way to facilitate the interpretation of collected glycoscience data. GlyConnect is designed to accommodate and integrate multiple data types as they are increasingly produced.


Subject(s)
Glycomics/methods , Proteomics/methods , Software , Computational Biology/methods , Glycomics/instrumentation , Glycoproteins/analysis , Glycosylation , User-Computer Interface
16.
Molecules ; 23(12), 2018 Dec 05.
Article in English | MEDLINE | ID: mdl-30563078

ABSTRACT

SugarSketcher is an intuitive and fast JavaScript interface module for online drawing of glycan structures in the popular Symbol Nomenclature for Glycans (SNFG) notation and exporting them to various commonly used formats encoding carbohydrate sequences (e.g., GlycoCT) or quality images (e.g., SVG). It does not require a backend server or any specific browser plugins and can be integrated in any web glycoinformatics project. SugarSketcher allows both glycobiologists and non-expert users to draw glycans. The "quick mode" allows a newcomer to build up a glycan structure with only a limited knowledge of carbohydrate chemistry. The "normal mode" integrates advanced options which enable glycobiologists to tailor complex carbohydrate structures. The source code is freely available on GitHub and glycoinformaticians are encouraged to participate in the development process, while users are invited to test a prototype available on the ExPASy website and send feedback.


Subject(s)
Polysaccharides/chemistry , Software , Web Browser , Computational Biology/methods , Structure-Activity Relationship
17.
Mol Cell Proteomics ; 17(11): 2164-2176, 2018 11.
Article in English | MEDLINE | ID: mdl-30097532

ABSTRACT

Glycomics@ExPASy (https://www.expasy.org/glycomics) is the glycomics tab of ExPASy, the server of SIB Swiss Institute of Bioinformatics. It was created in 2016 to centralize web-based glycoinformatics resources developed within an international network of glycoscientists. The hosted collection currently includes mainly databases and tools created and maintained at SIB but also links to a range of reference resources popular in the glycomics community. The philosophy of our toolbox is that it should be {glycoscientist AND protein scientist}-friendly with the aim of (1) popularizing the use of bioinformatics in glycobiology and (2) emphasizing the relationship between glycobiology and protein-oriented bioinformatics resources. The scarcity of data bridging these two disciplines led us to design tools as interactive as possible based on database connectivity to facilitate data exploration and support hypothesis building. Glycomics@ExPASy was designed, and is developed, with a long-term vision in close collaboration with glycoscientists to meet as closely as possible the growing needs of the community for glycoinformatics.


Subject(s)
Glycomics/methods , Software , Data Collection , Glycoproteins/metabolism , Humans , Mass Spectrometry , Polysaccharides/metabolism , Protein Interaction Maps
18.
Glycobiology ; 28(6): 349-362, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29518231

ABSTRACT

Nowadays, due to the advance of experimental techniques in glycomics, large collections of glycan profiles are regularly published. The rapid growth of available glycan data accentuates the lack of innovative tools for visualizing and exploring large amounts of information. Scientists resort to using general-purpose spreadsheet applications to create ad hoc data visualizations. Thus, results end up being encoded in publication images and text, while valuable curated data is stored in files as supplementary information. To tackle this problem, we have built an interactive pipeline composed of three tools: Glynsight, EpitopeXtractor and Glydin'. Glycan profile data can be imported in Glynsight, which generates a custom interactive glycan profile. Several profiles can be compared and glycan composition is integrated with structural data stored in databases. Glycan structures of interest can then be sent to EpitopeXtractor to perform a glycoepitope extraction. EpitopeXtractor results can be superimposed on the Glydin' glycoepitope network. The network visualization allows fast detection of clusters of glycoepitopes and discovery of potential new targets. Each of these tools is standalone or can be used in conjunction with the others, depending on the data and the specific interest of the user. All the tools composing this pipeline are part of the Glycomics@ExPASy initiative and are available at https://www.expasy.org/glycomics.


Subject(s)
Epitopes/chemistry , Glycomics/methods , Informatics/methods , Protein Processing, Post-Translational , Software , Databases, Chemical , Epitopes/immunology , Glycosylation , Humans
19.
Anal Chem ; 89(20): 10932-10940, 2017 10 17.
Article in English | MEDLINE | ID: mdl-28901741

ABSTRACT

Tandem mass spectrometry, when combined with liquid chromatography and applied to complex mixtures, produces large amounts of raw data, which need to be analyzed to identify molecular structures. This technique is widely used, particularly in glycomics. Due to a lack of high-throughput glycan sequencing software, glycan spectra are predominantly sequenced manually. A challenge for writing glycan-sequencing software is that there is no direct template that can be used to infer structures detectable in an organism. To help alleviate this bottleneck, we present Glycoforest 1.0, a partial de novo algorithm for sequencing glycan structures based on MS/MS spectra. Glycoforest was tested on two data sets (human gastric and salmon mucosa O-linked glycomes) for which MS/MS spectra were annotated manually. Glycoforest generated the human-validated structure for 92% of test cases. The correct structure was found as the best scoring match for 70% and among the top 3 matches for 83% of test cases. In addition, the Glycoforest algorithm detected glycan structures from MS/MS spectra missing a manual annotation. In total, 1532 previously unannotated MS/MS spectra were annotated by Glycoforest. A portion containing 521 spectra was manually checked, confirming that Glycoforest annotated an additional 50 MS/MS spectra overlooked during manual annotation.


Subject(s)
Glycomics/methods , Polysaccharides/chemistry , Software , Algorithms , Carbohydrate Sequence , Chromatography, High Pressure Liquid , Tandem Mass Spectrometry
20.
Methods Mol Biol ; 1558: 139-158, 2017.
Article in English | MEDLINE | ID: mdl-28150237

ABSTRACT

UniCarbKB (http://unicarbkb.org) is a comprehensive resource for mammalian glycoprotein and annotation data. In particular, the database provides information on the oligosaccharides characterized from a glycoprotein at either the global or site-specific level. This evidence is accumulated from a peer-reviewed and manually curated collection of information on oligosaccharides derived from membrane and secreted glycoproteins purified from biological fluids and/or tissues. This information is further supplemented with experimental method descriptions that summarize important sample preparation and analytical strategies. A new release of UniCarbKB is published every three months; each includes a collection of curated data and improvements to database functionality. In this Chapter, we outline the objectives of UniCarbKB, and describe a selection of step-by-step workflows for navigating the information available. We also provide a short description of web services available and future plans for improving data access. The information presented in this Chapter supplements content available in our knowledgebase including regular updates on interface improvements, new features, and revisions to the database content (http://confluence.unicarbkb.org).


Subject(s)
Computational Biology/methods , Databases, Protein , Glycomics/methods , Glycoproteins , Proteome , Proteomics/methods , Animals , Humans , Search Engine , Software , User-Computer Interface , Web Browser