Results 1 - 20 of 24
1.
Methods Mol Biol ; 2670: 303-318, 2023.
Article in English | MEDLINE | ID: mdl-37184712

ABSTRACT

In this chapter, we present Norine (https://norine.univ-lille.fr/norine), the unique resource dedicated to nonribosomal peptides. First, the content of the knowledgebase and the related tools are described. Then, a case study shows how to query Norine by annotations or structure and how to interpret the results.


Subject(s)
Computational Biology , Peptides , Peptides/chemistry , Knowledge Bases , Peptide Synthases
2.
J Comput Aided Mol Des ; 36(1): 77-85, 2022 01.
Article in English | MEDLINE | ID: mdl-35059941

ABSTRACT

Nowadays, activity prediction is key to understanding the mechanism of action of active structures discovered through phenotypic screening or found in natural products. Machine learning is currently one of the most important and rapidly evolving topics in computer-aided drug discovery for identifying and designing new drugs with superior biological activities. The performance of a predictive machine learning model can be enhanced through the optimal selection of learning data, algorithm, algorithm parameters, and ensemble methods. In this article, we focus on how to enhance the prediction model using the learning data. However, obtaining additional accurate data is difficult and, in many cases, not possible. This motivated us to propose the turbo prediction model, in which nearest neighbour structures are used to increase prediction accuracy. Five datasets, well known in the literature, were used in this article, and experimental results show that turbo prediction can improve the prediction quality of conventional prediction models, particularly for heterogeneous datasets, without any additional effort on the part of the user carrying out the prediction process, and at minimal computational cost.


Subject(s)
Drug Discovery , Machine Learning , Algorithms , Cluster Analysis , Drug Discovery/methods
3.
J Comput Aided Mol Des ; 35(5): 657-665, 2021 05.
Article in English | MEDLINE | ID: mdl-33797669

ABSTRACT

Line notations of chemical structures are more compact than graphs and connection tables, so they are useful for storing and transferring large numbers of molecular structures. The simplified molecular-input line-entry system (SMILES) representation is the most extensively used, as it is much easier to use and comprehend than others, and it can be generated automatically from connection tables. A SMILES string encodes the structure of a molecule. It has been used by an existing method, LINGO, to calculate molecular similarities and predict structure-related properties. The LINGO method decomposes a canonical SMILES into a set of four-character substrings referred to as LINGOs. The purpose of the LINGO method is to measure the similarity between a pair of molecules by comparing the LINGOs that occur in each molecule. This paper introduces an alternative version of the LINGO method that uses LINGOs of different lengths, called LINGO-DL. LINGO-DL is based on the fragmentation of canonical SMILES into substrings of three different lengths rather than the single length used in the LINGO method. Retrospective virtual screening experiments with the MDDR, DUD, and MUV datasets show that LINGO-DL outperforms the LINGO method, especially when the active molecules being sought have a high degree of structural heterogeneity.
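
The decomposition described in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not stated in the abstract: the similarity is computed here as a multiset Tanimoto coefficient (the published LINGO similarity formula may differ), no SMILES preprocessing is applied, and the three lengths used for the LINGO-DL variant, `(3, 4, 5)`, are placeholders chosen for illustration only.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Return the multiset of overlapping q-character substrings of a SMILES string."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def tanimoto(a: Counter, b: Counter) -> float:
    """Multiset Tanimoto: sum of minimum counts over sum of maximum counts."""
    shared = sum((a & b).values())
    total = sum((a | b).values())
    return shared / total if total else 0.0


def lingo_similarity(s1: str, s2: str, lengths=(4,)) -> float:
    """Pool the substrings of every length in `lengths`, then compare the pools.

    lengths=(4,) mimics the single-length LINGO idea; passing three lengths,
    e.g. (3, 4, 5), mimics the LINGO-DL idea (the actual lengths are an assumption).
    """
    pool1, pool2 = Counter(), Counter()
    for q in lengths:
        pool1 += lingos(s1, q)
        pool2 += lingos(s2, q)
    return tanimoto(pool1, pool2)
```

Calling `lingo_similarity("CCCCO", "CCCCN")` compares four-character substrings only, while `lingo_similarity("CCCCO", "CCCCN", lengths=(3, 4, 5))` pools substrings of three lengths before comparing.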


Subject(s)
Algorithms , Drug Discovery/methods , Pharmaceutical Preparations/chemistry , Small Molecule Libraries/chemistry , Humans , Pharmacology , Small Molecule Libraries/pharmacology
4.
Anal Chem ; 92(24): 15862-15871, 2020 12 15.
Article in English | MEDLINE | ID: mdl-33226770

ABSTRACT

The various bioactivity types and potencies of peptidic natural products (PNPs) are of high interest for the development of new drugs. In particular, the intrinsic antibiotic properties of PNPs appear essential to combat antimicrobial resistance that is currently threatening the world. The first steps in dereplication and characterization of PNPs often involve tandem mass spectrometry (MS/MS). However, such structurally complex peptides challenge the interpretation of MS/MS results. Only a few software solutions are dedicated to PNP analysis but with a mutually exclusive focus on dereplication or annotation. Hence, key functionalities such as automatic peak annotation or statistically validated scoring systems to support the characterization/identification processes are missing. Here, we present NRPro, a new MS/MS analysis platform that overcomes some limitations of the existing software and provides a comprehensive toolset for both automatic annotation and dereplication of PNPs.


Subject(s)
Automation , Biological Products/analysis , Peptides/analysis , Molecular Structure , Particle Size , Software , Surface Properties , Tandem Mass Spectrometry
5.
J Comput Aided Mol Des ; 34(11): 1147-1156, 2020 11.
Article in English | MEDLINE | ID: mdl-32812076

ABSTRACT

A fingerprint based on the monomer composition (MCFP) of nonribosomal peptides (NRPs) was introduced previously. MCFP is a method for obtaining a representative description of NRP structures from their monomer composition in fingerprint form. It enabled effective screening and prediction of biological activities on the Norine NRP database. In this paper, we present an extension of the MCFP fingerprint. This extension adds a few columns to the fingerprint, representing monomer clusters, 2D structures, peptide categories, and peptide diversity. All these data are extracted from the NRP structure. Experiments with the Norine NRP database showed that the extended MCFP, called the Monomer Structure FingerPrint (MSFP), produced high prediction accuracy (> 95%) together with a high recall rate (86%) when used for prediction and similarity searching. This study indicates that a fingerprint built mainly from monomer composition can be substantially improved by adding columns that carry useful information about the monomer composition and 2D structure of NRPs.


Subject(s)
Peptide Mapping , Peptides/chemistry , Amino Acid Sequence , Databases, Protein , Models, Molecular , Protein Conformation , Structure-Activity Relationship
6.
Bioinformatics ; 36(15): 4345-4347, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32415965

ABSTRACT

SUMMARY: To support small and large-scale genome mining projects, we present Post-processing Analysis tooLbox for ANTIsmash Reports (Palantir), a dedicated software suite for handling and refining secondary metabolite biosynthetic gene cluster (BGC) data annotated with the popular antiSMASH pipeline. Palantir provides new functionalities building on NRPS/PKS predictions from antiSMASH, such as improved BGC annotation, module delineation and easy access to sub-sequences at different levels (cluster, gene, module and domain). Moreover, it can parse user-provided antiSMASH reports and reformat them for direct use or storage in a relational database. AVAILABILITY AND IMPLEMENTATION: Palantir is released both as a Perl API available on CPAN (https://metacpan.org/release/Bio-Palantir) and as a web application (http://palantir.uliege.be). As a practical use case, the web interface also features a database built from the mining of 1616 cyanobacterial genomes, of which 1488 were predicted to encode at least one BGC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Biosynthetic Pathways , Software , Bacteria/genetics , Molecular Sequence Annotation , Multigene Family
7.
Nucleic Acids Res ; 48(D1): D465-D469, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691799

ABSTRACT

Norine, the unique resource dedicated to nonribosomal peptides (NRPs), is now updated with a new pipeline to automate massive sourcing and enhance annotation. External databases are mined to extract NRPs that are not yet in Norine. To maintain high data quality, successive filters are applied to automatically validate the NRP annotations, and only validated data are inserted into the database. External databases were also used to complete annotations of NRPs already in Norine. In addition, consistency checks within Norine and between Norine and external sources revealed annotation errors. Some can be corrected automatically, while others need manual curation. This new approach led to the insertion of 539 new NRPs and the addition or correction of annotations of nearly all Norine entries. Two new tools, one to analyse the chemical structures of NRPs (rBAN) and one to infer a molecular formula from the mass-to-charge ratio of an NRP (Kendrick Formula Predictor), were also integrated. Norine is freely accessible from the following URL: https://bioinfo.cristal.univ-lille.fr/norine/.


Subject(s)
Databases, Protein , Peptide Biosynthesis, Nucleic Acid-Independent , Software , Bacterial Proteins/biosynthesis , Bacterial Proteins/chemistry , Fungal Proteins/biosynthesis , Fungal Proteins/chemistry
8.
J Am Soc Mass Spectrom ; 30(12): 2608-2616, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31659720

ABSTRACT

The identification of known (dereplication) or unknown nonribosomal peptides (NRPs) produced by microorganisms is a time-consuming, expensive, and challenging task in which mass spectrometry and nuclear magnetic resonance play a key role. The first step of the identification process always involves the establishment of a molecular formula. Unfortunately, the number of potential molecular formulae increases significantly with higher molecular mass and lower measurement precision. In the present article, we demonstrate that molecular formula assignment can be achieved by a combined approach using the regular Kendrick mass defect (RKMD) and NORINE, the reference curated database of NRPs. We observed that, irrespective of the molecular formula, the addition or subtraction of a given atom or atom group always leads to the same variation in RKMD and nominal Kendrick mass (NKM). Graphically, these variations, translated into a vector mesh, can be used to connect an unknown molecule to a known NRP of the NORINE database and establish its molecular formula. We explain and illustrate this concept through the high-resolution mass spectrometry analysis of a commercially available mixture composed of four surfactins. The Kendrick approach enriched with the NORINE database content is a fast, useful, and easy-to-use tool for molecular mass assignment of known and unknown NRP structures.
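
The observation that a given atom group always produces the same shift can be checked numerically with a short sketch. It uses the classic CH2-based Kendrick definitions (Kendrick mass = exact mass × 14 / 14.01565, nominal Kendrick mass = rounded Kendrick mass, defect = nominal minus Kendrick mass), which is an assumption: the article's exact RKMD definition is not given in the abstract. The two starting formulas are arbitrary illustrative compositions, not NORINE entries. In general the rounding step can move the defect shift by a whole unit; the examples are chosen so that this does not happen.

```python
# Monoisotopic masses (u) of the most abundant isotopes, rounded to 8 decimals.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
CH2_MASS = MONOISOTOPIC["C"] + 2 * MONOISOTOPIC["H"]


def exact_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[el] * n for el, n in formula.items())


def kendrick(mass):
    """Return (nominal Kendrick mass, Kendrick mass defect) on the CH2 scale."""
    km = mass * 14.0 / CH2_MASS
    nkm = round(km)
    return nkm, nkm - km


def add_group(formula, group):
    """Return a new formula with the atoms of `group` added."""
    out = dict(formula)
    for el, n in group.items():
        out[el] = out.get(el, 0) + n
    return out


def shift(formula, group):
    """Change in (NKM, KMD) caused by adding `group` to `formula`."""
    nkm0, kmd0 = kendrick(exact_mass(formula))
    nkm1, kmd1 = kendrick(exact_mass(add_group(formula, group)))
    return nkm1 - nkm0, kmd1 - kmd0


# Two unrelated, purely illustrative compositions.
small = {"C": 10, "H": 20, "O": 2}
large = {"C": 20, "H": 38, "N": 2, "O": 4}

ch2_shift = shift(small, {"C": 1, "H": 2})   # CH2 moves NKM by 14, defect unchanged
o_shift_small = shift(small, {"O": 1})       # adding O to the small formula
o_shift_large = shift(large, {"O": 1})       # adding O to the large formula
```

Because the shift depends only on the added group, each group corresponds to a fixed vector in the (NKM, defect) plane, which is the property the abstract describes as a vector mesh.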


Subject(s)
Peptides/chemistry , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Databases, Protein , Lipopeptides/chemistry , Molecular Weight , Peptide Biosynthesis, Nucleic Acid-Independent , Peptides, Cyclic/chemistry , Protons , Software
9.
J Cheminform ; 11(1): 13, 2019 Feb 08.
Article in English | MEDLINE | ID: mdl-30737579

ABSTRACT

Proteinogenic and non-proteinogenic amino acids, fatty acids, and glycans are some of the main building blocks of nonribosomal peptides (NRPs) and as such may give insight into the origin, biosynthesis and bioactivities of the peptides they constitute. Hence, the structural representation of NRPs using monomers provides a biologically interesting skeleton of these secondary metabolites. Databases dedicated to NRPs, such as Norine, already integrate monomer-based annotations in order to facilitate the development of structural analysis tools. In this paper, we present rBAN (retro-biosynthetic analysis of nonribosomal peptides), a new computational tool designed to predict the monomeric graph of NRPs from their atomic structure in SMILES format. This prediction is achieved through the "in silico" fragmentation of a chemical structure and the matching of the resulting fragments against the monomers of Norine for identification. Structures containing monomers not yet recorded in Norine are processed in a "discovery mode" that uses the RESTful service from PubChem to search the unidentified substructures and suggest new monomers. rBAN was integrated in a pipeline for the curation of Norine data, in which it was used to check the correspondence between the monomeric graphs annotated in Norine and SMILES-predicted graphs. The process concluded with the validation of 97.26% of the records in Norine, a two-fold extension of its SMILES data and the introduction of 11 new monomers suggested in the discovery mode. The accuracy, robustness and high performance of rBAN were demonstrated by benchmarking it against other tools with the same functionality: Smiles2Monomers and GRAPE.

10.
Environ Sci Pollut Res Int ; 25(30): 29794-29807, 2018 Oct.
Article in English | MEDLINE | ID: mdl-28547376

ABSTRACT

Bacteria belonging to the genus Burkholderia live in various ecological niches and play a significant role in the environment through the excretion of a wide variety of secondary metabolites, including modular nonribosomal peptides (NRPs) and polyketides (PKs). These metabolites represent a widely distributed class of natural products important for biomedicine and biocontrol, including antibiotics, siderophores, and anticancer agents, as well as biopesticides, which are considered a novel resource for defending an ecological niche from competitors and promoting plant growth. The aim of this review is to present all NRPs produced or potentially produced by strains of Burkholderia, as NRPs represent a major source of active compounds implicated in biocontrol. The review compiles results from a large screening that we performed on 48 completely sequenced genomes available at NCBI to identify NRPS gene clusters, together with data found in the literature, mainly because some interesting compounds are produced by strains not yet sequenced. In addition to NRPs, hybrid NRPs/PKs are also included. Specific features of the biosynthetic gene clusters and the structures of the modular enzymes responsible for the synthesis, the biological activities, and the potential uses of NRPs and hybrid NRPs/PKs in agriculture and pharmaceuticals are also discussed.


Subject(s)
Biological Control Agents/pharmacology , Burkholderia/metabolism , Peptides/pharmacology , Polyketides/pharmacology , Burkholderia/genetics , Genome, Bacterial , Humans
11.
Microbiologyopen ; 5(3): 512-26, 2016 06.
Article in English | MEDLINE | ID: mdl-27060604

ABSTRACT

Burkholderia is an important genus encompassing a variety of species, including pathogenic strains as well as strains that promote plant growth. We have carried out a global strategy, which combined two complementary approaches. The first one is genome guided with deep analysis of genome sequences and the second one is assay guided with experiments to support the predictions obtained in silico. This efficient screening for new secondary metabolites, performed on 48 gapless genomes of Burkholderia species, revealed a total of 161 clusters containing nonribosomal peptide synthetases (NRPSs), with the potential to synthesize at least 11 novel products. Most of them are siderophores or lipopeptides, two classes of products with potential application in biocontrol. The strategy led to the identification, for the first time, of the cluster for cepaciachelin biosynthesis in the genome of Burkholderia ambifaria AMMD, and a cluster corresponding to a new malleobactin-like siderophore, called phymabactin, was identified in the Burkholderia phymatum STM815 genome. In both cases, the siderophore was produced when the strain was grown in iron-limited conditions. Elsewhere, the cluster for the antifungal burkholdin was detected in the genome of B. ambifaria AMMD and also Burkholderia sp. KJ006. Burkholderia pseudomallei strains harbor the genetic potential to produce a novel lipopeptide called burkhomycin, containing a peptidyl moiety of 12 monomers. A mixture of lipopeptides produced by Burkholderia rhizoxinica lowered the surface tension of the supernatant from 70 to 27 mN·m⁻¹. The production of nonribosomal secondary metabolites seems related to the three phylogenetic groups obtained from 16S rRNA sequences. Moreover, the genome-mining approach gave new insights into the nonribosomal synthesis exemplified by the identification of dual C/E domains in lipopeptide NRPSs, up to now essentially found in Pseudomonas strains.


Subject(s)
Burkholderia pseudomallei/genetics , Burkholderia pseudomallei/metabolism , Genome, Bacterial/genetics , Lipopeptides/biosynthesis , Peptide Synthases/metabolism , Siderophores/biosynthesis , Antifungal Agents/metabolism , Bacterial Proteins/biosynthesis , Base Sequence , DNA, Bacterial/genetics , Gene Expression Profiling , Lipopeptides/chemistry , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA , Siderophores/chemistry
12.
Methods Mol Biol ; 1401: 209-32, 2016.
Article in English | MEDLINE | ID: mdl-26831711

ABSTRACT

This chapter guides the use of bioinformatics tools relevant to the discovery of new nonribosomal peptides (NRPs) produced by microorganisms. The strategy described can be applied to draft or fully assembled genome sequences. It relies on the identification of the synthetase genes and the deciphering of the domain architecture of the nonribosomal peptide synthetases (NRPSs). In the next step, candidate peptides synthesized by these NRPSs are predicted in silico, considering the specificity of incorporated monomers together with their isomerism. To assess their novelty, the two-dimensional structure of the peptides can be compared with the structural patterns of all known NRPs. The presented workflow leads to an efficient and rapid screening of genomic data generated by high-throughput technologies. The exploration of such sequenced genomes may lead to the discovery of new drugs (e.g., antibiotics against multi-resistant pathogens or antitumor agents).


Subject(s)
Bacteria/genetics , Bacterial Proteins/genetics , Computational Biology/methods , Peptide Biosynthesis, Nucleic Acid-Independent , Peptide Synthases/genetics , Peptides/genetics , Bacteria/chemistry , Bacteria/metabolism , Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Genome, Bacterial , Multigene Family , Peptide Synthases/chemistry , Peptide Synthases/metabolism , Peptides/chemistry , Peptides/metabolism , Software
13.
Nucleic Acids Res ; 44(D1): D1113-8, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26527733

ABSTRACT

Since its creation in 2006, Norine has remained the only knowledgebase dedicated to non-ribosomal peptides (NRPs). These secondary metabolites, produced by bacteria and fungi, harbor diverse interesting biological activities (such as antibiotic, antitumor, siderophore or surfactant) directly related to the diversity of their structures. The Norine team's goal is to collect NRPs and provide tools to analyze them efficiently. We have developed a user-friendly interface and dedicated tools to provide a complete bioinformatics platform. The knowledgebase gathers abundant and valuable annotations on more than 1100 NRPs. To increase the quantity of described NRPs and improve the quality of associated annotations, we are now opening Norine to crowdsourcing. We believe that contributors from the scientific community are the best experts to annotate the NRPs they work on. We have developed MyNorine to facilitate the submission of new NRPs or modifications of stored ones. This article presents MyNorine and other novelties of the Norine interface released since the first publication. Norine is freely accessible from the following URL: http://bioinfo.lifl.fr/NRP.


Subject(s)
Databases, Chemical , Peptides/chemistry , Peptides/pharmacology , Internet , Knowledge Bases , Molecular Sequence Annotation , Peptides/metabolism
14.
J Cheminform ; 7: 62, 2015.
Article in English | MEDLINE | ID: mdl-26715946

ABSTRACT

BACKGROUND: The monomeric composition of polymers is a powerful descriptor for structure comparison and synthetic biology, among other applications. Many databases give access to the atomic structure of compounds, but the monomeric structure of polymers is often lacking. We have designed an algorithm, implemented in the tool Smiles2Monomers (s2m), to infer efficiently and accurately the monomeric structure of a polymer from its chemical structure. RESULTS: Our strategy is divided into two steps: first, monomers are mapped onto the atomic structure by an efficient subgraph-isomorphism algorithm; second, the best tiling is computed so that non-overlapping monomers cover the whole structure of the target polymer. The mapping is based on a Markovian index built by a dynamic programming algorithm. The index enables s2m to quickly search for all the given monomers in a target polymer. Next, a greedy algorithm combines the mapped monomers into a consistent monomeric structure. Finally, a local branch-and-cut algorithm refines the structure. We tested this method on two manually annotated databases of polymers and reconstructed the structures de novo with a sensitivity over 90%. The average computation time per polymer is 2 s. CONCLUSION: s2m automatically creates de novo monomeric annotations for polymers, efficiently in terms of computation time and sensitivity. s2m allowed us to detect annotation errors in the tested databases and to easily find the accurate structures. Thus, s2m could be integrated into the curation process of databases of small compounds to verify current entries and accelerate the annotation of new polymers. The full method can be downloaded or accessed via a website for peptide-like polymers at http://bioinfo.lifl.fr/norine/smiles2monomers.jsp.
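
The tiling step can be pictured with a small sketch of a greedy selection of non-overlapping mappings. This is not the s2m implementation: the abstract does not say which greedy criterion is used, so the "largest mapping first" rule, the `greedy_tiling` name, and the toy atom indices below are all assumptions, and the subgraph-isomorphism mapping and the branch-and-cut refinement are left out entirely.

```python
from typing import FrozenSet, List, Tuple

# A candidate mapping: (monomer name, indices of the polymer atoms it covers).
Mapping = Tuple[str, FrozenSet[int]]


def greedy_tiling(mappings: List[Mapping], n_atoms: int):
    """Pick non-overlapping mappings, largest first, and report atom coverage.

    Returns (chosen mappings, fraction of polymer atoms covered).
    """
    chosen: List[Mapping] = []
    covered = set()
    # Largest mappings first; name and atom list break ties deterministically.
    ordered = sorted(mappings, key=lambda m: (-len(m[1]), m[0], sorted(m[1])))
    for name, atoms in ordered:
        if covered.isdisjoint(atoms):  # keep the tiling free of overlaps
            chosen.append((name, atoms))
            covered |= atoms
    coverage = len(covered) / n_atoms if n_atoms else 0.0
    return chosen, coverage


# Toy example: a 7-atom polymer with four candidate mappings (hypothetical data).
candidates = [
    ("Ala", frozenset({0, 1, 2})),
    ("Gly", frozenset({2, 3})),
    ("Ser", frozenset({3, 4, 5, 6})),
    ("X", frozenset({5, 6})),
]
tiling, coverage = greedy_tiling(candidates, n_atoms=7)
```

A greedy pass like this can leave atoms uncovered when a large mapping blocks a better combination of smaller ones, which is one reason a refinement stage such as the one mentioned in the abstract is useful.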

15.
PLoS One ; 9(1): e85667, 2014.
Article in English | MEDLINE | ID: mdl-24465643

ABSTRACT

Nonribosomal peptides represent a large variety of natural active compounds produced by microorganisms. Due to their specific biosynthesis pathway through large assembly lines called NonRibosomal Peptide Synthetases (NRPSs), they often display complex structures with cycles and branches. Moreover, they often contain non-proteinogenic or modified monomers, such as the D-monomers produced by epimerization. We investigate here some sequence specificities of the condensation (C) and epimerization (E) domains of NRPSs that can be used to predict the possible isomeric state (D or L) of each monomer in a putative peptide. We show that C- and E-domains can be divided into two sub-regions called Up-Seq and Down-Seq. The Up-Seq region corresponds to an InterPro domain (IPR001242) and is shared by C- and E-domains. The Down-Seq region is specific to the enzymatic activity of the domain. Amino-acid signatures (represented as sequence logos) previously described for complete C- and E-domains have been restricted to the Down-Seq region and strengthened with additional sequences. Moreover, a new Down-Seq signature has been found for Ct-domains, which are found in fungi and are responsible for terminal cyclization of the peptides. The identification of these signatures has been included in a workflow named Florine, which aims to predict nonribosomal peptides from NRPS sequence analyses. In some cases, the prediction of isomerism is guided by genus-specific rules. Florine was used on a Pseudomonas genome to determine the type of pyoverdin produced, update the syringafactin structure, and identify novel putative products.


Subject(s)
Bacterial Proteins/chemistry , DNA, Bacterial/chemistry , Peptide Synthases/chemistry , Peptides/chemistry , Pseudomonas/chemistry , Software , Amino Acid Sequence , Bacterial Proteins/genetics , DNA, Bacterial/genetics , Molecular Sequence Annotation , Molecular Sequence Data , Oligopeptides/chemistry , Oligopeptides/genetics , Peptide Biosynthesis, Nucleic Acid-Independent/genetics , Peptide Synthases/genetics , Peptides/genetics , Protein Multimerization , Protein Structure, Tertiary , Pseudomonas/genetics
16.
J Chem Inf Model ; 54(1): 30-6, 2014 Jan 27.
Article in English | MEDLINE | ID: mdl-24392938

ABSTRACT

Natural products and synthetic compounds are a valuable source of new small molecules leading to novel drugs to cure diseases. However, identifying new biologically active small molecules is still a challenge. In this paper, we introduce a new activity prediction approach using a Bayesian belief network for classification (BBNC). The roots of the network are the fragments composing a compound. The leaves are, on one side, the activities to predict and, on the other side, the unknown compound. The activities are represented by sets of known compounds, and sets of inactive compounds are also used. We calculated a similarity between an unknown compound and each activity class. The most similar activity is assigned to the unknown compound. We applied this new approach to eight well-known data sets extracted from the literature and compared its performance to three classical machine learning algorithms. Experiments showed that BBNC provides interesting prediction rates (from 79% accuracy for highly diverse data sets to 99% for low-diversity ones) with a short computation time. Experiments also showed that BBNC is particularly effective for homogeneous data sets but performs less well with structurally heterogeneous sets. However, we believe that using several approaches for activity prediction whenever possible can often give a broader understanding of the data than using a single approach. Thus, BBNC is a useful addition to the computational chemist's toolbox.


Subject(s)
Bayes Theorem , Drug Discovery/statistics & numerical data , Algorithms , Artificial Intelligence , Computational Biology , Databases, Chemical , Databases, Pharmaceutical , Drug Evaluation, Preclinical , Models, Chemical , Quantitative Structure-Activity Relationship , User-Computer Interface
17.
J Comput Aided Mol Des ; 26(10): 1187-94, 2012 Oct.
Article in English | MEDLINE | ID: mdl-23053735

ABSTRACT

Bacteria and fungi use a set of enzymes called nonribosomal peptide synthetases to produce a wide range of natural peptides displaying structural and biological diversity. Nonribosomal peptides (NRPs) are thus the basis for some effective drugs. While discovering new NRPs is very desirable, identifying their biological activity so that they can be used as drugs is a challenge. In this paper, we present a novel peptide fingerprint based on the monomer composition (MCFP) of NRPs. MCFP is a method for obtaining a representative description of NRP structures from their monomer composition in fingerprint form. Experiments with the Norine NRP database and MCFP show high prediction accuracy (>93%). A high recall rate (>82%) is also obtained when MCFP is used for screening the NRP database. This study indicates that our fingerprint, built from monomer composition, allows effective screening and prediction of the biological activities of the NRPs in the database.
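
The idea of a composition-based fingerprint can be sketched as follows. The encoding shown (one count per monomer of a fixed vocabulary, compared with the Tanimoto coefficient for count vectors) is an assumption made for illustration; the abstract does not specify how MCFP encodes or compares peptides. The peptide names and monomer lists are toy data, not Norine records.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def fingerprint(monomers: Sequence[str], vocabulary: Sequence[str]) -> List[int]:
    """Count how often each vocabulary monomer occurs in a peptide."""
    counts = Counter(monomers)
    return [counts.get(m, 0) for m in vocabulary]


def tanimoto(x: Sequence[int], y: Sequence[int]) -> float:
    """Tanimoto coefficient for count vectors: x.y / (x.x + y.y - x.y)."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    return dot / denom if denom else 0.0


def rank(query: Sequence[str], database: Dict[str, Sequence[str]],
         vocabulary: Sequence[str]) -> List[Tuple[str, float]]:
    """Rank database peptides by decreasing similarity to the query."""
    q = fingerprint(query, vocabulary)
    scores = [(name, tanimoto(q, fingerprint(mons, vocabulary)))
              for name, mons in database.items()]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Toy vocabulary and peptides (hypothetical, for illustration only).
VOCAB = ["Ala", "Asp", "Glu", "Gly", "Leu", "Val"]
TOY_DB = {
    "pepA": ["Glu", "Leu", "Leu", "Val", "Asp", "Leu", "Leu"],
    "pepB": ["Ala", "Gly", "Gly", "Ala"],
}
ranking = rank(["Glu", "Leu", "Leu", "Val", "Asp", "Leu", "Val"], TOY_DB, VOCAB)
```

In a screening setting, the activity annotated on the top-ranked database peptides would then serve as the prediction for the query, which is one simple way such a fingerprint can support activity prediction.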


Subject(s)
Bacteria/enzymology , Drug Discovery/methods , Fungi/enzymology , Peptide Synthases/metabolism , Peptides/chemistry , Peptides/pharmacology , Databases, Pharmaceutical , Databases, Protein , Peptides/metabolism
18.
Appl Microbiol Biotechnol ; 95(3): 593-600, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22678024

ABSTRACT

A new family of lipopeptides produced by Bacillus thuringiensis, the kurstakins, was discovered in 2000 and considered a biomarker of this species. Kurstakins are lipoheptapeptides displaying antifungal activity against Stachybotrys chartarum. Recently, the biosynthesis mechanism, the regulation of this biosynthesis and the potential new properties of kurstakins were described in the literature. In addition, kurstakins were also detected in other species belonging to the Bacillus genus, such as Bacillus cereus. This mini-review gathers all the information about these promising bioactive molecules.


Subject(s)
Bacillus thuringiensis/metabolism , Lipopeptides/biosynthesis , Lipopeptides/chemistry , Antifungal Agents/chemistry , Bacillus cereus/metabolism , Molecular Structure , Peptide Biosynthesis, Nucleic Acid-Independent , Protein Conformation
19.
J Bacteriol ; 192(19): 5143-50, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20693331

ABSTRACT

Nonribosomal peptides (NRPs) are molecules produced by microorganisms that have a broad spectrum of biological activities and pharmaceutical applications (e.g., antibiotic, immunomodulating, and antitumor activities). One particularity of the NRPs is the biodiversity of their monomers, extending far beyond the 20 proteinogenic amino acid residues. Norine, a comprehensive database of NRPs, allowed us to review for the first time the main characteristics of the NRPs and especially their monomer biodiversity. Our analysis highlighted a significant similarity relationship between NRPs synthesized by bacteria and those isolated from metazoa, especially from sponges, supporting the hypothesis that some NRPs isolated from sponges are actually synthesized by symbiotic bacteria rather than by the sponges themselves. A comparison of peptide monomeric compositions as a function of biological activity showed that some monomers are specific to a class of activities. An analysis of the monomer compositions of peptide products predicted from genomic information (metagenomics and high-throughput genome sequencing) or of new peptides detected by mass spectrometry analysis applied to a culture supernatant can provide indications of the origin of a peptide and/or its biological activity.


Subject(s)
Peptides/chemistry , Databases, Factual , Models, Theoretical , Peptide Synthases/metabolism , Peptides/metabolism
20.
BMC Struct Biol ; 9: 15, 2009 Mar 18.
Article in English | MEDLINE | ID: mdl-19296847

ABSTRACT

BACKGROUND: Nonribosomal peptides (NRPs), bioactive secondary metabolites produced by many microorganisms, show a broad range of important biological activities (e.g. antibiotics, immunosuppressants, antitumor agents). NRPs are mainly composed of amino acids but their primary structure is not always linear and can contain cycles or branchings. Furthermore, there are several hundred different monomers that can be incorporated into NRPs. The NORINE database, the first resource entirely dedicated to NRPs, currently stores more than 700 NRPs annotated with their monomeric peptide structure encoded by undirected labeled graphs. This opens a way to a systematic analysis of structural patterns occurring in NRPs. Such studies can investigate the functional role of some monomeric chains, or analyse NRPs that have been computationally predicted from the synthetase protein sequence. A basic operation in such analyses is the search for a given structural pattern in the database. RESULTS: We developed an efficient method that allows for a quick search for a structural pattern in the NORINE database. The method identifies all peptides containing a pattern substructure of a given size. This amounts to solving a variant of the maximum common subgraph problem on pattern and peptide graphs, which is done by computing cliques in an appropriate compatibility graph. CONCLUSION: The method has been incorporated into the NORINE database, available at http://bioinfo.lifl.fr/norine. Less than one second is needed to search for a pattern in the entire database.
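
The clique-based formulation can be sketched in a few lines. This is an illustration of the general technique (a compatibility graph, also called a modular product, followed by a maximum-clique search), not the NORINE implementation: the function names, the brute-force search with simple pruning, and the toy peptide graphs are assumptions. The sketch matches induced subgraphs and does not require the common substructure to be connected.

```python
from itertools import combinations


def compatibility_graph(p_labels, p_edges, g_labels, g_edges):
    """Build the compatibility graph of a pattern graph and a peptide graph.

    Vertices are pairs (pattern node, peptide node) with equal monomer labels.
    Two pairs are linked when they use distinct nodes on both sides and the
    pattern nodes are bonded exactly when the peptide nodes are bonded.
    """
    p_adj = {frozenset(e) for e in p_edges}
    g_adj = {frozenset(e) for e in g_edges}
    nodes = [(i, j) for i, a in enumerate(p_labels)
             for j, b in enumerate(g_labels) if a == b]
    adj = {v: set() for v in nodes}
    for (i1, j1), (i2, j2) in combinations(nodes, 2):
        if i1 != i2 and j1 != j2:
            if (frozenset((i1, i2)) in p_adj) == (frozenset((j1, j2)) in g_adj):
                adj[(i1, j1)].add((i2, j2))
                adj[(i2, j2)].add((i1, j1))
    return adj


def max_clique_size(adj):
    """Size of a largest clique, found by exhaustive search with pruning."""
    best = 0

    def expand(size, candidates):
        nonlocal best
        best = max(best, size)
        if size + len(candidates) <= best:  # cannot beat the current best
            return
        for v in list(candidates):
            candidates.remove(v)  # each clique is explored only once
            expand(size + 1, candidates & adj[v])

    expand(0, set(adj))
    return best


def shares_substructure(pattern, peptide, size):
    """True if pattern and peptide share a common substructure of `size` monomers."""
    adj = compatibility_graph(pattern[0], pattern[1], peptide[0], peptide[1])
    return max_clique_size(adj) >= size


# Toy graphs: (monomer labels, bonds between node indices). Hypothetical data.
PATTERN = (["Leu", "Val", "Asp"], [(0, 1), (1, 2)])
CYCLIC = (["Glu", "Leu", "Val", "Asp", "Leu"],
          [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)])
LINEAR = (["Leu", "Asp", "Val"], [(0, 1), (1, 2)])
```

With these toy graphs, `shares_substructure(PATTERN, CYCLIC, 3)` finds the Leu-Val-Asp chain inside the cyclic peptide, whereas `LINEAR` shares at most two monomers with the pattern because its bonds connect the monomers in a different order.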


Subject(s)
Databases, Protein , Peptide Biosynthesis, Nucleic Acid-Independent , Peptides/chemistry , Internet , Protein Conformation , User-Computer Interface