1.
Bioinformatics ; 38(18): 4452-4453, 2022 09 15.
Article in English | MEDLINE | ID: mdl-35920772

ABSTRACT

SUMMARY: Newly discovered functional relationships of (bio-)molecules are a key component in molecular biology and life science research. Especially in the drug discovery field, knowledge of how small molecules associate with proteins plays a fundamental role in understanding how drugs or metabolites can affect cells, tissues and human metabolism. Finding relevant information about these relationships among the huge number of published articles is becoming increasingly challenging and time-consuming. On average, more than 25 000 new (bio-)medical articles are added to the literature database PubMed weekly. In this article, we present a new web server [compound-protein relationships in literature (CPRiL)] that provides information on functional relationships between small molecules and proteins in literature. Currently, CPRiL contains ∼465 000 unique names and synonyms of small molecules, ∼100 000 unique proteins and more than 9 million described functional relationships between these entities. The applied BioBERT machine learning model for the determination of functional relationships between small molecules and proteins in texts was extensively trained and tested. On a related benchmark, CPRiL yielded a high performance, with an F1 score of 84.3%, precision of 82.9% and recall of 85.7%. AVAILABILITY AND IMPLEMENTATION: CPRiL is freely available at https://www.pharmbioinf.uni-freiburg.de/cpril. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Proteins , Software , Humans , PubMed , Publications , Databases, Factual
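
The three benchmark figures quoted in the abstract above are mutually consistent, which can be checked with the standard definition of the F1 score as the harmonic mean of precision and recall. The following is a minimal sketch; the function name `f1_score` is a hypothetical helper chosen for illustration and is not part of CPRiL.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        # Convention: F1 is 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (82.9% and 85.7%).
precision = 0.829
recall = 0.857

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.1%}")  # prints "F1 = 84.3%"
```

The computed value rounds to the 84.3% reported in the abstract.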
2.
Nucleic Acids Res ; 50(D1): D445-D450, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34581813

ABSTRACT

In recent years, the drug discovery paradigm has shifted toward compounds that covalently modify disease-associated target proteins, because they tend to possess high potency, selectivity, and duration of action. The rational design of novel targeted covalent inhibitors (TCIs) typically starts from resolved macromolecular structures of target proteins in their apo or holo forms. However, the existing TCI databases contain only a paucity of covalent protein-ligand (cP-L) complexes. Herein, we report CovPDB, the first database solely dedicated to high-resolution cocrystal structures of biologically relevant cP-L complexes, curated from the Protein Data Bank. For these curated complexes, the chemical structures and warheads of pre-reactive electrophilic ligands as well as the covalent bonding mechanisms to their target proteins were manually annotated by experts. In total, CovPDB contains 733 proteins and 1,501 ligands, relating to 2,294 cP-L complexes, 93 reactive warheads, 14 targetable residues, and 21 covalent mechanisms. Users are provided with an intuitive and interactive web interface that allows multiple search and browsing options to explore the covalent interactome at a molecular level in order to develop novel TCIs. CovPDB is freely accessible at http://www.pharmbioinf.uni-freiburg.de/covpdb/ and its contents are available for download as flat files of various formats.


Subject(s)
Databases, Protein , Proteins/chemistry , Small Molecule Libraries/chemistry , Software , Binding Sites , Drug Discovery/methods , Humans , Internet , Ligands , Molecular Sequence Annotation , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Proteins/agonists , Proteins/antagonists & inhibitors , Small Molecule Libraries/metabolism
3.
J Chem Inf Model ; 61(11): 5327-5330, 2021 11 22.
Article in English | MEDLINE | ID: mdl-34738791

ABSTRACT

While aromatic cages have extensively been investigated in the context of structural biology, molecular recognition, and drug discovery, there exists to date no comprehensive resource for proteins sharing this conserved structural motif. To this end, we parsed the Protein Data Bank and thus constructed the Aromatic Cage Database (AroCageDB), a database for investigating the binding pocket descriptors and ligand binding space of aromatic-cage-containing proteins (ACCPs). AroCageDB contains 487 unique ACCPs bound to 890 unique ligands, for a total of 1636 complexes. This web-accessible database provides a user-friendly interface for the interactive visualization of ligand-bound ACCP structures, with a variety of search options that will open up opportunities for structural analyses and drug discovery campaigns. AroCageDB is freely available at http://www.pharmbioinf.uni-freiburg.de/arocagedb/.


Subject(s)
Internet , Proteins , Binding Sites , Databases, Protein , Ligands , User-Computer Interface
4.
Nucleic Acids Res ; 49(D1): D600-D604, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33051671

ABSTRACT

Antimicrobial resistance is an emerging global health threat necessitating the rapid development of novel antimicrobials. Remarkably, the vast majority of currently available antibiotics are natural products (NPs) isolated from streptomycetes, soil-dwelling bacteria of the genus Streptomyces. However, there is still a huge reservoir of streptomycetes NPs which remains pharmaceutically untapped and a compendium thereof could serve as a source of inspiration for the rational design of novel antibiotics. Initially released in 2012, StreptomeDB (http://www.pharmbioinf.uni-freiburg.de/streptomedb) is the first and only public online database that enables the interactive phylogenetic exploration of streptomycetes and their isolated or mutasynthesized NPs. In this third release, there are substantial improvements over its forerunners, especially in terms of data content. For instance, about 2500 unique NPs were newly annotated through manual curation of about 1300 PubMed-indexed articles, published in the last five years since the second release. To increase interoperability, StreptomeDB entries were hyperlinked to several spectral, (bio)chemical and chemical vendor databases, and also to a genome-based NP prediction server. Moreover, predicted pharmacokinetic and toxicity profiles were added. Lastly, some recent real-world use cases of StreptomeDB are highlighted, to illustrate its applicability in life sciences.


Subject(s)
Biological Products/chemistry , Databases, Chemical , Streptomyces/metabolism , Anti-Bacterial Agents/chemistry
5.
Mol Inform ; 39(11): e2000163, 2020 11.
Article in English | MEDLINE | ID: mdl-32964659

ABSTRACT

Medicinal plants have widely been used in the traditional treatment of ailments and have been proven effective. Their contribution still holds an important place in modern drug discovery due to their chemical and biological diversity. However, the poor documentation of traditional medicine, in developing African countries for instance, can lead to the loss of knowledge related to such practices. In this study, we present the Eastern Africa Natural Products Database (EANPDB) containing the structural and bioactivity information of 1870 unique molecules isolated from about 300 source species from the Eastern African region. This represents the largest collection of natural products (NPs) from this geographical region, covering literature data of the period from 1962 to 2019. The computed physicochemical properties and toxicity profiles of each compound have been included. A comparative analysis of some physicochemical properties, such as molecular weight, H-bond donors/acceptors and log P(o/w), as well as a scaffold diversity analysis, has been carried out against other published NP databases. EANPDB was combined with the previously published Northern African Natural Products Database (NANPDB) to form a merged African Natural Products Database (ANPDB), containing ∼6500 unique molecules isolated from about 1000 source species (freely available at http://african-compounds.org). As a case study, latrunculins A and B, isolated from the sponge Negombata magnifica (Podospongiidae) and with previously reported antitumour activities, were identified via substructure searching as molecules to be explored as putative binders of histone deacetylases (HDACs).


Subject(s)
Biological Products/pharmacology , Plants, Medicinal/chemistry , Africa, Eastern , Biological Products/chemistry , Bridged Bicyclo Compounds, Heterocyclic/chemistry , Databases as Topic , Histone Deacetylase Inhibitors/chemistry , Hydrogen Bonding , Molecular Weight , Thiazolidines/chemistry , Toxicity Tests
6.
PLoS One ; 15(3): e0220925, 2020.
Article in English | MEDLINE | ID: mdl-32126064

ABSTRACT

MOTIVATION: Much effort has been invested in the identification of protein-protein interactions using text mining and machine learning methods. The extraction of functional relationships between chemical compounds and proteins from literature has received much less attention, and no ready-to-use open-source software is so far available for this task. METHOD: We created a new benchmark dataset of 2,613 sentences from abstracts containing annotations of proteins, small molecules, and their relationships. Two kernel methods, the shallow linguistic kernel and the all-paths graph kernel, were applied to classify these relationships as functional or non-functional. Furthermore, the benefit of interaction verbs in sentences was evaluated. RESULTS: The cross-validation of the all-paths graph kernel (AUC value: 84.6%, F1 score: 79.0%) shows slightly better results than the shallow linguistic kernel (AUC value: 82.5%, F1 score: 77.2%) on our benchmark dataset. Both models achieve state-of-the-art performance in the research area of relation extraction. Furthermore, combining the shallow linguistic and all-paths graph kernels could further increase the overall performance slightly. We used each of the two kernels to identify functional relationships in all PubMed abstracts (29 million) and provide the results, including recorded processing time. AVAILABILITY: The software for the tested kernels, the benchmark, the processed 29 million PubMed abstracts, all evaluation scripts, as well as the scripts for processing the complete PubMed database are freely available at https://github.com/KerstenDoering/CPI-Pipeline.


Subject(s)
Proteins/chemistry , Publications , Algorithms , Automation , Databases, Factual , Linguistics , Machine Learning
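
The abstract above reports AUC values for both kernels. As a reminder of what that metric measures, the sketch below computes the AUC as the fraction of (functional, non-functional) sentence pairs in which the functional sentence receives the higher classifier score, with ties counted as half. The function name `auc` and the example scores are hypothetical and purely illustrative; they are not taken from the CPI-Pipeline code or its benchmark.

```python
from itertools import product
from typing import Sequence


def auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly.

    A pair counts 1.0 if the positive score is higher, 0.5 on a tie.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    correct = 0.0
    for pos, neg in product(scores_pos, scores_neg):
        if pos > neg:
            correct += 1.0
        elif pos == neg:
            correct += 0.5
    return correct / (len(scores_pos) * len(scores_neg))


# Made-up classifier scores for illustration only.
functional = [0.9, 0.8, 0.4]      # sentences labelled "functional"
non_functional = [0.7, 0.3, 0.2]  # sentences labelled "non-functional"

# 8 of the 9 pairs are ranked correctly (only 0.4 vs 0.7 is not).
print(auc(functional, non_functional))
```

This pairwise form is quadratic in the number of sentences, so it suits small illustrations; sorting-based implementations are the usual choice for larger sets.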