Search | BVS (VHL) Regional Portal (test)

1.

Cheminformatics Microservice: unifying access to open cheminformatics toolkits.

Chandrasekhar, Venkata; Sharma, Nisha; Schaub, Jonas; Steinbeck, Christoph; Rajan, Kohulan.

J Cheminform ; 15(1): 98, 2023 Oct 16.

Article in English | MEDLINE | ID: mdl-37845745

ABSTRACT

In recent years, cheminformatics has experienced significant advancements through the development of new open-source software tools based on various cheminformatics programming toolkits. However, adopting these toolkits presents challenges, including proper installation, setup, deployment, and compatibility management. In this work, we present the Cheminformatics Microservice. This open-source solution provides a unified interface for accessing commonly used functionalities of multiple cheminformatics toolkits, namely RDKit, Chemistry Development Kit (CDK), and Open Babel. In addition, more advanced functionalities like structure generation and Optical Chemical Structure Recognition (OCSR) are made available through the Cheminformatics Microservice based on pre-existing tools. The software service also enables developers to extend the functionalities easily and to seamlessly integrate them with existing workflows and applications. It is built on FastAPI and containerized using Docker, making it highly scalable. An instance of the microservice is publicly available at https://api.naturalproducts.net. The source code is publicly accessible on GitHub, accompanied by comprehensive documentation, version control, and continuous integration and deployment workflows. All resources can be found at the following link: https://github.com/Steinbeck-Lab/cheminformatics-microservice.

2.

Open data and algorithms for open science in AI-driven molecular informatics.

Brinkhaus, Henning Otto; Rajan, Kohulan; Schaub, Jonas; Zielesny, Achim; Steinbeck, Christoph.

Curr Opin Struct Biol ; 79: 102542, 2023 Apr.

Article in English | MEDLINE | ID: mdl-36805192

ABSTRACT

Recent years have seen a sharp increase in the development of deep learning and artificial intelligence-based molecular informatics. There has been a growing interest in applying deep learning to several subfields, including the digital transformation of synthetic chemistry, extraction of chemical information from the scientific literature, and AI in natural product-based drug discovery. The application of AI to molecular informatics is still constrained by the fact that most of the data used for training and testing deep learning models are not available as FAIR and open data. As open science practices continue to grow in popularity, initiatives which support FAIR and open data as well as open-source software have emerged. It is becoming increasingly important for researchers in the field of molecular informatics to embrace open science and to submit data and software in open repositories. With the advent of open-source deep learning frameworks and cloud computing platforms, academic researchers are now able to deploy and test their own deep learning models with ease. With the development of new and faster hardware for deep learning and the increasing number of initiatives towards digital research data management infrastructures, as well as a culture promoting open data, open source, and open science, AI-driven molecular informatics will continue to grow. This review examines the current state of open data and open algorithms in molecular informatics, as well as ways in which they could be improved in future.

Subjects

Artificial Intelligence; Machine Learning; Algorithms; Software; Informatics

3.

MORTAR: a rich client application for in silico molecule fragmentation.

Bänsch, Felix; Schaub, Jonas; Sevindik, Betül; Behr, Samuel; Zander, Julian; Steinbeck, Christoph; Zielesny, Achim.

J Cheminform ; 15(1): 1, 2023 Jan 02.

Article in English | MEDLINE | ID: mdl-36593523

ABSTRACT

Developing and implementing computational algorithms for the extraction of specific substructures from molecular graphs (in silico molecule fragmentation) is an iterative process. It involves repeated sequences of implementing a rule set, applying it to relevant structural data, checking the results, and adjusting the rules. This requires a computational workflow with data import, fragmentation algorithm integration, and result visualisation. The described workflow is normally unavailable for a new algorithm and must be set up individually. This work presents an open Java rich client Graphical User Interface (GUI) application to support the development of new in silico molecule fragmentation algorithms and make them readily available upon release. The MORTAR (MOlecule fRagmenTAtion fRamework) application visualises fragmentation results of a set of molecules in various ways and provides basic analysis features. Fragmentation algorithms can be integrated and developed within MORTAR by using a specific wrapper class. In addition, fragmentation pipelines with any combination of the available fragmentation methods can be executed. Upon release, three fragmentation algorithms are already integrated: ErtlFunctionalGroupsFinder, Sugar Removal Utility, and Scaffold Generator. These algorithms, as well as all cheminformatics functionalities in MORTAR, are implemented based on the Chemistry Development Kit (CDK).

4.

An algorithm to classify homologous series within compound datasets.

Lai, Adelene; Schaub, Jonas; Steinbeck, Christoph; Schymanski, Emma L.

J Cheminform ; 14(1): 85, 2022 Dec 13.

Article in English | MEDLINE | ID: mdl-36510332

ABSTRACT

Homologous series are groups of related compounds that share the same core structure attached to a motif that repeats to different degrees. Compounds forming homologous series are of interest in multiple domains, including natural products, environmental chemistry, and drug design. However, many homologous compounds remain unannotated as such in compound datasets, which poses obstacles to understanding chemical diversity and their analytical identification via database matching. To overcome these challenges, an algorithm to detect homologous series within compound datasets was developed and implemented using the RDKit. The algorithm takes a list of molecules as SMILES strings and a monomer (i.e., repeating unit) encoded as SMARTS as its main inputs. In an iterative process, substructure matching of repeating units, molecule fragmentation, and core detection lead to homologous series classification through grouping of identical cores. Three open compound datasets from environmental chemistry (NORMAN Suspect List Exchange, NORMAN-SLE), exposomics (PubChemLite for Exposomics), and natural products (the COlleCtion of Open NatUral producTs, COCONUT) were subject to homologous series classification using the algorithm. Over 2000, 12,000, and 5000 series with CH2 repeating units were classified in the NORMAN-SLE, PubChemLite, and COCONUT, respectively. Validation of classified series was performed using published homologous series and structure categories, including a comparison with a similar existing method for categorising PFAS compounds. The OngLai algorithm and its implementation for classifying homologues are openly available at: https://github.com/adelenelai/onglai-classify-homologues.
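The grouping step ("identical cores") can be illustrated with a much simpler, formula-level stand-in. The sketch below is not the OngLai algorithm: it uses no RDKit, SMILES, or SMARTS matching, and works on molecular formulas only. It relies on the fact that adding k CH2 units to a formula leaves H - 2·C and all other element counts unchanged, so that tuple serves as a crude "core" key. Sharing the key is a necessary condition for belonging to the same CH2 series, not a sufficient one (isomeric families such as alcohols and ethers collide). All function names are hypothetical.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Matches an element symbol followed by an optional count, e.g. "C2", "H", "Cl3".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Parse a flat molecular formula such as 'C2H6O' into element counts."""
    counts: Counter = Counter()
    for element, number in _TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def ch2_series_key(formula: str) -> Tuple[int, tuple]:
    """Key that is invariant under adding or removing CH2 units."""
    counts = parse_formula(formula)
    others = tuple(
        sorted((el, n) for el, n in counts.items() if el not in ("C", "H"))
    )
    return (counts["H"] - 2 * counts["C"], others)


def group_by_ch2_series(formulas: List[str]) -> List[List[str]]:
    """Group formulas sharing a key; keep only groups with 2+ members."""
    groups: Dict[Tuple[int, tuple], List[str]] = defaultdict(list)
    for formula in formulas:
        groups[ch2_series_key(formula)].append(formula)
    return [members for members in groups.values() if len(members) > 1]
```

For example, methanol, ethanol, and propanol (CH4O, C2H6O, C3H8O) share one key, while benzene (C6H6) and acetic acid (C2H4O2) each fall into their own singleton group and are dropped.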

5.

Scaffold Generator: a Java library implementing molecular scaffold functionalities in the Chemistry Development Kit (CDK).

Schaub, Jonas; Zander, Julian; Zielesny, Achim; Steinbeck, Christoph.

J Cheminform ; 14(1): 79, 2022 Nov 10.

Article in English | MEDLINE | ID: mdl-36357931

ABSTRACT

The concept of molecular scaffolds as defining core structures of organic molecules is utilised in many areas of chemistry and cheminformatics, e.g. drug design, chemical classification, or the analysis of high-throughput screening data. Here, we present Scaffold Generator, a comprehensive open library for the generation, handling, and display of molecular scaffolds, scaffold trees and networks. The new library is based on the Chemistry Development Kit (CDK) and highly customisable through multiple settings, e.g. five different structural framework definitions are available. For display of scaffold hierarchies, the open GraphStream Java library is utilised. Performance snapshots with natural products (NP) from the COCONUT (COlleCtion of Open Natural prodUcTs) database and drug molecules from DrugBank are reported. The generation of a scaffold network from more than 450,000 NP can be achieved within a single day.

6.

Description and Analysis of Glycosidic Residues in the Largest Open Natural Products Database.

Schaub, Jonas; Zielesny, Achim; Steinbeck, Christoph; Sorokina, Maria.

Biomolecules ; 11(4), 2021 Mar 24.

Article in English | MEDLINE | ID: mdl-33804966

ABSTRACT

Natural products (NPs), biomolecules produced by living organisms, inspire the pharmaceutical industry and research due to their structural characteristics and the substituents from which they derive their activities. Glycosidic residues are frequently present in NP structures and have particular pharmacokinetic and pharmacodynamic importance as they improve their solubility and are often involved in molecular transport, target specificity, ligand-target interactions, and receptor binding. The COlleCtion of Open Natural prodUcTs (COCONUT) is currently the largest open database of NPs, and therefore a suitable starting point for the detection and analysis of the diversity of glycosidic residues in NPs. In this work, we report and describe the presence of circular, linear, terminal, and non-terminal glycosidic units in NPs, together with their importance in drug discovery.

Subjects

Biological Products/chemistry; Databases, Factual; Glycosides/chemistry; Bacteria/chemistry; Bacteria/metabolism; Biological Products/metabolism; Glycosides/metabolism; Glycosylation; Solubility

7.

Too sweet: cheminformatics for deglycosylation in natural products.

Schaub, Jonas; Zielesny, Achim; Steinbeck, Christoph; Sorokina, Maria.

J Cheminform ; 12(1): 67, 2020 Nov 04.

Article in English | MEDLINE | ID: mdl-33292501

ABSTRACT

Sugar units in natural products are pharmacokinetically important but often redundant, and they therefore obstruct the study of the structure and function of the aglycon. It is thus recommended to remove the sugars before a theoretical or experimental study of a molecule. Deglycogenases, enzymes specialized in sugar removal from small molecules, are often used in laboratories to perform this task. However, there is no standardized computational procedure to perform it in silico. In this work, we present a systematic approach for in silico removal of ring and linear sugars from molecular structures. Particular attention is given to molecules of biological origin and to their structural specificities. This approach is made available in two forms, through a free and open web application and as standalone open-source software.

8.

ErtlFunctionalGroupsFinder: automated rule-based functional group detection with the Chemistry Development Kit (CDK).

Fritsch, Sebastian; Neumann, Stefan; Schaub, Jonas; Steinbeck, Christoph; Zielesny, Achim.

J Cheminform ; 11(1): 37, 2019 Jun 04.

Article in English | MEDLINE | ID: mdl-31165338

ABSTRACT

The Ertl algorithm for automated functional groups (FG) detection and extraction of organic molecules is implemented on the basis of the Chemistry Development Kit (CDK). A distinct impact of the chosen CDK aromaticity model is demonstrated by an FG analysis of the ChEMBL database compounds. The average performance of less than a millisecond for a single-molecule FG extraction allows for fast processing of even large compound databases.

9.

SPICES: a particle-based molecular structure line notation and support library for mesoscopic simulation.

van den Broek, Karina; Daniel, Mirco; Epple, Matthias; Kuhn, Hubert; Schaub, Jonas; Zielesny, Achim.

J Cheminform ; 10(1): 35, 2018 Aug 09.

Article in English | MEDLINE | ID: mdl-30094683

ABSTRACT

Simplified Particle Input ConnEction Specification (SPICES) is a particle-based molecular structure representation derived from straightforward simplifications of the atom-based SMILES line notation. It aims at supporting tedious and error-prone molecular structure definitions for particle-based mesoscopic simulation techniques like Dissipative Particle Dynamics by allowing for an interplay of different molecular encoding levels that range from topological line notations and corresponding particle-graph visualizations to 3D structures with support of their spatial mapping into a simulation box. An open Java library for SPICES structure handling and mesoscopic simulation support in combination with an open Java Graphical User Interface viewer application for visual topological inspection of SPICES definitions are provided.
