ABSTRACT
This paper introduces BRADSHAW (Biological Response Analysis and Design System using an Heterogenous, Automated Workflow), a system for automated molecular design that integrates methods for chemical structure generation, experimental design, and active learning with cheminformatics tools. The simple user interface is designed to facilitate access to large-scale automated design whilst minimising the software development required to introduce new algorithms, a critical requirement in a very fast-moving field. The system embodies a philosophy of automation, best practice, experimental design and the use of both traditional cheminformatics and modern machine learning algorithms.
Subjects
Computer-Aided Design, Drug Design, Adenosine A2 Receptor Antagonists/chemistry, Algorithms, Cheminformatics/methods, Cheminformatics/statistics & numerical data, Cheminformatics/trends, Computer-Aided Design/statistics & numerical data, Computer-Aided Design/trends, Deep Learning, Drug Discovery/methods, Drug Discovery/statistics & numerical data, Drug Discovery/trends, Humans, Machine Learning, Matrix Metalloproteinase Inhibitors/chemistry, Quantitative Structure-Activity Relationship, Small Molecule Libraries, Software, User-Computer Interface, Workflow
ABSTRACT
Prediction of compounds that are active against a desired biological target is a common step in drug discovery efforts. Virtual screening methods seek some active-enriched fraction of a library for experimental testing. Where data are too scarce to train supervised learning models for compound prioritization, initial screening must provide the necessary data. Commonly, such an initial library is selected on the basis of chemical diversity by some pseudo-random process (for example, the first few plates of a larger library) or by selecting an entire smaller library. These approaches may not produce a sufficient number or diversity of actives. An alternative approach is to select an informer set of screening compounds on the basis of chemogenomic information from previous testing of compounds against a large number of targets. We compare different ways of using chemogenomic data to choose a small informer set of compounds based on previously measured bioactivity data. We develop this Informer-Based-Ranking (IBR) approach using the Published Kinase Inhibitor Sets (PKIS) as the chemogenomic data to select the informer sets. We test the informer compounds on a target that is not part of the chemogenomic data, then predict the activity of the remaining compounds based on the experimental informer data and the chemogenomic data. Through new chemical screening experiments, we demonstrate the utility of IBR strategies in a prospective test on three kinase targets not included in the PKIS.
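The informer workflow described above (select a small informer set from chemogenomic data, test it on a new target, then rank the untested compounds) can be illustrated with a minimal sketch. This is not the authors' IBR implementation: the selection rule (greedy farthest-point sampling over activity profiles), the ranking rule (weighting known targets by their correlation with the new informer readout), and the names `select_informers` and `rank_remaining` are all assumptions chosen for illustration, and the toy matrix merely stands in for a compound-by-target dataset such as PKIS.

```python
import numpy as np


def select_informers(activity, k):
    """Pick k compounds with mutually distant activity profiles.

    activity: (n_compounds, n_targets) matrix of prior bioactivity data.
    Hypothetical heuristic: greedy farthest-point sampling, seeded with the
    compound whose profile varies most across targets.
    """
    chosen = [int(np.argmax(activity.var(axis=1)))]
    dist = np.linalg.norm(activity - activity[chosen[0]], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dist))  # compound farthest from those chosen so far
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(activity - activity[nxt], axis=1))
    return chosen


def rank_remaining(activity, informers, informer_readout):
    """Rank non-informer compounds for a new target, best first.

    informer_readout: measured activity of each informer on the new target.
    Hypothetical rule: weight each known target by its (non-negative) Pearson
    correlation with the new target over the informers, then score every
    compound by the weighted average of its known activities.
    """
    readout = np.asarray(informer_readout, dtype=float)
    profiles = activity[informers]                  # (k, n_targets)
    a = profiles - profiles.mean(axis=0)
    b = readout - readout.mean()
    denom = np.linalg.norm(a, axis=0) * np.linalg.norm(b)
    corr = np.divide(a.T @ b, denom, out=np.zeros(a.shape[1]), where=denom > 0)
    weights = np.clip(corr, 0.0, None)
    if weights.sum() == 0:                          # no informative target: weight equally
        weights = np.ones_like(weights)
    scores = activity @ (weights / weights.sum())
    informer_set = set(informers)
    rest = [i for i in range(activity.shape[0]) if i not in informer_set]
    return sorted(rest, key=lambda i: -scores[i])


# Toy chemogenomic matrix: 6 compounds x 3 known targets (made-up values).
toy_activity = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
    [0.0, 0.0, 1.0],
    [0.1, 0.0, 0.9],
])
toy_informers = select_informers(toy_activity, k=3)
# Pretend the new target behaves like known target 0 on the informers.
toy_readout = [0.8 * toy_activity[i, 0] for i in toy_informers]
toy_ranking = rank_remaining(toy_activity, toy_informers, toy_readout)
```

In this toy case the informers cover one compound per activity pattern, and the untested compound whose profile resembles known target 0 is ranked first. A real application would replace both heuristics with the strategies compared in the paper and validate the ranking against prospective screening data.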