ABSTRACT
Computational methods that generate chemical structures from gene expression profiles have been actively developed for de novo drug design. However, most omics-based methods rely on complex models comprising multiple neural networks that require pretraining. In this study, we propose a straightforward molecular generation method called GxRNN (gene expression profile-based recurrent neural network), which employs a single recurrent neural network (RNN) and requires no pretraining for omics-based drug design. Specifically, our method uses the desired gene expression profile as input to the RNN, conditioning it to generate molecules likely to induce a similar profile. In a case study involving ten target proteins, GxRNN achieved higher structural reproducibility of known ligands than several existing methods. These results position the proposed method as a promising tool for facilitating de novo drug design.
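The conditioning idea described above can be illustrated with a minimal sketch. It is not the GxRNN implementation: the abstract does not say how the profile enters the network, so this example assumes the profile is projected into the initial hidden state of a plain tanh RNN. The class name `ConditionedRNN`, the toy token vocabulary, and the random untrained weights are all hypothetical, so the sampled strings are not meaningful molecules; only the data flow is shown.

```python
import math
import random

# Toy SMILES-like vocabulary (hypothetical); index 0 doubles as start and end token.
VOCAB = ["<eos>", "C", "N", "O", "(", ")", "=", "1"]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rng, rows, cols, scale=0.1):
    """Small random weights; a real model would learn these from data."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class ConditionedRNN:
    def __init__(self, n_genes, hidden=16, seed=0):
        self.rng = random.Random(seed)
        self.w_cond = rand_matrix(self.rng, hidden, n_genes)    # profile -> hidden
        self.w_in = rand_matrix(self.rng, hidden, len(VOCAB))   # token -> hidden
        self.w_hh = rand_matrix(self.rng, hidden, hidden)       # hidden -> hidden
        self.w_out = rand_matrix(self.rng, len(VOCAB), hidden)  # hidden -> logits

    def init_hidden(self, profile):
        """Assumption: the expression profile sets the initial hidden state."""
        return [math.tanh(x) for x in matvec(self.w_cond, profile)]

    def step(self, h, token_idx):
        """One recurrent step; returns the new state and next-token probabilities."""
        one_hot = [1.0 if i == token_idx else 0.0 for i in range(len(VOCAB))]
        pre = [a + b for a, b in zip(matvec(self.w_in, one_hot), matvec(self.w_hh, h))]
        h = [math.tanh(x) for x in pre]
        logits = matvec(self.w_out, h)
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]  # numerically stable softmax
        total = sum(exps)
        return h, [e / total for e in exps]

    def sample(self, profile, max_len=20):
        """Sample a token string conditioned on the given profile."""
        h = self.init_hidden(profile)
        token, out = 0, []
        for _ in range(max_len):
            h, probs = self.step(h, token)
            token = self.rng.choices(range(len(VOCAB)), weights=probs)[0]
            if token == 0:  # end-of-sequence token sampled
                break
            out.append(VOCAB[token])
        return "".join(out)


if __name__ == "__main__":
    model = ConditionedRNN(n_genes=4)
    print(model.sample([1.0, 0.0, -1.0, 0.5]))
```

Feeding the profile through the initial state is only one of several plausible designs; concatenating it to every input token is another common choice and may be closer to what a given method does.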
ABSTRACT
Peptides are potentially useful drug modalities; however, cell membrane permeability is an obstacle in peptide drug discovery. The identification of bioactive peptides for a therapeutic target is also challenging because of the huge number of possible amino acid sequence patterns. In this study, we propose a novel computational method, PEptide generation system using Neural network Trained on Amino acid sequence data and Gaussian process-based optimizatiON (PENTAGON), to automatically generate new peptides with desired bioactivity and cell membrane permeability. In the algorithm, we mapped peptide amino acid sequences onto a latent space constructed using a variational autoencoder and searched that space for peptides with the desired bioactivity and cell membrane permeability using Bayesian optimization. We used the proposed method to generate peptides with cell membrane permeability and bioactivity for each of nine therapeutic targets, including the estrogen receptor (ER). The proposed method outperformed a previously developed peptide generator in terms of similarity to known active peptide sequences and the length of generated peptide sequences.
Subject(s)
Bayes Theorem, Cell Membrane Permeability, Peptides, Peptides/chemistry, Peptides/pharmacology, Amino Acid Sequence, Algorithms, Neural Networks, Computer, Humans
ABSTRACT
Deep generative models for molecular generation have been gaining much attention as structure generators to accelerate drug discovery. However, most previously developed methods are chemistry-centric approaches that do not take comprehensive biological responses in the cell into account. In this study, we propose a novel computational method, TRIOMPHE-BOA (transcriptome-based inference and generation of molecules with desired phenotypes using the Bayesian optimization algorithm), to generate new chemical structures of inhibitor or activator candidates for therapeutic target proteins by integrating chemically and genetically perturbed transcriptome profiles. In the algorithm, the substructures of multiple molecules selected on the basis of transcriptome analysis are fused into new chemical structures by exploring the latent space of a Transformer-based variational autoencoder using Bayesian optimization. Our results demonstrate the usefulness of the proposed method, which achieved higher reproducibility of existing ligands for 10 therapeutic target proteins than previous methods. Moreover, this method can be applied to proteins without detailed 3D structures or known ligands and is expected to become a powerful tool for more efficient hit identification.
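The latent-space search mentioned above can be sketched in a few lines. This is a toy illustration, not the TRIOMPHE-BOA code: it assumes a Gaussian-process surrogate with an RBF kernel and an upper-confidence-bound acquisition rule, neither of which the abstract specifies. The function `toy_score` is a hypothetical stand-in for "decode the latent vector with the autoencoder and score the resulting structure", and all names and parameter values are chosen for illustration only.

```python
import math
import random


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two latent vectors."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * length ** 2))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def gp_posterior(xs, ys, cands, noise=1e-4):
    """GP posterior (mean, std) at each candidate, with a constant prior mean."""
    y_mean = sum(ys) / len(ys)
    k = [[rbf(p, q) + (noise if i == j else 0.0) for j, q in enumerate(xs)]
         for i, p in enumerate(xs)]
    alpha = solve(k, [y - y_mean for y in ys])
    out = []
    for z in cands:
        kz = [rbf(p, z) for p in xs]
        mean = y_mean + sum(c * a for c, a in zip(kz, alpha))
        v = solve(k, kz)
        var = max(1.0 - sum(c * w for c, w in zip(kz, v)), 0.0)
        out.append((mean, math.sqrt(var)))
    return out


def toy_score(z):
    """Hypothetical objective standing in for decode-and-score; peak at (1, -1)."""
    return -((z[0] - 1.0) ** 2 + (z[1] + 1.0) ** 2)


def bayes_opt(score, dim=2, n_init=5, n_iter=10, n_cand=50, kappa=2.0, seed=0):
    """Pick the random candidate with the highest upper confidence bound each round."""
    rng = random.Random(seed)

    def draw():
        return [rng.uniform(-3.0, 3.0) for _ in range(dim)]

    xs = [draw() for _ in range(n_init)]
    ys = [score(x) for x in xs]
    for _ in range(n_iter):
        cands = [draw() for _ in range(n_cand)]
        ucb = [mean + kappa * std for mean, std in gp_posterior(xs, ys, cands)]
        pick = cands[max(range(n_cand), key=ucb.__getitem__)]
        xs.append(pick)
        ys.append(score(pick))
    best = max(range(len(ys)), key=ys.__getitem__)
    return xs[best], ys[best]


if __name__ == "__main__":
    z, y = bayes_opt(toy_score)
    print("best latent point:", z, "score:", y)
```

In a real pipeline the expensive step is evaluating the objective (decoding and scoring a structure), which is why a surrogate model is used to decide where to sample next rather than scoring every candidate directly.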