Results 1 - 4 of 4
1.
J Chem Theory Comput ; 17(12): 7962-7971, 2021 Dec 14.
Article in English | MEDLINE | ID: mdl-34793168

ABSTRACT

An unsolved challenge in the development of antigen-specific immunotherapies is determining the optimal antigens to target. Comprehension of antigen-major histocompatibility complex (MHC) binding is paramount toward achieving this goal. Here, we apply CASTELO, a combined machine learning-molecular dynamics (ML-MD) approach, to identify per-residue antigen binding contributions and then design novel antigens of increased MHC-II binding affinity for a type 1 diabetes-implicated system. We build upon a small-molecule lead optimization algorithm by training a convolutional variational autoencoder (CVAE) on MD trajectories of 48 different systems across four antigens and four HLA serotypes. We develop several new machine learning metrics including a structure-based anchor residue classification model as well as cluster comparison scores. ML-MD predictions agree well with experimental binding results and free energy perturbation-predicted binding affinities. Moreover, ML-MD metrics are independent of traditional MD stability metrics such as contact area and root-mean-square fluctuations (RMSF), which do not reflect binding affinity data. Our work supports the role of structure-based deep learning techniques in antigen-specific immunotherapy design.
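
The root-mean-square fluctuation (RMSF) mentioned above as a traditional MD stability metric can be illustrated with a minimal sketch. It is not taken from the paper: the function name `rmsf`, the toy trajectory, and the assumption that all frames have already been aligned to a common reference are illustrative only.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom.
    Assumption: frames are already superimposed on a common reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [
            sum(frame[atom][k] for frame in trajectory) / n_frames
            for k in range(3)
        ]
        # Mean squared deviation from the average position.
        msd = sum(
            sum((frame[atom][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Hypothetical two-frame, two-atom trajectory: atom 0 is fixed, atom 1 moves along x.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
fluctuations = rmsf(toy_trajectory)
```

In this toy case the fixed atom has zero fluctuation and the moving atom fluctuates by one distance unit around its mean position.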


Subject(s)
Machine Learning , Peptides , Algorithms , Molecular Dynamics Simulation , Peptides/chemistry , Protein Binding
2.
BMC Bioinformatics ; 22(1): 338, 2021 Jun 22.
Article in English | MEDLINE | ID: mdl-34157976

ABSTRACT

BACKGROUND: Drug discovery is a multi-stage process that comprises two costly major steps: pre-clinical research and clinical trials. Among its stages, lead optimization easily consumes more than half of the pre-clinical budget. We propose a combined machine learning and molecular modeling approach that partially automates the lead optimization workflow in silico, providing suggestions for modification hotspots. RESULTS: The initial data collection is achieved with physics-based molecular dynamics simulation. Contact matrices are calculated as the preliminary features extracted from the simulations. To take advantage of the temporal information from the simulations, we enhanced the contact matrix data with a temporal dynamism representation, which is then modeled with an unsupervised convolutional variational autoencoder (CVAE). Finally, conventional and CVAE-based clustering methods are compared with metrics to rank the submolecular structures and propose potential candidates for lead optimization. CONCLUSION: With no need for extensive structure-activity data, our method provides new hints for drug modification hotspots, which can be used to improve drug potency and reduce the lead optimization time. It can potentially become a valuable tool for medicinal chemists.
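
As a rough illustration of the contact-matrix features described above, the following sketch builds a binary contact matrix for a single simulation frame. It is not the authors' implementation: the function name `contact_matrix`, the 8.0 distance cutoff, and the toy coordinates are assumptions chosen only for the example.

```python
import math


def contact_matrix(coords, cutoff=8.0):
    """Binary contact matrix for one frame.

    coords: list of (x, y, z) tuples, one per residue or atom.
    cutoff: distance threshold (assumed value, not from the abstract).
    Entry [i][j] is 1 when i != j and the two points lie within the cutoff.
    """
    n = len(coords)
    return [
        [
            1 if i != j and math.dist(coords[i], coords[j]) <= cutoff else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical frame with three points on a line.
toy_frame = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
contacts = contact_matrix(toy_frame)
```

One such matrix per frame gives a time series of contact maps, which is the kind of input an image-style model such as a CVAE could consume.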


Subject(s)
Machine Learning , Molecular Dynamics Simulation , Cluster Analysis , Drug Discovery
3.
Comput Methods Programs Biomed ; 126: 20-34, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26724853

ABSTRACT

BACKGROUND: Knowledge of gene and protein functions is paramount for the understanding of physiological and pathological biological processes, as well as in the development of new drugs and therapies. Analyses for biomedical knowledge discovery greatly benefit from the availability of gene and protein functional feature descriptions expressed through controlled terminologies and ontologies, i.e., of gene and protein biomedical controlled annotations. In recent years, several databases of such annotations have become available; yet, these valuable annotations are incomplete, include errors and only some of them represent highly reliable human-curated information. Computational techniques able to reliably predict new gene or protein annotations with an associated likelihood value are thus paramount. METHODS: Here, we propose a novel cross-organisms learning approach to reliably predict new functionalities for the genes of an organism based on the known controlled annotations of the genes of another, evolutionarily related and better studied, organism. We leverage a new representation of the annotation discovery problem and a random perturbation of the available controlled annotations to allow the application of supervised algorithms to predict with good accuracy unknown gene annotations. Taking advantage of the numerous gene annotations available for a well-studied organism, our cross-organisms learning method creates and trains better prediction models, which can then be applied to predict new gene annotations of a target organism. RESULTS: We tested and compared our method with the equivalent single organism approach on different gene annotation datasets of five evolutionarily related organisms (Homo sapiens, Mus musculus, Bos taurus, Gallus gallus and Dictyostelium discoideum). Results show both the usefulness of the perturbation method of available annotations for better prediction model training and a great improvement of the cross-organism models with respect to the single-organism ones, without influence of the evolutionary distance between the considered organisms. The generated ranked lists of reliably predicted annotations, which describe novel gene functionalities and have an associated likelihood value, are very valuable both to complement available annotations, for better coverage in biomedical knowledge discovery analyses, and to quicken the annotation curation process, by focusing it on the prioritized novel annotations predicted.
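
The random perturbation of available annotations mentioned above can be pictured with a small sketch. This is one plausible reading, not the published procedure: it assumes a binary gene-by-term annotation matrix and hides a random fraction of known annotations, so that the perturbed matrix can serve as input and the original matrix as the supervised target. The function name `perturb_annotations`, the `drop_fraction` parameter, and the toy matrix are hypothetical.

```python
import random


def perturb_annotations(matrix, drop_fraction=0.5, seed=0):
    """Hide a random fraction of known annotations (1 -> 0).

    matrix: list of rows (genes), each a list of 0/1 flags (terms).
    Returns the perturbed copy and the set of hidden (gene, term) pairs.
    Assumption: this removal scheme is illustrative, not the paper's exact method.
    """
    rng = random.Random(seed)
    known = [
        (g, t)
        for g, row in enumerate(matrix)
        for t, value in enumerate(row)
        if value == 1
    ]
    n_drop = int(len(known) * drop_fraction)
    hidden = set(rng.sample(known, n_drop))
    perturbed = [
        [0 if (g, t) in hidden else value for t, value in enumerate(row)]
        for g, row in enumerate(matrix)
    ]
    return perturbed, hidden


# Hypothetical matrix: 3 genes x 3 terms, 4 known annotations.
toy_annotations = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
]
perturbed, hidden = perturb_annotations(toy_annotations)
```

A supervised model trained to recover the original matrix from the perturbed one can then be applied to the unperturbed annotations of a target organism to rank candidate new annotations.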


Subject(s)
Computational Biology/methods , Molecular Sequence Annotation/methods , Algorithms , Animals , Cattle , Chickens , Computer Simulation , Computers , Dictyostelium , Genomics , Humans , Likelihood Functions , Mice , Models, Statistical , Phenotype , Phylogeny , Proteomics , Reproducibility of Results , Semantics , Software , Species Specificity , Vocabulary, Controlled
4.
BMC Bioinformatics ; 16: 346, 2015 Oct 28.
Article in English | MEDLINE | ID: mdl-26511083

ABSTRACT

BACKGROUND: Functional annotation of genes and gene products is a major challenge in the post-genomic era. Nowadays, gene function curation is largely based on manual assignment of Gene Ontology (GO) annotations to genes by using published literature. The annotation task is extremely time-consuming; therefore, there is increasing interest in automated tools that can assist human experts. RESULTS: Here we introduce GOTA, a GO term annotator for biomedical literature. The proposed approach makes use only of information that is readily available from public repositories and it is easily expandable to handle novel sources of information. We assess the classification capabilities of GOTA on a large benchmark set of publications. The overall performance is encouraging in comparison to the state of the art in multi-label classification over large taxonomies. Furthermore, the experimental tests provide some interesting insights into the potential improvement of automated annotation tools. CONCLUSIONS: GOTA implements a flexible and expandable model for GO annotation of biomedical literature. The current version of the GOTA tool is freely available at http://gota.apice.unibo.it.


Subject(s)
User-Computer Interface , Animals , Data Mining , Gene Ontology , Humans , Internet , Molecular Sequence Annotation