Search | BVS Regional Portal (test)

ImmuneBuilder: Deep-Learning models for predicting the structures of immune proteins.

Abanades, Brennan; Wong, Wing Ki; Boyles, Fergus; Georges, Guy; Bujotzek, Alexander; Deane, Charlotte M.

Commun Biol ; 6(1): 575, 2023 05 29.

Article in English | MEDLINE | ID: mdl-37248282

ABSTRACT

Immune receptor proteins play a key role in the immune system and have shown great promise as biotherapeutics. The structure of these proteins is critical for understanding their antigen binding properties. Here, we present ImmuneBuilder, a set of deep learning models trained to accurately predict the structure of antibodies (ABodyBuilder2), nanobodies (NanoBodyBuilder2) and T-Cell receptors (TCRBuilder2). We show that ImmuneBuilder generates structures with state-of-the-art accuracy while being far faster than AlphaFold2. For example, on a benchmark of 34 recently solved antibodies, ABodyBuilder2 predicts CDR-H3 loops with an RMSD of 2.81Å, a 0.09Å improvement over AlphaFold-Multimer, while being over a hundred times faster. Similar results are also achieved for nanobodies (NanoBodyBuilder2 predicts CDR-H3 loops with an average RMSD of 2.89Å, a 0.55Å improvement over AlphaFold2) and TCRs. By predicting an ensemble of structures, ImmuneBuilder also gives an error estimate for every residue in its final prediction. ImmuneBuilder is made freely available, both to download (https://github.com/oxpig/ImmuneBuilder) and to use via our webserver (http://opig.stats.ox.ac.uk/webapps/newsabdab/sabpred). We also make available structural models for ~150 thousand non-redundant paired antibody sequences (https://doi.org/10.5281/zenodo.7258553).

Subjects

Deep Learning , Single-Domain Antibodies , Molecular Models , Antibodies , T-Cell Antigen Receptors
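
The ImmuneBuilder abstract above reports accuracy as RMSD and mentions a per-residue error estimate derived from an ensemble of predicted structures. The sketch below illustrates both ideas on toy coordinates. It is not the ImmuneBuilder implementation: the function names `per_residue_spread` and `rmsd` are hypothetical, one coordinate per residue is assumed, the models are assumed to be already superposed, and taking the spread around the ensemble mean as the error estimate is an assumption made only for illustration.

```python
import math


def per_residue_spread(ensemble):
    """Root-mean-square deviation of each residue from the ensemble mean.

    ensemble: list of models; each model is a list of (x, y, z) tuples,
    one per residue, assumed to be already superposed.
    (Illustrative assumption, not the method used by ImmuneBuilder.)
    """
    n_models = len(ensemble)
    n_residues = len(ensemble[0])
    spread = []
    for i in range(n_residues):
        # Mean position of residue i across all models.
        mean = [sum(m[i][k] for m in ensemble) / n_models for k in range(3)]
        # Summed squared distance of each model's residue i from that mean.
        sq_dev = sum(
            sum((m[i][k] - mean[k]) ** 2 for k in range(3)) for m in ensemble
        )
        spread.append(math.sqrt(sq_dev / n_models))
    return spread


def rmsd(predicted, reference):
    """RMSD between two equally long, already superposed coordinate lists."""
    sq = sum(
        sum((p[k] - r[k]) ** 2 for k in range(3))
        for p, r in zip(predicted, reference)
    )
    return math.sqrt(sq / len(predicted))


if __name__ == "__main__":
    # Toy ensemble: two models, two residues (made-up coordinates).
    toy_ensemble = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    ]
    print(per_residue_spread(toy_ensemble))
```

A residue whose position agrees across all ensemble members gets a spread of zero, while a residue that moves between members gets a larger value, which is the intuition behind using ensemble disagreement as an error signal.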

Learning from Docked Ligands: Ligand-Based Features Rescue Structure-Based Scoring Functions When Trained on Docked Poses.

Boyles, Fergus; Deane, Charlotte M; Morris, Garrett M.

J Chem Inf Model ; 62(22): 5329-5341, 2022 11 28.

Article in English | MEDLINE | ID: mdl-34469150

ABSTRACT

Machine learning scoring functions for protein-ligand binding affinity have been found to consistently outperform classical scoring functions when trained and tested on crystal structures of bound protein-ligand complexes. However, it is less clear how these methods perform when applied to docked poses of complexes. We explore how the use of docked rather than crystallographic poses for both training and testing affects the performance of machine learning scoring functions. Using the PDBbind Core Sets as benchmarks, we show that the performance of a structure-based machine learning scoring function trained and tested on docked poses is lower than that of the same scoring function trained and tested on crystallographic poses. We construct a hybrid scoring function by combining both structure-based and ligand-based features, and show that its ability to predict binding affinity using docked poses is comparable to that of purely structure-based scoring functions trained and tested on crystal poses. We also present a new, freely available validation set, the Updated DUD-E Diverse Subset, for binding affinity prediction using data from DUD-E and ChEMBL. Despite strong performance on docked poses of the PDBbind Core Sets, we find that our hybrid scoring function sometimes generalizes poorly to a protein target not represented in the training set, demonstrating the need for improved scoring functions and additional validation benchmarks.

Subjects

Machine Learning , Proteins , Ligands , Protein Binding , Proteins/chemistry , Molecular Docking Simulation

Observed Antibody Space: A diverse database of cleaned, annotated, and translated unpaired and paired antibody sequences.

Olsen, Tobias H; Boyles, Fergus; Deane, Charlotte M.

Protein Sci ; 31(1): 141-146, 2022 01.

Article in English | MEDLINE | ID: mdl-34655133

ABSTRACT

The antibody repertoires of individuals and groups have been used to explore disease states, understand vaccine responses, and drive therapeutic development. The arrival of B-cell receptor repertoire sequencing has enabled researchers to get a snapshot of these antibody repertoires, and as more data are generated, increasingly in-depth studies are possible. However, most publicly available data only exist as raw FASTQ files, making the data hard to access, process, and compare. The Observed Antibody Space (OAS) database was created in 2018 to offer clean, annotated, and translated repertoire data. In this paper, we describe an update to OAS that has been driven by the increasing volume of data and the appearance of paired (VH/VL) sequence data. OAS is now accessible via a new web server, with standardized search parameters and a new sequence-based search option. The new database provides both nucleotides and amino acids for every sequence, with additional sequence annotations to make the data Minimal Information about Adaptive Immune Receptor Repertoire compliant, and comments on potential problems with the sequence. OAS now contains 25 new studies, including severe acute respiratory syndrome coronavirus 2 data and paired sequencing data. The new database is accessible at http://opig.stats.ox.ac.uk/webapps/oas/, and all data are freely available for download.

Subjects

Antibodies/chemistry , Protein Databases , Amino Acid Sequence , Animals , Antibodies/immunology , COVID-19/immunology , Humans , Immunoglobulin Heavy Chains/chemistry , Immunoglobulin Heavy Chains/immunology , Immunoglobulin Light Chains/chemistry , Immunoglobulin Light Chains/immunology , Immunoglobulin Variable Region/chemistry , Immunoglobulin Variable Region/immunology , SARS-CoV-2/immunology

Learning from the ligand: using ligand-based features to improve binding affinity prediction.

Boyles, Fergus; Deane, Charlotte M; Morris, Garrett M.

Bioinformatics ; 36(3): 758-764, 2020 02 01.

Article in English | MEDLINE | ID: mdl-31598630

ABSTRACT

MOTIVATION: Machine learning scoring functions for protein-ligand binding affinity prediction have been found to consistently outperform classical scoring functions. Structure-based scoring functions for universal affinity prediction typically use features describing interactions derived from the protein-ligand complex, with limited information about the chemical or topological properties of the ligand itself. RESULTS: We demonstrate that the performance of machine learning scoring functions is consistently improved by the inclusion of diverse ligand-based features. For example, a Random Forest (RF) combining the features of RF-Score v3 with RDKit molecular descriptors achieved Pearson correlation coefficients of up to 0.836, 0.780 and 0.821 on the PDBbind 2007, 2013 and 2016 core sets, respectively, compared to 0.790, 0.746 and 0.814 when using the features of RF-Score v3 alone. Excluding proteins and/or ligands that are similar to those in the test sets from the training set has a significant effect on scoring function performance, but does not remove the predictive power of ligand-based features. Furthermore, an RF using only ligand-based features is predictive at a level similar to classical scoring functions, and it appears to be predicting the mean binding affinity of a ligand for its protein targets. AVAILABILITY AND IMPLEMENTATION: Data and code to reproduce all the results are freely available at http://opig.stats.ox.ac.uk/resources. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Machine Learning , Proteins , Ligands , Protein Binding
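
The abstract above describes combining structure-based features with ligand-based descriptors and scoring the resulting model by its Pearson correlation with measured affinities. The sketch below shows only those two mechanical steps in plain Python. It does not call RF-Score, RDKit, or any Random Forest library, the helper names `combine_features` and `pearson_r` are hypothetical, and every number in it is made up for illustration rather than taken from the paper.

```python
import math


def combine_features(structure_features, ligand_features):
    """Concatenate two per-complex feature vectors into one hybrid vector.

    Assumption: "combining" features means simple concatenation, which is
    the usual approach but is not spelled out in the abstract.
    """
    return list(structure_features) + list(ligand_features)


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)


if __name__ == "__main__":
    # Made-up feature vectors for one hypothetical complex.
    hybrid = combine_features([12.0, 3.0, 7.0], [1.5, 0.2])
    print(len(hybrid))

    # Made-up predicted vs. measured affinities for a tiny test set.
    predicted = [5.1, 6.3, 7.8, 4.9]
    measured = [5.0, 6.9, 7.5, 4.2]
    print(round(pearson_r(predicted, measured), 3))
```

In practice the hybrid vectors would be passed to a regression model, and `pearson_r` would be evaluated on a held-out benchmark such as the core sets named in the abstract.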
