Results 1 - 13 of 13
1.
Bioinformatics ; 40(7)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38984742

ABSTRACT

MOTIVATION: Identifying the binding sites of antibodies is essential for developing vaccines and synthetic antibodies. In this article, we investigate the optimal representation for predicting the binding sites in the two molecules and emphasize the importance of geometric information. RESULTS: Specifically, we compare different geometric deep learning methods applied to proteins' inner (I-GEP) and outer (O-GEP) structures. We incorporate 3D coordinates and spectral geometric descriptors as input features to fully leverage the geometric information. Our research suggests that different geometric representations are useful for different tasks. Surface-based models are more effective at epitope prediction, while graph models are better at paratope prediction, with both achieving significant performance improvements. Moreover, we analyze the impact of structural changes in antibodies and antigens resulting from conformational rearrangements or reconstruction errors. Through this investigation, we showcase the robustness of geometric deep learning methods and spectral geometric descriptors to such perturbations. AVAILABILITY AND IMPLEMENTATION: The Python code for the models, together with the data and the processing pipeline, is open-source and available at https://github.com/Marco-Peg/GEP.
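Spectral geometric descriptors of the kind this abstract mentions are commonly derived from the eigendecomposition of a graph (or mesh) Laplacian. The sketch below is an illustrative minimal version of that idea, not the authors' released implementation; the toy graph and descriptor dimension are assumptions.

```python
import numpy as np

def spectral_descriptors(adj, k=2):
    """Return the k lowest-frequency Laplacian eigenvectors as per-node descriptors."""
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj                         # combinatorial graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    return eigvecs[:, :k]                   # one k-dimensional descriptor per node

# Toy 4-node path graph standing in for a protein contact graph
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
desc = spectral_descriptors(adj, k=2)
```

Because the lowest eigenvector of a connected graph's Laplacian is constant, the first descriptor column is identical across nodes; the higher columns encode increasingly fine geometric variation.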


Subject(s)
Deep Learning , Epitopes , Epitopes/chemistry , Computational Biology/methods , Protein Conformation , Antibodies/chemistry , Antibodies/immunology , Software , Binding Sites
2.
Nat Commun ; 15(1): 1906, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38503774

ABSTRACT

Identifying key patterns of tactics implemented by rival teams, and developing effective responses, lies at the heart of modern football. However, doing so algorithmically remains an open research challenge. To address this unmet need, we propose TacticAI, an AI football tactics assistant developed and evaluated in close collaboration with domain experts from Liverpool FC. We focus on analysing corner kicks, as they offer coaches the most direct opportunities for interventions and improvements. TacticAI incorporates both a predictive and a generative component, allowing the coaches to effectively sample and explore alternative player setups for each corner kick routine and to select those with the highest predicted likelihood of success. We validate TacticAI on a number of relevant benchmark tasks: predicting receivers and shot attempts and recommending player position adjustments. The utility of TacticAI is validated by a qualitative study conducted with football domain experts at Liverpool FC. We show that TacticAI's model suggestions are not only indistinguishable from real tactics, but also favoured over existing tactics 90% of the time, and that TacticAI offers an effective corner kick retrieval system. TacticAI achieves these results despite the limited availability of gold-standard data, achieving data efficiency through geometric deep learning.


Subject(s)
Athletic Performance , Athletic Performance/physiology , Qualitative Research , Soccer
3.
Proc Mach Learn Res ; 202: 1341-1360, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37810517

ABSTRACT

Message passing neural networks have shown a lot of success on graph-structured data. However, there are many instances where message passing can lead to over-smoothing or fail when neighboring nodes belong to different classes. In this work, we introduce a simple yet general framework for improving learning in message passing neural networks. Our approach essentially upsamples edges in the original graph by adding "slow nodes" at each edge that can mediate communication between a source and a target node. Our method only modifies the input graph, making it plug-and-play and easy to use with existing models. To understand the benefits of slowing down message passing, we provide theoretical and empirical analyses. We report results on several supervised and self-supervised benchmarks, and show improvements across the board, notably in heterophilic conditions where adjacent nodes are more likely to have different labels. Finally, we show how our approach can be used to generate augmentations for self-supervised learning, where slow nodes are randomly introduced into different edges in the graph to generate multi-scale views with variable path lengths.
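The edge-upsampling idea described above, inserting a mediating "slow node" on every edge so that messages take two hops instead of one, can be sketched as a pure graph rewiring. This is an illustrative sketch of the input-graph modification only, not the authors' released code.

```python
def insert_slow_nodes(num_nodes, edges):
    """Replace each edge (u, v) with (u, s) and (s, v), where s is a fresh slow node."""
    new_edges = []
    next_id = num_nodes          # slow nodes get ids after the original nodes
    for u, v in edges:
        s = next_id              # one dedicated slow node per original edge
        next_id += 1
        new_edges.extend([(u, s), (s, v)])
    return next_id, new_edges

# Triangle graph: 3 nodes and 3 edges become 6 nodes and 6 edges
n, rewired = insert_slow_nodes(3, [(0, 1), (1, 2), (2, 0)])
```

Because only the input graph changes, any existing message passing model can consume the rewired graph unchanged, which is what makes the approach plug-and-play.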

4.
Nature ; 620(7972): 47-60, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37532811

ABSTRACT

Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.


Subject(s)
Artificial Intelligence , Research Design , Artificial Intelligence/standards , Artificial Intelligence/trends , Datasets as Topic , Deep Learning , Research Design/standards , Research Design/trends , Unsupervised Machine Learning
6.
Curr Opin Struct Biol ; 79: 102538, 2023 04.
Article in English | MEDLINE | ID: mdl-36764042

ABSTRACT

In many ways, graphs are the main modality of data we receive from nature. This is due to the fact that most of the patterns we see, both in natural and artificial systems, are elegantly representable using the language of graph structures. Prominent examples include molecules (represented as graphs of atoms and bonds), social networks and transportation networks. This potential has already been seen by key scientific and industrial groups, with already-impacted application areas including traffic forecasting, drug discovery, social network analysis and recommender systems. Further, some of the most successful domains of application for machine learning in previous years-images, text and speech processing-can be seen as special cases of graph representation learning, and consequently there has been significant exchange of information between these areas. The main aim of this short survey is to enable the reader to assimilate the key concepts in the area, and position graph representation learning in a proper context with related fields.


Subject(s)
Machine Learning , Neural Networks, Computer , Drug Discovery
7.
Nature ; 600(7887): 70-74, 2021 12.
Article in English | MEDLINE | ID: mdl-34853458

ABSTRACT

The practice of mathematics involves discovering patterns and using these to formulate and prove conjectures, resulting in theorems. Since the 1960s, mathematicians have used computers to assist in the discovery of patterns and formulation of conjectures1, most famously in the Birch and Swinnerton-Dyer conjecture2, a Millennium Prize Problem3. Here we provide examples of new fundamental results in pure mathematics that have been discovered with the assistance of machine learning-demonstrating a method by which machine learning can aid mathematicians in discovering new conjectures and theorems. We propose a process of using machine learning to discover potential patterns and relations between mathematical objects, understanding them with attribution techniques and using these observations to guide intuition and propose conjectures. We outline this machine-learning-guided framework and demonstrate its successful application to current research questions in distinct areas of pure mathematics, in each case showing how it led to meaningful mathematical contributions on important open problems: a new connection between the algebraic and geometric structure of knots, and a candidate algorithm predicted by the combinatorial invariance conjecture for symmetric groups4. Our work may serve as a model for collaboration between the fields of mathematics and artificial intelligence (AI) that can achieve surprising results by leveraging the respective strengths of mathematicians and machine learning.

8.
Patterns (N Y) ; 2(7): 100273, 2021 Jul 09.
Article in English | MEDLINE | ID: mdl-34286298

ABSTRACT

We present neural algorithmic reasoning-the art of building neural networks that are able to execute algorithmic computation-and provide our opinion on its transformative potential for running classical algorithms on inputs previously considered inaccessible to them.

9.
PLoS One ; 15(2): e0228962, 2020.
Article in English | MEDLINE | ID: mdl-32084166

ABSTRACT

ChronoMID-neural networks for temporally-varying, hence Chrono, Medical Imaging Data-makes the novel application of cross-modal convolutional neural networks (X-CNNs) to the medical domain. In this paper, we present multiple approaches for incorporating temporal information into X-CNNs and compare their performance in a case study on the classification of abnormal bone remodelling in mice. Previous work developing medical models has predominantly focused on either spatial or temporal aspects, but rarely both. Our models seek to unify these complementary sources of information and derive insights in a bottom-up, data-driven approach. As with many medical datasets, the case study herein exhibits deep rather than wide data; we apply various techniques, including extensive regularisation, to account for this. After training on a balanced set of approximately 70000 images, two of the models-those using difference maps from known reference points-outperformed a state-of-the-art convolutional neural network baseline by over 30pp (> 99% vs. 68.26%) on an unseen, balanced validation set comprising around 20000 images. These models are expected to perform well with sparse data sets based on both previous findings with X-CNNs and the representations of time used, which permit arbitrarily large and irregular gaps between data points. Our results highlight the importance of identifying a suitable description of time for a problem domain, as unsuitable descriptors may not only fail to improve a model, they may in fact confound it.
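The difference maps from known reference points that drive the two best-performing models above amount to pixel-wise change relative to an earlier scan. The sketch below illustrates that operation only; the array shapes and values are assumptions, not the study's actual micro-CT data.

```python
import numpy as np

def difference_map(scan, reference):
    """Pixel-wise change of a scan relative to a known reference time point."""
    # Promote to float so that decreases are not clipped by unsigned storage
    return scan.astype(np.float32) - reference.astype(np.float32)

reference = np.zeros((4, 4), dtype=np.uint8)      # baseline scan
scan = np.full((4, 4), 10, dtype=np.uint8)        # later scan showing uniform change
dmap = difference_map(scan, reference)
```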


Subject(s)
Bone Remodeling/physiology , Imaging, Three-Dimensional/methods , Machine Learning/statistics & numerical data , Animals , Data Interpretation, Statistical , Deep Learning , Female , Mice , Mice, Inbred C57BL , Models, Theoretical , Neural Networks, Computer , Spatio-Temporal Analysis , X-Ray Microtomography/methods
10.
IEEE Trans Neural Netw Learn Syst ; 31(9): 3711-3720, 2020 09.
Article in English | MEDLINE | ID: mdl-31722495

ABSTRACT

In recent years, there have been numerous developments toward solving multimodal tasks, aiming to learn a stronger representation than through a single modality. Certain aspects of the data can be particularly useful in this case-for example, correlations in the space or time domain across modalities-but should be wisely exploited in order to benefit from their full predictive potential. We propose two deep learning architectures with multimodal cross connections that allow for dataflow between several feature extractors (XFlow). Our models derive more interpretable features and achieve better performances than models that do not exchange representations, usefully exploiting correlations between audio and visual data, which have a different dimensionality and are nontrivially exchangeable. This article improves on the existing multimodal deep learning algorithms in two essential ways: 1) it presents a novel method for performing cross-modal exchange before features are learned from individual modalities and 2) it extends the previously proposed cross connections, which only transfer information between streams that process compatible data. Illustrating some of the representations learned by the connections, we analyze their contribution to the increase in discrimination ability and reveal their compatibility with a lip-reading network intermediate representation. We provide the research community with Digits, a new data set consisting of three data types extracted from videos of people saying the digits 0-9. Results show that both cross-modal architectures outperform their baselines (by up to 11.5%) when evaluated on the AVletters, CUAVE, and Digits data sets, achieving the state-of-the-art results.

11.
J Comput Biol ; 26(6): 536-545, 2019 06.
Article in English | MEDLINE | ID: mdl-30508394

ABSTRACT

Antibodies are a critical part of the immune system, having the function of recognizing and mediating the neutralization of undesirable molecules (antigens) for future destruction. Being able to predict which amino acids belong to the paratope, the region on the antibody that binds to the antigen, can facilitate antibody engineering and predictions of antibody-antigen structures. The suitability of deep neural networks has recently been confirmed for this task, with Parapred outperforming all prior models. In this work, we first significantly improve on the computational efficiency of Parapred by leveraging à trous convolutions and self-attention. Second, we implement cross-modal attention by allowing the antibody residues to attend over antigen residues. This leads to new state-of-the-art results in paratope prediction, along with novel opportunities to interpret the outcome of the prediction.
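The cross-modal attention described above, antibody residues attending over antigen residues, can be sketched with plain scaled dot-product attention. The NumPy version below uses illustrative residue counts and embedding sizes; it is a sketch of the mechanism, not the paper's trained model.

```python
import numpy as np

def cross_attention(ab, ag):
    """Each antibody residue embedding attends over all antigen residue embeddings."""
    d = ab.shape[-1]
    scores = ab @ ag.T / np.sqrt(d)                  # (n_ab, n_ag) attention logits
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # softmax over antigen residues
    return weights @ ag                              # antigen-informed antibody features

rng = np.random.default_rng(0)
ab = rng.normal(size=(5, 8))   # 5 antibody residues, 8-dim embeddings (assumed)
ag = rng.normal(size=(7, 8))   # 7 antigen residues, 8-dim embeddings (assumed)
out = cross_attention(ab, ag)
```

The attention weights themselves are what makes the prediction interpretable: each row shows which antigen residues an antibody residue attends to.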


Subject(s)
Antigens/metabolism , Binding Sites, Antibody/physiology , Antibodies , Models, Molecular , Neural Networks, Computer , Protein Conformation
12.
Bioinformatics ; 34(17): 2944-2950, 2018 09 01.
Article in English | MEDLINE | ID: mdl-29672675

ABSTRACT

MOTIVATION: Antibodies play essential roles in the immune system of vertebrates and are powerful tools in research and diagnostics. While hypervariable regions of antibodies, which are responsible for binding, can be readily identified from their amino acid sequence, it remains challenging to accurately pinpoint which amino acids will be in contact with the antigen (the paratope). RESULTS: In this work, we present a sequence-based probabilistic machine learning algorithm for paratope prediction, named Parapred. Parapred uses a deep-learning architecture to leverage features from both local residue neighbourhoods and across the entire sequence. The method significantly improves on the current state-of-the-art methodology, and only requires a stretch of amino acid sequence corresponding to a hypervariable region as an input, without any information about the antigen. We further show that our predictions can be used to improve both speed and accuracy of a rigid docking algorithm. AVAILABILITY AND IMPLEMENTATION: The Parapred method is freely available as a webserver at http://www-mvsoftware.ch.cam.ac.uk/ and for download at https://github.com/eliberis/parapred. SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.


Subject(s)
Antibodies/chemistry , Algorithms , Amino Acid Sequence , Antibodies/immunology , Binding Sites, Antibody , Deep Learning , Machine Learning , Models, Molecular , Neural Networks, Computer
13.
Bioinformatics ; 32(16): 2562-4, 2016 08 15.
Article in English | MEDLINE | ID: mdl-27153633

ABSTRACT

MOTIVATION: With the development of experimental methods and technology, we are able to reliably gain access to data in larger quantities, dimensions and types. This has great potential for the improvement of machine learning (as the learning algorithms have access to a larger space of information). However, conventional machine learning approaches used thus far on single-dimensional data inputs are unlikely to be expressive enough to accurately model the problem in higher dimensions; in fact, it should generally be most suitable to represent our underlying models as some form of complex networks with nontrivial topological features. As the first step in establishing such a trend, we present muxstep, an open-source library utilising multiplex networks for the purposes of binary classification on multiple data types. The library is designed to be used out-of-the-box for developing models based on the multiplex network framework, as well as easily modifiable to suit problem modelling needs that may differ significantly from the default approach described. AVAILABILITY AND IMPLEMENTATION: The full source code is available on GitHub: https://github.com/PetarV-/muxstep. CONTACT: petar.velickovic@cl.cam.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
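A multiplex network of the kind muxstep operates on can be represented minimally as a shared node set with one edge layer per data type. The sketch below illustrates that data structure only; the class, layer names, and example edges are assumptions, not the library's API.

```python
from collections import defaultdict

class Multiplex:
    """One shared node set; one set of edges (a layer) per data type."""
    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.layers = defaultdict(set)

    def add_edge(self, layer, u, v):
        assert u in self.nodes and v in self.nodes
        self.layers[layer].add(frozenset((u, v)))    # undirected edge in this layer

    def degree(self, node):
        """Total degree of a node aggregated across all layers."""
        return sum(1 for edges in self.layers.values()
                   for e in edges if node in e)

# Hypothetical example: two data types over the same three entities
m = Multiplex(["a", "b", "c"])
m.add_edge("expression", "a", "b")
m.add_edge("methylation", "a", "c")
```

Keeping the node set fixed while varying the edge layers is what lets per-layer signals be combined into a single classifier over the same entities.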


Subject(s)
Algorithms , Software , Machine Learning , Programming Languages