Results 1 - 4 of 4
1.
Science ; 379(6637): 1123-1130, 2023 03 17.
Article in English | MEDLINE | ID: mdl-36927031

ABSTRACT

Recent advances in machine learning have leveraged evolutionary information in multiple sequence alignments to predict protein structure. We demonstrate direct inference of full atomic-level protein structure from primary sequence using a large language model. As language models of protein sequences are scaled up to 15 billion parameters, an atomic-resolution picture of protein structure emerges in the learned representations. This results in an order-of-magnitude acceleration of high-resolution structure prediction, which enables large-scale structural characterization of metagenomic proteins. We apply this capability to construct the ESM Metagenomic Atlas by predicting structures for >617 million metagenomic protein sequences, including >225 million that are predicted with high confidence, which gives a view into the vast breadth and diversity of natural proteins.


Subject(s)
Evolution, Molecular , Machine Learning , Proteins , Sequence Analysis, Protein , Amino Acid Sequence , Proteins/chemistry , Protein Conformation
3.
Proc Natl Acad Sci U S A ; 118(15), 2021 04 13.
Article in English | MEDLINE | ID: mdl-33876751

ABSTRACT

In the field of artificial intelligence, a combination of scale in data and model capacity enabled by unsupervised learning has led to major advances in representation learning and statistical generation. In the life sciences, the anticipated growth of sequencing promises unprecedented data on natural sequence diversity. Protein language modeling at the scale of evolution is a logical step toward predictive and generative artificial intelligence for biology. To this end, we use unsupervised learning to train a deep contextual language model on 86 billion amino acids across 250 million protein sequences spanning evolutionary diversity. The resulting model contains information about biological properties in its representations. The representations are learned from sequence data alone. The learned representation space has a multiscale organization reflecting structure from the level of biochemical properties of amino acids to remote homology of proteins. Information about secondary and tertiary structure is encoded in the representations and can be identified by linear projections. Representation learning produces features that generalize across a range of applications, enabling state-of-the-art supervised prediction of mutational effect and secondary structure and improving state-of-the-art features for long-range contact prediction.


Subject(s)
Sequence Analysis, Protein/methods , Unsupervised Machine Learning , Amino Acids/chemistry , Protein Conformation , Sequence Homology, Amino Acid
4.
Nat Biomed Eng ; 5(6): 613-623, 2021 06.
Article in English | MEDLINE | ID: mdl-33707779

ABSTRACT

The de novo design of antimicrobial therapeutics involves the exploration of a vast chemical repertoire to find compounds with broad-spectrum potency and low toxicity. Here, we report an efficient computational method for the generation of antimicrobials with desired attributes. The method leverages guidance from classifiers trained on an informative latent space of molecules modelled using a deep generative autoencoder, and screens the generated molecules using deep-learning classifiers as well as physicochemical features derived from high-throughput molecular dynamics simulations. Within 48 days, we identified, synthesized and experimentally tested 20 candidate antimicrobial peptides, of which two displayed high potency against diverse Gram-positive and Gram-negative pathogens (including multidrug-resistant Klebsiella pneumoniae) and a low propensity to induce drug resistance in Escherichia coli. Both peptides have low toxicity, as validated in vitro and in mice. We also show using live-cell confocal imaging that the bactericidal mode of action of the peptides involves the formation of membrane pores. The combination of deep learning and molecular dynamics may accelerate the discovery of potent and selective broad-spectrum antimicrobials.


Subject(s)
Anti-Bacterial Agents/pharmacology , Antimicrobial Cationic Peptides/pharmacology , Deep Learning , Drug Design , Drug Discovery/methods , Drug Resistance, Bacterial/drug effects , Acinetobacter baumannii/drug effects , Acinetobacter baumannii/growth & development , Acinetobacter baumannii/ultrastructure , Amino Acid Sequence , Animals , Anti-Bacterial Agents/chemical synthesis , Antimicrobial Cationic Peptides/chemical synthesis , Escherichia coli/drug effects , Escherichia coli/growth & development , Escherichia coli/ultrastructure , Female , Klebsiella Infections/drug therapy , Klebsiella pneumoniae/drug effects , Klebsiella pneumoniae/growth & development , Klebsiella pneumoniae/ultrastructure , Mice , Mice, Inbred BALB C , Microbial Sensitivity Tests , Molecular Dynamics Simulation , Pseudomonas aeruginosa/drug effects , Pseudomonas aeruginosa/growth & development , Pseudomonas aeruginosa/ultrastructure , Staphylococcus aureus/drug effects , Staphylococcus aureus/growth & development , Staphylococcus aureus/ultrastructure , Structure-Activity Relationship