Results 1 - 9 of 9
1.
Chembiochem ; 18(12): 1087-1097, 2017 06 19.
Article in English | MEDLINE | ID: mdl-28371130

ABSTRACT

In directed evolution (DE) the assessment of candidate enzymes and their modification is essential. In this study we have investigated genetic algorithms (GAs) in this context and conducted a systematic study of the behavior of GAs on 20 fitness landscapes (FLs) of varying complexity. This has allowed the tuning of the GAs to be explored. On the basis of this study, recommendations for the best GA settings to use for a GA-directed high-throughput experimental program (in which populations and the number of generations are necessarily low) are reported. The FLs were based upon simple linear models and were characterized by the behavior of the GA on the landscape as demonstrated by stall plots and the footprints and adhesion of candidate solutions, which highlighted local optima (LOs). In order to maximize progress of the GA and to reduce the chances of becoming stuck in a LO it was best to: 1) use a large number of generations, 2) use large populations, 3) remove duplicate sequences (clones), 4) use double mutation, 5) apply high selection pressure (the two best individuals go to the next generation), and 6) consider using a designed sequence as the starting point of the GA run. We believe that these recommendations might be appropriate starting points for studies employing GAs within DE experiments.
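The settings recommended in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the additive landscape with random per-site contributions, the 20-letter alphabet, the default parameter values, and all function names are assumptions made for illustration. The sketch applies removal of clones (via a set), double mutation, and high selection pressure in which the two best individuals pass to the next generation.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def make_linear_landscape(length, rng):
    """Hypothetical additive landscape: one random contribution per (site, residue)."""
    return [{aa: rng.random() for aa in AMINO_ACIDS} for _ in range(length)]

def fitness(seq, landscape):
    """Sum of the per-site contributions (a simple linear model)."""
    return sum(site[aa] for site, aa in zip(landscape, seq))

def mutate(seq, rng, n_mutations=2):
    """Double mutation by default: substitute residues at n distinct positions."""
    chars = list(seq)
    for pos in rng.sample(range(len(chars)), n_mutations):
        chars[pos] = rng.choice([aa for aa in AMINO_ACIDS if aa != chars[pos]])
    return "".join(chars)

def run_ga(start, landscape, pop_size=20, generations=10, n_parents=2, seed=0):
    """Run a small GA from a chosen starting sequence (e.g. a designed one)."""
    rng = random.Random(seed)
    population = {start}
    best_per_generation = []
    for _ in range(generations):
        # High selection pressure: only the n_parents best individuals survive.
        ranked = sorted(population,
                        key=lambda s: (fitness(s, landscape), s),
                        reverse=True)
        parents = ranked[:n_parents]
        best_per_generation.append(fitness(parents[0], landscape))
        # A set holds each sequence once, so duplicate sequences (clones) are removed.
        population = set(parents)
        while len(population) < pop_size:
            population.add(mutate(rng.choice(parents), rng))
    best = max(population, key=lambda s: fitness(s, landscape))
    return best, best_per_generation
```

For example, `run_ga("AAAAAAAA", make_linear_landscape(8, random.Random(1)))` returns the best sequence found and the best fitness recorded at the start of each generation. Because the parents are carried over unchanged, that trace never decreases. Sequences must be at least two residues long for double mutation to apply.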


Subject(s)
Algorithms , Directed Molecular Evolution/statistics & numerical data , Epoxide Hydrolases/genetics , Models, Genetic , Epoxide Hydrolases/metabolism , Gene Expression , High-Throughput Screening Assays , Linear Models , Mutation , Principal Component Analysis
2.
Proc Natl Acad Sci U S A ; 113(13): 3482-7, 2016 Mar 29.
Article in English | MEDLINE | ID: mdl-26969726

ABSTRACT

Variation and selection are the core principles of Darwinian evolution, but quantitatively relating the diversity of a population to its capacity to respond to selection is challenging. Here, we examine this problem at a molecular level in the context of populations of partially randomized proteins selected for binding to well-defined targets. We built several minimal protein libraries, screened them in vitro by phage display, and analyzed their response to selection by high-throughput sequencing. A statistical analysis of the results reveals two main findings. First, libraries with the same sequence diversity but built around different "frameworks" typically have vastly different responses; second, the distribution of responses of the best binders in a library follows a simple scaling law. We show how an elementary probabilistic model based on extreme value theory rationalizes the latter finding. Our results have implications for designing synthetic protein libraries, estimating the density of functional biomolecules in sequence space, characterizing diversity in natural populations, and experimentally investigating evolvability (i.e., the potential for future evolution).


Subject(s)
Directed Molecular Evolution/methods , Peptide Library , Proteins/chemistry , Proteins/genetics , Amino Acid Sequence , Animals , Cell Surface Display Techniques , Directed Molecular Evolution/statistics & numerical data , Escherichia coli/genetics , High-Throughput Nucleotide Sequencing , Humans , Models, Statistical , Molecular Sequence Data , Reproducibility of Results , Sequence Alignment
3.
J Chem Inf Model ; 54(1): 49-56, 2014 Jan 27.
Article in English | MEDLINE | ID: mdl-24372539

ABSTRACT

This paper describes a similarity-driven simple evolutionary approach to producing candidate molecules for new drugs. The aim of the method is to explore candidates that are structurally similar to the reference molecule and yet differ somewhat, not only in their peripheral chains but also in their scaffolds. The method employs a known active molecule of interest as a reference molecule, which is used to navigate a huge chemical space. The reference molecule is also used to obtain seed fragments. An initial set of individual structures is prepared with the seed fragments and additional fragments using several connection rules. The fragment library is preferably prepared from a collection of known molecules related to the target of the reference molecule. Every fragment of the library can be used for fragment-based mutation. All the fragments are categorized into three classes: rings, linkers, and side chains. New individuals are produced by crossover and fragment-based mutation with the fragment library. Computer experiments with our own fragment library prepared from GPCR SARfari verified the feasibility of our approach to drug discovery.


Subject(s)
Directed Molecular Evolution/statistics & numerical data , Drug Design , Algorithms , Animals , Computational Biology , Computer Simulation , Databases, Pharmaceutical , Drug Discovery/statistics & numerical data , Humans , Ligands , Models, Chemical , Molecular Structure , Mutation , Quantitative Structure-Activity Relationship , Rats , Receptor, Adenosine A2A/drug effects , Receptor, Serotonin, 5-HT1A/drug effects
4.
PLoS One ; 7(5): e36948, 2012.
Article in English | MEDLINE | ID: mdl-22606313

ABSTRACT

Peptide ligands of G protein-coupled receptors constitute valuable natural lead structures for the development of highly selective drugs and high-affinity tools to probe ligand-receptor interaction. Currently, pharmacological and metabolic modification of natural peptides involves either an iterative trial-and-error process based on structure-activity relationships or screening of peptide libraries that contain many structural variants of the native molecule. Here, we present a novel neural network architecture for the improvement of metabolic stability without loss of bioactivity. In this approach the peptide sequence determines the topology of the neural network and each cell corresponds one-to-one to a single amino acid of the peptide chain. Using a training set, the learning algorithm calculated weights for each cell. The resulting network calculated the fitness function in a genetic algorithm to explore the virtual space of all possible peptides. The network training was based on gradient descent techniques which rely on the efficient calculation of the gradient by back-propagation. After three consecutive cycles of sequence design by the neural network, peptide synthesis, and bioassay, this new approach yielded a ligand with 70-fold higher metabolic stability compared to the wild-type peptide without loss of the subnanomolar activity in the biological assay. Combining specialized neural networks with an exploration of the combinatorial amino acid sequence space by genetic algorithms represents a novel rational strategy for peptide design and optimization.


Subject(s)
Directed Molecular Evolution/methods , Neural Networks, Computer , Receptors, G-Protein-Coupled/genetics , Receptors, G-Protein-Coupled/metabolism , Algorithms , Amino Acid Sequence , Calcium/metabolism , Directed Molecular Evolution/statistics & numerical data , HEK293 Cells , Humans , Ligands , Oligopeptides/chemistry , Oligopeptides/metabolism , Protein Stability , Receptors, G-Protein-Coupled/chemistry , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism
8.
Protein Eng ; 16(6): 451-7, 2003 Jun.
Article in English | MEDLINE | ID: mdl-12874379

ABSTRACT

Directed evolution of proteins depends on the production of molecular diversity by random mutagenesis. While a number of methods have been developed for introducing this diversity, the best ways to sample it are not always clear. Here we used simple statistics to analyse completeness and diversity in randomized libraries generated by oligonucleotide-directed mutagenesis, error-prone polymerase chain reaction (epPCR) and in vitro recombination of highly homologous sequences. For oligonucleotide-directed mutagenesis, we derive equations to estimate how complete a given library is expected to be and also to predict the size of library required to give a fixed probability of being 100% complete. We describe the statistical bases for computer programs which estimate the number of distinct variants represented in epPCR and shuffled libraries, dubbed PEDEL and DRIVeR, respectively. These programs allow the user to calculate (rather than guess) the diversity represented in a given library and also provide empirical guidelines for maximizing this diversity. PEDEL and DRIVeR are available at www.bio.cam.ac.uk/~blackburn/stats.html.
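The abstract does not state its equations, so the following is only a sketch of the kind of completeness estimate it describes, using standard Poisson approximations for uniform sampling. It assumes that all variants are equally likely, which real randomized-codon libraries satisfy only roughly. The function names are hypothetical, and the models behind PEDEL and DRIVeR are not reproduced here.

```python
import math

def expected_distinct(library_size, n_variants):
    """Expected number of distinct variants among library_size random picks
    from n_variants equiprobable variants (Poisson approximation)."""
    return n_variants * (1.0 - math.exp(-library_size / n_variants))

def fractional_completeness(library_size, n_variants):
    """Expected fraction of all possible variants present in the library."""
    return expected_distinct(library_size, n_variants) / n_variants

def prob_fully_complete(library_size, n_variants):
    """Approximate probability that every variant appears at least once."""
    return math.exp(-n_variants * math.exp(-library_size / n_variants))

def size_for_full_completeness(n_variants, probability):
    """Library size giving the requested probability of 100% completeness,
    obtained by inverting prob_fully_complete."""
    return n_variants * math.log(n_variants / -math.log(probability))
```

As a usage example under these assumptions, three fully randomized NNK codons give 32^3 = 32768 codon-level variants. A library three times that size is expected to contain about 95% of them (`fractional_completeness(3 * 32768, 32768)`), whereas a high probability of containing every variant needs a considerably larger library (`size_for_full_completeness(32768, 0.95)`).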


Subject(s)
Algorithms , Directed Molecular Evolution/statistics & numerical data , Gene Library , Point Mutation , Polymerase Chain Reaction , Proteins/genetics , Computer Simulation , Genetic Variation , Mutagenesis
9.
J Comput Biol ; 7(1-2): 143-58, 2000.
Article in English | MEDLINE | ID: mdl-10890392

ABSTRACT

Error-prone polymerase chain reaction (PCR) is widely used to introduce point mutations during in vitro evolution experiments. Accurate estimation of the mutation rate during error-prone PCR is important in studying the diversity of error-prone PCR products. Although many methods for estimating the mutation rate during PCR are available, all the existing methods depend on the assumption that the mutation rate is low and mutations occur at different places whenever they occur. The available methods may not be applicable to estimate the mutation rate during error-prone PCR. We develop a mathematical model for error-prone PCR and present methods to estimate the mutation rate during error-prone PCR without assuming low mutation rate. We also develop a computer program to simulate error-prone PCR. Using the program, we compare the newly developed methods with two other methods. We show that when the mutation rate is relatively low (< 10^-3 per base per PCR cycle), the newly developed methods give roughly the same results as previous methods. When the mutation rate is relatively high (> 5 x 10^-3 per base per PCR cycle, the mutation rate for most error-prone PCR experiments), the previous methods underestimate the mutation rate and the newly developed methods approximate the true mutation rate.
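Why an estimate that assumes every mutation hits a fresh site runs low at high mutation rates can be illustrated with a simplified calculation. This is not the model developed in the paper. It assumes that each base mutates independently with probability `mu` per duplication, that a mutated base becomes any of the other three bases with equal probability, and that the number of duplications `d` a molecule has undergone is known. All names are hypothetical.

```python
def expected_fraction_mutated(mu, d):
    """Expected fraction of sites that differ from the template after d
    duplications, allowing repeat hits and reversions at the same site."""
    return 0.75 * (1.0 - (1.0 - 4.0 * mu / 3.0) ** d)

def naive_rate(fraction_mutated, d):
    """Estimate that treats every observed difference as a separate mutation."""
    return fraction_mutated / d

def corrected_rate(fraction_mutated, d):
    """Estimate that inverts expected_fraction_mutated, so repeat hits count."""
    return 0.75 * (1.0 - (1.0 - 4.0 * fraction_mutated / 3.0) ** (1.0 / d))
```

Under these assumptions, with `mu = 5e-3` and `d = 30` the naive estimate falls below the true rate, while with `mu = 1e-4` the two estimates nearly coincide. This matches the qualitative pattern the abstract reports, without reproducing its methods.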


Subject(s)
Directed Molecular Evolution/statistics & numerical data , Point Mutation , Polymerase Chain Reaction , Biometry , Computer Simulation , Models, Statistical , Software