Results 1 - 4 of 4
1.
Proc Natl Acad Sci U S A ; 111(27): 9875-80, 2014 Jul 08.
Article in English | MEDLINE | ID: mdl-24941953

ABSTRACT

The efficient recognition of pathogens by the adaptive immune system relies on the diversity of receptors displayed at the surface of immune cells. T-cell receptor diversity results from an initial random DNA editing process, called VDJ recombination, followed by functional selection of cells according to the interaction of their surface receptors with self and foreign antigenic peptides. Using high-throughput sequence data from the β-chain of human T-cell receptors, we infer factors that quantify the overall effect of selection on the elements of receptor sequence composition: the V and J gene choice and the length and amino acid composition of the variable region. We find a significant correlation between biases induced by VDJ recombination and our inferred selection factors together with a reduction of diversity during selection. Both effects suggest that natural selection acting on the recombination process has anticipated the selection pressures experienced during somatic evolution. The inferred selection factors differ little between donors or between naive and memory repertoires. The number of sequences shared between donors is well-predicted by our model, indicating a stochastic origin of such public sequences. Our approach is based on a probabilistic maximum likelihood method, which is necessary to disentangle the effects of selection from biases inherent in the recombination process.


Subject(s)
Receptors, Antigen, T-Cell, alpha-beta/genetics , Selection, Genetic , CD4-Positive T-Lymphocytes/immunology , Humans
2.
Proc Natl Acad Sci U S A ; 109(40): 16161-6, 2012 Oct 02.
Article in English | MEDLINE | ID: mdl-22988065

ABSTRACT

Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.


Subject(s)
Adaptive Immunity/genetics , Antibody Diversity/genetics , CD4-Positive T-Lymphocytes/metabolism , Genes, T-Cell Receptor beta/genetics , Models, Biological , V(D)J Recombination/genetics , Algorithms , Base Sequence , Computational Biology/methods , Humans , Likelihood Functions , Molecular Sequence Data , Sequence Alignment , Sequence Analysis, DNA
3.
Nat Biotechnol ; 30(3): 271-7, 2012 Feb 26.
Article in English | MEDLINE | ID: mdl-22371084

ABSTRACT

Learning to read and write the transcriptional regulatory code is of central importance to progress in genetic analysis and engineering. Here we describe a massively parallel reporter assay (MPRA) that facilitates the systematic dissection of transcriptional regulatory elements. In MPRA, microarray-synthesized DNA regulatory elements and unique sequence tags are cloned into plasmids to generate a library of reporter constructs. These constructs are transfected into cells and tag expression is assayed by high-throughput sequencing. We apply MPRA to compare >27,000 variants of two inducible enhancers in human cells: a synthetic cAMP-regulated enhancer and the virus-inducible interferon-β enhancer. We first show that the resulting data define accurate maps of functional transcription factor binding sites in both enhancers at single-nucleotide resolution. We then use the data to train quantitative sequence-activity models (QSAMs) of the two enhancers. We show that QSAMs from two cellular states can be combined to design enhancer variants that optimize potentially conflicting objectives, such as maximizing induced activity while minimizing basal activity.
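The idea of combining sequence-activity models from two cellular states to trade off conflicting objectives can be sketched as follows. The abstract does not state the functional form of the QSAMs, so this example assumes, purely for illustration, an additive model in which each position contributes independently to log activity. All weights, the `penalty` parameter, and the function names are hypothetical.

```python
from itertools import product

# Hypothetical per-position weights for a 3-bp toy enhancer (illustration only).
INDUCED = [
    {"A": 0.4, "C": 0.0, "G": 0.9, "T": 0.1},
    {"A": 0.5, "C": 0.7, "G": 0.1, "T": 0.0},
    {"A": 0.0, "C": 0.3, "G": 0.2, "T": 0.8},
]
BASAL = [
    {"A": 0.1, "C": 0.0, "G": 0.8, "T": 0.0},
    {"A": 0.0, "C": 0.6, "G": 0.1, "T": 0.0},
    {"A": 0.0, "C": 0.1, "G": 0.0, "T": 0.2},
]


def predict(weights, seq):
    """Additive model: predicted activity is the sum of per-position weights."""
    return sum(w[nt] for w, nt in zip(weights, seq))


def objective(seq, penalty=1.0):
    """Reward induced activity, penalize basal activity."""
    return predict(INDUCED, seq) - penalty * predict(BASAL, seq)


def design(penalty=1.0):
    """Pick the best nucleotide at each position.

    Because both assumed models are additive, the combined objective
    decomposes by position, so a per-position choice is globally optimal.
    """
    return "".join(
        max("ACGT", key=lambda nt: wi[nt] - penalty * wb[nt])
        for wi, wb in zip(INDUCED, BASAL)
    )


if __name__ == "__main__":
    best = design()
    print(best, objective(best))
```

With a non-additive model the per-position shortcut would no longer be valid and a search over variants would be needed instead.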


Subject(s)
Biological Assay/methods , Enhancer Elements, Genetic , Genes, Reporter , Transcription Factors/genetics , Base Sequence , Binding Sites , Humans , Models, Genetic , Molecular Sequence Data , Mutagenesis , Sequence Alignment , Transcription Factors/metabolism , Transcription, Genetic
4.
Proc Natl Acad Sci U S A ; 107(20): 9158-63, 2010 May 18.
Article in English | MEDLINE | ID: mdl-20439748

ABSTRACT

Cells use protein-DNA and protein-protein interactions to regulate transcription. A biophysical understanding of this process has, however, been limited by the lack of methods for quantitatively characterizing the interactions that occur at specific promoters and enhancers in living cells. Here we show how such biophysical information can be revealed by a simple experiment in which a library of partially mutated regulatory sequences is partitioned according to their in vivo transcriptional activities and then sequenced en masse. Computational analysis of the sequence data produced by this experiment can provide precise quantitative information about how the regulatory proteins at a specific arrangement of binding sites work together to regulate transcription. This ability to reliably extract precise information about regulatory biophysics in the face of experimental noise is made possible by a recently identified relationship between likelihood and mutual information. Applying our experimental and computational techniques to the Escherichia coli lac promoter, we demonstrate the ability to identify regulatory protein binding sites de novo, determine the sequence-dependent binding energy of the proteins that bind these sites, and, importantly, measure the in vivo interaction energy between RNA polymerase and a DNA-bound transcription factor. Our approach provides a generally applicable method for characterizing the biophysical basis of transcriptional regulation by a specified regulatory sequence. The principles of our method can also be applied to a wide range of other problems in molecular biology.
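Since the abstract ties model fitting to mutual information between sequences and the activity partitions they were sorted into, a small example of the quantity itself may help. The sketch below is a generic plug-in estimator over discrete observations, not the estimator used in the paper. It assumes model predictions have already been discretized, and the function name and input layout are hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate, in bits, of I(X;Y) from a list of (x, y) observations.

    Here x could be a binned model prediction for a sequence and y the
    activity bin the sequence was sorted into (both assumed discrete).
    """
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


if __name__ == "__main__":
    # Predictions that track the activity bin exactly carry maximal information.
    informative = [(0, "low"), (1, "high")] * 10
    # Predictions unrelated to the activity bin carry none.
    uninformative = [(0, "low"), (0, "high"), (1, "low"), (1, "high")] * 5
    print(mutual_information(informative), mutual_information(uninformative))
```

Under this view, a better parameter set is one whose predictions share more information with the measured bins. The plug-in estimator is biased for small samples, so a real analysis would need a more careful estimate.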


Subject(s)
Gene Expression Regulation/genetics , Models, Biological , Mutation/genetics , Promoter Regions, Genetic/genetics , Base Sequence , Binding Sites/genetics , Biophysics , Computational Biology/methods , Escherichia coli , Flow Cytometry , Gene Expression Regulation/physiology , Green Fluorescent Proteins/metabolism , Lac Operon/genetics , Likelihood Functions , Molecular Sequence Data , Monte Carlo Method , Sequence Analysis, DNA , Thermodynamics