Results 1 - 20 of 26
1.
Biotechnol Bioeng ; 119(6): 1392-1404, 2022 06.
Article in English | MEDLINE | ID: mdl-35249214

ABSTRACT

Chinese Hamster Ovary (CHO) cells are widely used for the high-level production of recombinant proteins. We created a multiauxotrophic mutant of CHO-K1 cells, CHO8A, that is deficient in eight enzymatic steps in the purine/pyrimidine biosynthetic pathways. Prototrophy was restored by transfections with complementary DNA-based genes for the eight missing activities. CHO8A cells permit: (1) selection of transfectant clones that have incorporated genes for eight or more different polypeptides, suitable for engineering complex proteins, or pathways; and (2) the single-step selection of high producers of a particular protein. The latter is achieved by simultaneous use of eight vectors, each bearing one of the eight rescue genes and a cargo protein gene. Screening as few as 10 surviving colonies yielded high producers secreting mAbs at 84 picograms per cell per day or more. CHO8A was isolated by CRISPR-Cas9 knockout of 10 genes in the pathways to pyrimidines (Dhodh, Umps, Ctps1, Ctps2, and Tyms) and purines (Paics, Atic, Impdh1, Impdh2, and Gmps).


Subject(s)
Protein Engineering , Animals , CHO Cells , Cricetinae , Cricetulus , Recombinant Proteins/metabolism , Transfection
2.
Biotechnol Bioeng ; 117(8): 2401-2409, 2020 08.
Article in English | MEDLINE | ID: mdl-32346859

ABSTRACT

Chinese hamster ovary (CHO) cells are the most widely used mammalian hosts for recombinant protein production due to their hardiness, ease of transfection, and production of glycan structures similar to those in natural human monoclonal antibodies. To enhance the usefulness of CHO-K1 cells we developed a new selection system based on double auxotrophy. We used CRISPR-Cas9 to knock out the genes that encode the bifunctional enzymes catalyzing the last two steps in the de novo synthesis of pyrimidines and purines (uridine monophosphate synthase and 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase/IMP cyclohydrolase [ATIC], respectively). Survival of these doubly auxotrophic cells depends on the provision of sources of purines and pyrimidines or on the transfection and integration of open reading frames encoding these two enzymes. We successfully used one such double auxotroph (UA10) to select for stable transfectants carrying (a) the recombinant tumor necrosis factor-α receptor fusion protein etanercept and (b) the heavy and light chains of the anti-Her2 monoclonal antibody trastuzumab. Transfectant clones produced these recombinant proteins in a stable manner and in substantial amounts. The availability of this double auxotroph provides a rapid and efficient selection method for the serial or simultaneous transfer of genes for multiple polypeptides of choice into CHO cells using readily available purine- and pyrimidine-free commercial media.


Subject(s)
Antibodies, Monoclonal , Genetic Engineering/methods , Recombinant Proteins , Animals , Antibodies, Monoclonal/genetics , Antibodies, Monoclonal/metabolism , CHO Cells , CRISPR-Cas Systems , Cell Line , Cricetinae , Cricetulus , Gene Knockout Techniques , Purines/metabolism , Pyrimidines/metabolism , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Transfection
3.
Genome Res ; 28(1): 11-24, 2018 01.
Article in English | MEDLINE | ID: mdl-29242188

ABSTRACT

To illuminate the extent and roles of exonic sequences in the splicing of human RNA transcripts, we conducted saturation mutagenesis of a 51-nt internal exon in a three-exon minigene. All possible single and tandem dinucleotide substitutions were surveyed. Using high-throughput genetics, 5560 minigene molecules were assayed for splicing in human HEK293 cells. Up to 70% of mutations produced substantial (greater than twofold) phenotypes of either increased or decreased splicing. Of all predicted secondary structural elements, only a single 15-nt stem-loop showed a strong correlation with splicing, acting negatively. The in vitro formation of exon-protein complexes between the mutant molecules and proteins associated with spliceosome formation (U2AF35, U2AF65, U1A, and U1-70K) correlated with splicing efficiencies, suggesting exon definition as the step affected by most mutations. The measured relative binding affinities of dozens of human RNA binding protein domains as reported in the CISBP-RNA database were found to correlate either positively or negatively with splicing efficiency, more than could fit on the 51-nt test exon simultaneously. The large number of these functional protein binding correlations points to a dynamic and heterogeneous population of pre-mRNA molecules, each responding to a particular collection of binding proteins.


Subject(s)
Databases, Genetic , Exons/physiology , RNA Precursors , RNA Splicing Factors , RNA Splicing/physiology , HEK293 Cells , Humans , Protein Domains , RNA Precursors/chemistry , RNA Precursors/genetics , RNA Precursors/metabolism , RNA Splicing Factors/chemistry , RNA Splicing Factors/genetics , RNA Splicing Factors/metabolism
4.
Biochem Pharmacol ; 110-111: 16-24, 2016 06 15.
Article in English | MEDLINE | ID: mdl-27063945

ABSTRACT

Correction of point mutations that lead to aberrant transcripts, often with pathological consequences, has been the focus of considerable research. In this work, repair-PPRHs are shown to be a new powerful tool for gene correction. A repair-PPRH consists of a PolyPurine Reverse Hoogsteen hairpin core bearing an extension sequence at one end, homologous to the DNA strand to be repaired but containing the wild type nucleotide instead of the mutation. Previously, we had corrected a single-point mutation with repair-PPRHs using a mutated version of a dihydrofolate reductase (dhfr) minigene. To further evaluate the utility of these molecules, different repair-PPRHs were designed to correct insertions, deletions, substitutions and a double substitution present in a collection of mutants at the endogenous locus of the dhfr gene, the product of which is the target of the chemotherapeutic agent methotrexate. We also describe an approach to use when the point mutation is far away from the homopyrimidine target domain. This strategy consists in designing Long-Distance- and Short-Distance-Repair-PPRHs where the PPRH core is bound to the repair tail by a five-thymidine linker. Surviving colonies in a DHFR selective medium, lacking glycine and sources of purines and thymidine, were analyzed by DNA sequencing, and by mRNA, protein and enzymatic measurements, confirming that all the dhfr mutants had been corrected. These results show that repair-PPRHs can be effective tools to accomplish a permanent correction of point mutations in the DNA sequence of mutant mammalian cells.


Subject(s)
Base Sequence , DNA Repair , Inverted Repeat Sequences , Point Mutation , Sequence Deletion , Tetrahydrofolate Dehydrogenase/genetics , Animals , CHO Cells , Cricetulus , Exons , Gene Expression , Genetic Loci , Genetic Therapy/methods , Introns , Mutagenesis, Insertional , Oligodeoxyribonucleotides/chemistry , Oligodeoxyribonucleotides/genetics , Tetrahydrofolate Dehydrogenase/metabolism , Transfection
5.
Biotechnol Bioeng ; 112(10): 2154-62, 2015 Oct.
Article in English | MEDLINE | ID: mdl-25943095

ABSTRACT

Mammalian cells are widely used for the production of therapeutic recombinant proteins, as these cells facilitate accurate folding and post-translational modifications often essential for optimum activity. Targeted insertion of a plasmid harboring a gene of interest into the genome of mammalian cells for the expression of a desired protein is a key step in production of such biologics. Here we show that a site specific double strand break (DSB) generated both in the genome and the donor plasmid using the CRISPR-Cas9 system can be efficiently used to target ∼5 kb plasmids into mammalian genomes via nonhomologous end joining (NHEJ). We were able to achieve efficiencies of up to 0.17% in HEK293 cells and 0.45% in CHO cells. This technique holds promise for quick and efficient insertion of a large foreign DNA sequence into a predetermined genomic site in mammalian cells.


Subject(s)
CRISPR-Cas Systems , Metabolic Engineering/methods , Plasmids , Recombination, Genetic , Animals , Cell Line , DNA End-Joining Repair , Humans , Mammals
6.
RNA ; 21(2): 213-29, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25492963

ABSTRACT

Pre-mRNA molecules in humans contain mostly short internal exons flanked by longer introns. To explain the removal of such introns, exon recognition instead of intron recognition has been proposed. We studied this exon definition using designer exons (DEs) made up of three prototype modules of our own design: an exonic splicing enhancer (ESE), an exonic splicing silencer (ESS), and a Reference Sequence (R) predicted to be neither. Each DE was examined as the central exon in a three-exon minigene. DEs made of R modules showed a sharp size dependence, with exons shorter than 14 nt and longer than 174 nt splicing poorly. Changing the strengths of the splice sites improved longer exon splicing but worsened shorter exon splicing, effectively displacing the curve to the right. For the ESE we found, unexpectedly, that its enhancement efficiency was independent of its position within the exon. For the ESS we found a step-wise positional increase in its effects; it was most effective at the 3' end of the exon. To apply these results quantitatively, we developed a biophysical model for exon definition of internal exons undergoing cotranscriptional splicing. This model features commitment to inclusion before the downstream exon is synthesized and competition between skipping and inclusion fates afterward. Collision of both exon ends to form an exon definition complex was incorporated to account for the effect of size; ESE/ESS effects were modeled on the basis of stabilization/destabilization. This model accurately predicted the outcome of independent experiments on more complex DEs that combined ESEs and ESSs.


Subject(s)
Exons , RNA Precursors/genetics , RNA Splicing , Base Sequence , HEK293 Cells , Humans , Models, Genetic , Plasmids/genetics , RNA Splice Sites , Regulatory Sequences, Ribonucleic Acid
7.
J Biotechnol ; 164(2): 346-53, 2012 Dec 15.
Article in English | MEDLINE | ID: mdl-23376841

ABSTRACT

Co-amplification of transgenes using the dihydrofolate reductase/methotrexate (DHFR/MTX) system is a widely used method for the isolation of Chinese hamster ovary (CHO) cell lines that secrete high levels of recombinant proteins. A bottleneck in this process is the stepwise selection for MTX-resistant populations, which can be slow, tedious and erratic. We sought to speed up and regularize this process by isolating dhfr(-) CHO cell lines capable of integrating a transgene of interest into a defined chromosomal location that supports a high rate of gene amplification. We isolated 100 independent transfectants carrying a gene for human adenosine deaminase (ada) linked to a φC31 attP site and a portion of the dihydrofolate reductase (dhfr) gene. Measurement of the ada amplification rate in each transfectant using Luria-Delbrück fluctuation analysis revealed a wide clonal variation; sub-cloning showed these rates to be heritable. Site-directed recombination was used to insert a transgene carrying a reporter gene for secreted embryonic alkaline phosphatase (SEAP) as well as the remainder of the dhfr gene into the attP site at this location in several of these clones. Subsequent selection for gene amplification of the reconstructed dhfr gene in a high ada amplification candidate clone (DG44-HA-4) yielded reproducible rates of seap gene amplification and concomitant increased levels of SEAP secretion. In contrast, random integrations of the dhfr gene into clone HA-4 did not yield these high levels of amplification. This cell line as well as this method of screening for high amplification rates may prove helpful for the reliable amplification of recombinant genes for therapeutically or diagnostically useful proteins.


Subject(s)
CHO Cells/physiology , Gene Amplification , Gene Dosage , Recombinant Proteins/genetics , Transfection/methods , Transgenes , Adenosine Deaminase/genetics , Adenosine Deaminase/metabolism , Alkaline Phosphatase/genetics , Animals , Biotechnology , Cloning, Molecular , Cricetinae , Cricetulus , Humans , Recombinant Proteins/metabolism , Tetrahydrofolate Dehydrogenase/genetics
8.
Genome Res ; 21(8): 1360-74, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21659425

ABSTRACT

We describe a comprehensive quantitative measure of the splicing impact of a complete set of RNA 6-mer sequences by deep sequencing successfully spliced transcripts. All 4096 6-mers were substituted at five positions within two different internal exons in a 3-exon minigene, and millions of successfully spliced transcripts were sequenced after transfection of human cells. The results allowed the assignment of a relative splicing strength score to each mutant molecule. The effect of 6-mers on splicing often depended on their location; much of this context effect could be ascribed to the creation of different overlapping sequences at each site. Taking these overlaps into account, the splicing effect of each 6-mer could be quantified, and 6-mers could be designated as enhancers (ESEseqs) and silencers (ESSseqs), with an ESRseq score indicating their strength. Some 6-mers exhibited positional bias relative to the two splice sites. The distribution and conservation of these ESRseqs in and around human exons supported their classification. Predicted RNA secondary structure effects were also seen: Effective enhancers, silencers and 3' splice sites tend to be single stranded, and effective 5' splice sites tend to be double stranded. 6-mers that may form positive or negative synergy with another were also identified. Chromatin structure may also influence the splicing enhancement observed, as a good correspondence was found between splicing performance and the predicted nucleosome occupancy scores of 6-mers. This approach may prove of general use in defining nucleic acid regulatory motifs, substitute for functional SELEX in most cases, and provide insights about splicing mechanisms.


Subject(s)
Exons/genetics , RNA Splicing/genetics , Chromatin/genetics , Humans , Nucleic Acid Conformation , RNA/genetics , RNA Splice Sites , Regulatory Sequences, Ribonucleic Acid
9.
RNA Biol ; 8(3): 384-8, 2011.
Article in English | MEDLINE | ID: mdl-21444999

ABSTRACT

Splicing is a crucial process in gene expression in higher organisms because: 1) most vertebrate genes contain introns; and 2) alternative splicing is primarily responsible for increasing proteomic complexity and functional diversity. Intron definition, the coordination across an intron, is a mandatory step in the splicing process. However, exon definition, the coordination across an exon, is also thought to be required for the splicing of most vertebrate exons. Recent investigations of exon definition complexes provide insights into splicing dynamics. That splicing regulators act in a context-dependent mode is supported by a large collection of evidence. Splicing contexts generally can be classified as cis-element and trans-element contexts. A widespread cis-element context is defined by co-occurring motif pairs to which splicing regulatory factors bind to direct specific molecular interactions. Splicing regulation is also coordinated by trans-element contexts as exemplified by tissue specific splicing, where alternative exons can be coordinately regulated by a few splicing factors, the expression and/or activity of which are concertedly higher or lower in the corresponding tissues.


Subject(s)
Alternative Splicing , Exons , Genome , Introns , Organ Specificity/genetics , Proteomics
10.
Genome Biol ; 11(8): R84, 2010.
Article in English | MEDLINE | ID: mdl-20704715

ABSTRACT

BACKGROUND: A very early step in splice site recognition is exon definition, a process that is as yet poorly understood. Communication between the two ends of an exon is thought to be required for this step. We report genome-wide evidence for exons being defined through the combinatorial activity of motifs located in flanking intronic regions. RESULTS: Strongly co-occurring motifs were found to specifically reside in four intronic regions surrounding a large number of human exons. These paired motifs occur around constitutive and alternative exons but not pseudo exons. Most co-occurring motifs are limited to intronic regions within 100 nucleotides of the exon. They are preferentially associated with weaker exons. Their pairing is conserved in evolution and they exhibit a lower frequency of single nucleotide polymorphism when paired. Paired motifs display specificity with respect to distance from the exon borders and in constitutive versus alternative splicing. Many resemble binding sites for heterogeneous nuclear ribonucleoproteins. Specific pairs are associated with tissue-specific genes, the higher expression of which coincides with that of the pertinent RNA binding proteins. Tested pairs acted synergistically to enhance exon inclusion, and this enhancement was found to be exon-specific. CONCLUSIONS: The exon-flanking sequence pairs identified here by genomic analysis promote exon inclusion and may play a role in the exon definition step in pre-mRNA splicing. We propose a model in which multiple concerted interactions are required between exonic sequences and flanking intronic sequences to effect exon definition.


Subject(s)
Computational Biology/methods , Exons/genetics , Introns/genetics , RNA Precursors/genetics , RNA Splicing , Algorithms , Base Sequence , Genome, Human , Genomics/methods , Humans , Models, Genetic
12.
Biotechnol Adv ; 28(6): 673-81, 2010.
Article in English | MEDLINE | ID: mdl-20416368

ABSTRACT

Demand is increasing for therapeutic biopharmaceuticals such as monoclonal antibodies. Achieving maximum production of these recombinant proteins under developmental time constraints has been a recent focus of study. The majority of these drugs are currently produced in altered Chinese hamster ovary (CHO) cells due to the high viability and the high densities achieved by these cells in suspension cultures. However, shortening the process of developing and isolating high-producing cell lines remains a challenge. This article focuses on current expression systems used to produce biopharmaceuticals in CHO cells and current methods being investigated to produce biopharmaceuticals more efficiently. The methods discussed include modified gene amplification methods, modifying vectors to improve expression of the therapeutic gene and improving the method of selecting for high-producing cells. Recent developments that use gene targeting as a method for increasing production are discussed.


Subject(s)
Genetic Engineering/methods , Genetic Vectors/genetics , Nucleic Acid Amplification Techniques/methods , Recombinant Proteins/biosynthesis , Recombinant Proteins/therapeutic use , Tetrahydrofolate Dehydrogenase/metabolism , Animals , Biopharmaceutics , CHO Cells , Cricetinae , Cricetulus , Gene Targeting
13.
RNA ; 15(3): 367-76, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19155327

ABSTRACT

Pre-messenger RNA (mRNA) splicing requires the accurate recognition of splice sites by the cellular RNA processing machinery. In addition to sequences that comprise the branchpoint and the 3' and 5' splice sites, the cellular splicing machinery relies on additional information in the form of exonic and intronic splicing enhancer and silencer sequences. The high abundance of these motifs makes it difficult to investigate their effects using standard genetic perturbations, since their disruption often leads to the formation of yet new elements. To lessen this problem, we have designed synthetic exons comprised of multiple copies of a single prototypical exonic enhancer and a single prototypical exonic silencer sequence separated by neutral spacer sequences. The spacer sequences buffer the exon against the formation of new elements as the number and order of the original elements are varied. Over 100 such designer exons were constructed by random ligation of enhancer, silencer, and neutral elements. Each exon was positioned as the central exon in a 3-exon minigene and tested for exon inclusion after transient transfection. The level of inclusion of the test exons was seen to be dependent on the provision of enhancers and could be decreased by the provision of silencers. In general, there was a good quantitative correlation between the proportion of enhancers and splicing. However, widely varying inclusion levels could be produced by different permutations of the enhancer and silencer elements, indicating that even in this simplified system splicing decisions rest on complex interplays of yet to be determined parameters.


Subject(s)
Exons , RNA Splice Sites , RNA, Messenger/metabolism , Regulatory Sequences, Ribonucleic Acid , Animals , Cricetinae , Cricetulus , Humans , RNA Splicing
14.
Cell ; 135(7): 1224-36, 2008 Dec 26.
Article in English | MEDLINE | ID: mdl-19109894

ABSTRACT

Alternative splicing makes a major contribution to proteomic diversity in higher eukaryotes with approximately 70% of genes encoding two or more isoforms. In most cases, the molecular mechanisms responsible for splice site choice remain poorly understood. Here, we used a randomization-selection approach in vitro to identify sequence elements that could silence a proximal strong 5' splice site located downstream of a weakened 5' splice site. We recovered two exonic and four intronic motifs that effectively silenced the proximal 5' splice site both in vitro and in vivo. Surprisingly, silencing was only observed in the presence of the competing upstream 5' splice site. Biochemical evidence strongly suggests that the silencing motifs function by altering the U1 snRNP/5' splice site complex in a manner that impairs commitment to specific splice site pairing. The data indicate that perturbations of non-rate-limiting step(s) in splicing can lead to dramatic shifts in splice site choice.


Subject(s)
Alternative Splicing , Gene Expression Regulation , RNA Splice Sites , Exons , Genetic Techniques , HeLa Cells , Humans , Models, Biological
15.
Genome Res ; 18(4): 533-43, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18204002

ABSTRACT

We have used comparative genomics to characterize the evolutionary behavior of predicted splicing regulatory motifs. Using base substitution rates in intronic regions as a calibrator for neutral change, we found a strong avoidance of synonymous substitutions that disrupt predicted exonic splicing enhancers or create predicted exonic splicing silencers. These results attest to the functionality of the hexameric motif set used and suggest that they are subject to purifying selection. We also found that synonymous substitutions in constitutive exons tend to create exonic splicing enhancers and to disrupt exonic splicing silencers, implying positive selection for these splicing promoting events. We present evidence that this positive selection is the result of splicing-positive events compensating for splicing-negative events as well as for mutations that weaken splice-site sequences. Such compensatory events include nonsynonymous mutations, synonymous mutations, and mutations at splice sites. Compensation was also seen from the fact that orthologous exons tend to maintain the same number of predicted splicing motifs. Our data fit a splicing compensation model of exon evolution, in which selection for splicing-positive mutations takes place to counter the effect of an ongoing splicing-negative mutational process, with the exon as a whole being conserved as a unit of splicing. In the course of this analysis, we observed that synonymous positions in general are conserved relative to intronic sequences, suggesting that messenger RNA molecules are rich in sequence information for functions beyond protein coding and splicing.


Subject(s)
Evolution, Molecular , Mutation , RNA Splicing , Regulatory Sequences, Ribonucleic Acid , Selection, Genetic , Animals , Base Sequence , Exons , Genomics , Humans , Introns , Macaca , Molecular Sequence Data , Pan troglodytes/genetics , RNA Splice Sites
16.
Adv Exp Med Biol ; 623: 85-106, 2007.
Article in English | MEDLINE | ID: mdl-18380342

ABSTRACT

Intron removal during pre-mRNA splicing in higher eukaryotes requires the accurate identification of the two splice sites at the ends of the exons, or exon definition. The sequences constituting the splice sites provide insufficient information to distinguish true splice sites from the greater number of false splice sites that populate transcripts. Additional information used for exon recognition resides in a large number of positively or negatively acting elements that lie both within exons and in the adjacent introns. The identification of such sequence motifs has progressed rapidly in recent years, such that extensive lists are now available for exonic splicing enhancers and exonic splicing silencers. These motifs have been identified both by empirical experiments and by computational predictions, the validity of the latter being confirmed by experimental verification. Molecular searches have been carried out either by the selection of sequences that bind to splicing factors, or enhance or silence splicing in vitro or in vivo. Computational methods have focused on sequences of 6 or 8 nucleotides that are over- or under-represented in exons, compared to introns or transcripts that do not undergo splicing. These various methods have sought to provide global definitions of motifs, yet the motifs are distinctive to the method used for identification and display little overlap. Astonishingly, at least three-quarters of a typical mRNA would be comprised of these motifs. A present challenge lies in understanding how the cell integrates this surfeit of information to generate what is usually a binary splicing decision.


Subject(s)
RNA Splice Sites/genetics , RNA Splicing/genetics , Animals , Exons/genetics , Humans , Introns/genetics , RNA Splice Sites/physiology , Silencer Elements, Transcriptional
17.
Proc Natl Acad Sci U S A ; 103(36): 13427-32, 2006 Sep 05.
Article in English | MEDLINE | ID: mdl-16938881

ABSTRACT

Orthologous gene structures in eight vertebrate species were compared on a genomic scale to detect the birth and maturation of new internal exons during the course of evolution. We found that 40% of new human exons are alternatively spliced, and most of these are cassette exons (exons that are either included or skipped in their entirety) with low inclusion rates. This proportion decreases steadily as older and older exons are examined, even as splicing efficiency increases. Remarkably, the great majority of new cassette exons are composed of highly repeated sequences, especially Alu. Many new cassette exons are 5' untranslated exons; the proportion that code for protein increases steadily with age. New protein-coding exons evolve at a high rate, as evidenced by the initially high substitution rates (K(s) and K(a)), as well as the SNP density compared with older exons. This dynamic picture suggests that de novo recruitment rather than shuffling is the major route by which exons are added to genes, and that species-specific repeats could play a significant role in recent evolution.


Subject(s)
Evolution, Molecular , Exons , Genome , Vertebrates/genetics , Alternative Splicing , Alu Elements , Animals , Expressed Sequence Tags , Gene Expression Profiling , Humans , Kinetics , Polymorphism, Single Nucleotide , Species Specificity , Tandem Repeat Sequences , Tissue Distribution , Transcription, Genetic
18.
Methods ; 37(4): 292-305, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16314258

ABSTRACT

The removal of introns from pre-mRNA requires as an initial event the accurate molecular recognition of the proper exon-intron borders. It is now evident that RNA sequence elements in addition to the consensus splice site sequences themselves are required for this recognition. Genomic analyses have contributed to the definition of these elements as exonic and intronic splicing enhancers and silencers, comprising what has been called the "splicing code." Many computational methods have been brought to bear in such studies. We describe here some of the methods we have used to discover functional splicing signals. What these methods have in common is a comparison of sequences in and around exons to sequences found elsewhere in the genome. We have especially made use of comparisons to "pseudo exons," intronic sequences resembling exons by virtue of being bounded by sequences indistinguishable from splice sites. Two computational strategies are emphasized: (1) the use of a machine learning technique in which a computational algorithm, a support vector machine, is first trained on known examples and then used to predict sequences associated with splicing; and (2) straight statistical analysis of differences between regions associated with exons and other regions in the genome. In most cases, the predictions made using these methods have been validated by subsequent empirical tests. An attempt has been made to make this description understandable by researchers unfamiliar with computational practice and to include practical references to specific databases and programs.


Subject(s)
Alternative Splicing/genetics , Computational Biology/methods , Artificial Intelligence , Computational Biology/statistics & numerical data , Exons , Genetic Code , Humans , Introns , Sequence Alignment , Software , Statistics as Topic
19.
Mol Cell Biol ; 25(16): 7323-32, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16055740

ABSTRACT

We have previously formulated a list of approximately 2,000 RNA octamers as putative exonic splicing enhancers (PESEs) based on a statistical comparison of human exonic and nonexonic sequences (X. H. Zhang and L. A. Chasin, Genes Dev. 18:1241-1250, 2004). When inserted into a poorly spliced test exon, all eight tested octamers stimulated splicing, a result consistent with their identification as exonic splicing enhancers (ESEs). Here we present a much more stringent test of the validity of this list of PESEs. Twenty-two naturally occurring examples of nonoverlapping PESEs or PESE clusters were identified in six mammalian exons; five of the six exons tested are constitutively spliced. Each of the 22 individual PESEs or PESE clusters was disrupted by site-directed mutagenesis, usually by a single-base substitution. Eighteen of the 22 disruptions (82%) resulted in decreased splicing efficiency. In contrast, 24 control mutations had little or no effect on splicing. This high rate of success suggests that most PESEs function as ESEs in their natural context. Like most exons, these exons contain several PESEs. Since knocking out any one of several could produce a severalfold decrease in splicing efficiency, we conclude that there is little redundancy among ESEs in an exon and that they must work in concert to optimize splicing.


Subject(s)
Gene Expression Regulation , RNA Splicing , Alternative Splicing , Base Sequence , Cell Line , Dose-Response Relationship, Drug , Enhancer Elements, Genetic , Exons , Genetic Techniques , Humans , Models, Genetic , Molecular Sequence Data , Mutagenesis, Site-Directed , Mutation , Phenotype , Polymerase Chain Reaction , RNA/metabolism , RNA Precursors/metabolism , RNA, Messenger/metabolism
20.
Genome Res ; 15(6): 768-79, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15930489

ABSTRACT

Intronic elements flanking the splice-site consensus sequences are thought to play a role in pre-mRNA splicing. However, the generality of this role, the catalog of effective sequences, and the mechanisms involved are still lacking. Using molecular genetic tests, we first showed that the approximately 50-nt intronic flanking sequences of exons beyond the splice-site consensus are generally important for splicing. We then went on to characterize exon flank sequences on a genomic scale. The G+C content of flanks displayed a bimodal distribution reflecting an exaggeration of this base composition in flanks relative to the gene as a whole. We divided all exons into two classes according to their flank G+C content and used computational and statistical methods to define pentamers of high relative abundance and phylogenetic conservation in exon flanks. Upstream pentamers were often common to the two classes, whereas downstream pentamers were totally different. Upstream and downstream pentamers were often identical around low G+C exons, and in contrast, were often complementary around high G+C exons. In agreement with this complementarity, predicted base pairing was more frequent between the flanks of high G+C exons. Pseudo exons did not exhibit this behavior, but rather tended to form base pairs between flanks and exon bodies. We conclude that most exons require signals in their immediate flanks for efficient splicing. G+C content is a sequence feature correlated with many genetic and genomic attributes. We speculate that there may be different mechanisms for splice site recognition depending on G+C content.


Subject(s)
Introns/genetics , RNA Precursors/genetics , RNA Splice Sites/genetics , RNA Splicing/genetics , Animals , Humans , Molecular Sequence Data