1.
Plant Cell ; 26(2): 520-37, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24520154

ABSTRACT

Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning-based differential network analysis (mlDNA) and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. The mlDNA first used a ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive "noninformative" genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained "informative" genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing-based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress-related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes.


Subject(s)
Arabidopsis/genetics , Arabidopsis/physiology , Artificial Intelligence , Gene Expression Profiling , Gene Regulatory Networks , Stress, Physiological/genetics , Transcriptome/genetics , Databases, Genetic , Gene Expression Regulation, Plant , Genes, Plant , Genetic Association Studies , Phenotype , Signal Transduction/drug effects , Signal Transduction/genetics , Sodium Chloride/pharmacology , Software
2.
J Exp Bot ; 62(3): 1077-88, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21242318

ABSTRACT

From screening a population of Arabidopsis overexpression lines, two Arabidopsis genes were identified, EFO1 (early flowering by overexpression 1) and EFO2, that confer early flowering when overexpressed. The two genes encode putative WD-domain proteins which share high sequence similarity and constitute a small subfamily. Interestingly, the efo2-1 loss-of-function mutant also flowered earlier in short days and slightly earlier in long days than the wild type, while no flowering-time or morphological differences were observed in efo1-1 relative to the wild type. In addition, the efo2-1 mutation perturbed hypocotyl elongation, leaf expansion and formation, and stem elongation. EFO1 and EFO2 are both regulated by the circadian clock. Expression and genetic analyses revealed that EFO2 suppresses flowering largely through the action of CONSTANS (CO) and flowering locus T (FT), suggesting that EFO2 is a negative regulator of photoperiodic flowering. The growth defects in efo2-1 were augmented in efo1 efo2, but the induction of FT in the double mutant was comparable to that in efo2-1. Thus, while EFO2 acts as a floral repressor, EFO1 may not be directly involved in flowering, but the two genes do have overlapping roles in regulating other developmental processes. EFO1 and EFO2 may function collectively to serve as one of the converging points where the signals of growth and flowering intersect.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Flowers/growth & development , Gene Expression Regulation, Developmental , Amino Acid Sequence , Arabidopsis/chemistry , Arabidopsis/genetics , Arabidopsis/growth & development , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/genetics , Circadian Clocks , Flowers/genetics , Flowers/physiology , Gene Expression Regulation, Plant , Molecular Sequence Data , Protein Structure, Tertiary , Sequence Alignment
3.
BMC Genomics ; 11: 308, 2010 May 16.
Article in English | MEDLINE | ID: mdl-20470436

ABSTRACT

BACKGROUND: The third, or wobble, position in a codon provides a high degree of possible degeneracy and is an elegant fault-tolerance mechanism. Nucleotide biases between organisms at the wobble position have been documented and correlated with the abundances of the complementary tRNAs. We and others have noticed a bias for cytosine and guanine at the third position in a subset of transcripts within a single organism. The bias is present in some plant species and warm-blooded vertebrates but not in all plants, or in invertebrates or cold-blooded vertebrates. RESULTS: Here we demonstrate that in certain organisms the amount of GC at the wobble position (GC3) can be used to distinguish two classes of genes. We highlight the following features of genes with high GC3 content: they (1) provide more targets for methylation, (2) exhibit more variable expression, (3) more frequently possess upstream TATA boxes, (4) are predominant in certain classes of genes (e.g., stress responsive genes) and (5) have a GC3 content that increases from 5' to 3'. These observations led us to formulate a hypothesis to explain GC3 bimodality in grasses. CONCLUSIONS: Our findings suggest that high levels of GC3 typify a class of genes whose expression is regulated through DNA methylation or are a legacy of accelerated evolution through gene conversion. We discuss the three most probable explanations for GC3 bimodality: biased gene conversion, transcriptional and translational advantage and gene methylation.


Subject(s)
Codon/chemistry , Codon/genetics , Poaceae/genetics , Base Composition , DNA Methylation , Gene Conversion , Gene Expression Regulation, Plant , Genes, Plant/genetics , Genetic Variation , Genomics , Introns/genetics , Oryza/genetics , Sequence Homology, Nucleic Acid , Sorghum/genetics , TATA Box/genetics , Zea mays/genetics
4.
Plant Mol Biol ; 69(1-2): 179-94, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18937034

ABSTRACT

We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST).


Subject(s)
DNA, Complementary/genetics , Genes, Plant , Zea mays/genetics , Alternative Splicing , Base Sequence , DNA Primers , Expressed Sequence Tags , Promoter Regions, Genetic , Transcription, Genetic
5.
Plant Cell ; 20(8): 2130-45, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18708477

ABSTRACT

Genes controlling hormone levels have been used to increase grain yields in wheat (Triticum aestivum) and rice (Oryza sativa). We created transgenic rice plants expressing maize (Zea mays), rice, or Arabidopsis thaliana genes encoding sterol C-22 hydroxylases that control brassinosteroid (BR) hormone levels using a promoter that is active in only the stems, leaves, and roots. The transgenic plants produced more tillers and more seed than wild-type plants. The seed were heavier as well, especially the seed at the bases of the spikes that fill the least. These phenotypic changes brought about 15 to 44% increases in grain yield per plant relative to wild-type plants in greenhouse and field trials. Expression of the Arabidopsis C-22 hydroxylase in the embryos or endosperms themselves had no apparent effect on seed weight. These results suggested that BRs stimulate the flow of assimilate from the source to the sink. Microarray and photosynthesis analysis of transgenic plants revealed evidence of enhanced CO2 assimilation, enlarged glucose pools in the flag leaves, and increased assimilation of glucose to starch in the seed. These results further suggested that BRs stimulate the flow of assimilate. Plants have not been bred directly for seed filling traits, suggesting that genes that control seed filling could be used to further increase grain yield in crop plants.


Subject(s)
Oryza/metabolism , Plants, Genetically Modified/metabolism , Seeds/metabolism , Steroids, Heterocyclic/metabolism , Gene Expression Regulation, Plant , Oligonucleotide Array Sequence Analysis , Oryza/genetics , Oryza/growth & development , Photosynthesis/genetics , Photosynthesis/physiology , Plants, Genetically Modified/genetics , Plants, Genetically Modified/growth & development , Promoter Regions, Genetic , Reverse Transcriptase Polymerase Chain Reaction , Seeds/genetics , Seeds/growth & development , Signal Transduction/genetics , Signal Transduction/physiology
6.
Plant Mol Biol ; 60(1): 69-85, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16463100

ABSTRACT

Arabidopsis is currently the reference genome for higher plants. A new, more detailed statistical analysis of Arabidopsis gene structure is presented including intron and exon lengths, intergenic distances, features of promoters, and variant 5'-ends of mRNAs transcribed from the same transcription unit. We also provide a statistical characterization of Arabidopsis transcripts in terms of their size, UTR lengths, 3'-end cleavage sites, splicing variants, and coding potential. These analyses were facilitated by scrutiny of our collection of sequenced full-length cDNAs and much larger collection of 5'-ESTs, together with another set of full-length cDNAs from Salk/Stanford/Plant Gene Expression Center/RIKEN. Examples of alternative splicing are observed for transcripts from 7% of the genes and many of these genes display multiple spliced isoforms. Most splicing variants lie in non-coding regions of the transcripts. Non-canonical splice sites constitute less than 1% of all splice sites. Genes with fewer than four introns display reduced average mRNA levels. Putative alternative transcription start sites were observed in 30% of highly expressed genes and in more than 50% of the genes with low expression. Transcription start sites correlate remarkably well with a CG skew peak in the DNA sequences. The intergenic distances vary considerably, those where genes are transcribed towards one another being significantly shorter. New transcripts, missing in the current TIGR genome annotation and ESTs that are non-coding, including those antisense to known genes, are derived and cataloged in the Supplementary Material. They identify 148 new loci in the Arabidopsis genome. The conclusions drawn provide a better understanding of the Arabidopsis genome and how the gene transcripts are processed. 
The results also allow better predictions to be made for, as yet, poorly defined genes and provide a reference for comparisons with other plant genomes whose complete sequences are currently being determined. Some comparisons with rice are included in this paper.


Subject(s)
Arabidopsis/genetics , DNA, Complementary/genetics , Genes, Plant/genetics , Genome, Plant , Alternative Splicing , Base Sequence , DNA, Intergenic , DNA, Plant/genetics , Exons/genetics , Gene Expression Profiling , Gene Expression Regulation, Plant , Introns/genetics , Transcription Initiation Site
8.
Plant Physiol ; 138(4): 2033-47, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16040657

ABSTRACT

CYP51 exists in all organisms that synthesize sterols de novo. Plant CYP51 encodes an obtusifoliol 14alpha-demethylase involved in the postsqualene sterol biosynthetic pathway. According to the current gene annotation, the Arabidopsis (Arabidopsis thaliana) genome contains two putative CYP51 genes, CYP51A1 and CYP51A2. Our studies revealed that CYP51A1 should be considered an expressed pseudogene. To study the functional importance of the CYP51A2 gene in plant growth and development, we isolated T-DNA knockout alleles for CYP51A2. Loss-of-function mutants for CYP51A2 showed multiple defects, such as stunted hypocotyls, short roots, reduced cell elongation, and seedling lethality. In contrast to other sterol mutants, such as fk/hydra2 and hydra1, the cyp51A2 mutant has only minor defects in early embryogenesis. Measurements of endogenous sterol levels in the cyp51A2 mutant revealed that it accumulates obtusifoliol, the substrate of CYP51, and a high proportion of 14alpha-methyl-delta8-sterols, at the expense of campesterol and sitosterol. The cyp51A2 mutants have defects in membrane integrity and hypocotyl elongation. The defect in hypocotyl elongation was not rescued by the exogenous application of brassinolide, although the brassinosteroid-signaling cascade is apparently not affected in the mutants. Developmental defects in the cyp51A2 mutant were completely rescued by the ectopic expression of CYP51A2. Taken together, our results demonstrate that the Arabidopsis CYP51A2 gene encodes a functional obtusifoliol 14alpha-demethylase enzyme and plays an essential role in controlling plant growth and development by a sterol-specific pathway.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/genetics , Arabidopsis/physiology , Cell Membrane/genetics , Cell Membrane/physiology , Cytochrome P-450 Enzyme System/metabolism , Oxidoreductases/metabolism , Seedlings/physiology , Arabidopsis Proteins/genetics , Cholestadienols/metabolism , Cytochrome P-450 Enzyme System/genetics , Gene Expression Regulation, Developmental , Gene Expression Regulation, Plant , Molecular Sequence Data , Mutation , Oxidoreductases/genetics , Phenotype , Phytosterols/metabolism , Seedlings/genetics , Sterol 14-Demethylase
9.
Phytochemistry ; 66(7): 771-80, 2005 Apr.
Article in English | MEDLINE | ID: mdl-15797603

ABSTRACT

To shed new light on gene involvement in plant cuticular-wax production, 11 eceriferum (cer) mutants of Arabidopsis having dramatic alterations in wax composition of inflorescence stems were used to create 14 double cer mutants each with two homozygous recessive cer loci. A comprehensive analysis of stem waxes on these double mutants revealed unexpected CER gene interactions and new ideas about individual CER gene functions. Five of the 14 double cer mutants produced significantly more total wax than one of their respective cer parents, indicating from a genetic standpoint a partial bypassing (or complementation) of one cer mutation by the other. Eight of the 14 double cer mutants had alkane amounts lower than both respective cer parents, suggesting that most of these CER gene products play a major additive role in alkane synthesis. Other results suggested that some CER genes function in more than one step of the wax pathway, including those associated with sequential steps in acyl-CoA elongation. Surprisingly, complete epistasis was not observed for any of the cer gene combinations tested. Significant overlap or redundancy of genetic operations thus appears to be a central feature of wax metabolism. Future studies of CER gene product function, as well as the utilization of CER genes for crop improvement, must now account for the complex gene interactions described here.


Subject(s)
Arabidopsis/chemistry , Arabidopsis/genetics , Plant Stems/chemistry , Waxes/chemistry , Flowers/physiology , Mutation , Plant Epidermis/chemistry
10.
Funct Integr Genomics ; 5(4): 240-53, 2005 Oct.
Article in English | MEDLINE | ID: mdl-15744539

ABSTRACT

Mobile insertion elements such as transposons and T-DNA generate useful genetic variation and are important tools for functional genomics studies in plants and animals. The spectrum of mutations obtained in different systems can be highly influenced by target site preferences inherent in the mechanism of DNA integration. We investigated the target site preferences of Agrobacterium T-DNA insertions in the chromosomes of the model plant Arabidopsis thaliana. The relative frequencies of insertions in genic and intergenic regions of the genome were calculated and DNA composition features associated with the insertion site flanking sequences were identified. Insertion frequencies across the genome indicate that T-strand integration is suppressed near centromeres and rDNA loci, progressively increases towards telomeres, and is highly correlated with gene density. At the gene level, T-DNA integration events show a statistically significant preference for insertion in the 5' and 3' flanking regions of protein coding sequences as well as the promoter region of RNA polymerase I transcribed rRNA gene repeats. The increased insertion frequencies in 5' upstream regions compared to coding sequences are positively correlated with gene expression activity and DNA sequence composition. Analysis of the relationship between DNA sequence composition and gene activity further demonstrates that DNA sequences with high CG-skew ratios are consistently correlated with T-DNA insertion site preference and high gene expression. The results demonstrate genomic and gene-specific preferences for T-strand integration and suggest that DNA sequences with a pronounced transition in CG- and AT-skew ratios are preferred targets for T-DNA integration.


Subject(s)
Arabidopsis/genetics , DNA, Bacterial/genetics , DNA, Plant/genetics , Promoter Regions, Genetic , Rhizobium/genetics , Base Sequence , DNA Primers , Plants, Genetically Modified
11.
Planta ; 219(1): 5-13, 2004 May.
Article in English | MEDLINE | ID: mdl-14758476

ABSTRACT

We conducted a novel non-visual screen for cuticular wax mutants in Arabidopsis thaliana (L.) Heynh. Using gas chromatography we screened over 1,200 ethyl methane sulfonate (EMS)-mutagenized lines for alterations in the major A. thaliana wild-type stem cuticular chemicals. Five lines showed distinct differences from the wild type and were further analyzed by gas chromatography and scanning electron microscopy. The five mutants were mapped to specific chromosome locations and tested for allelism with other wax mutant loci mapping to the same region. Toward this end, the mapping of the cuticular wax (cer) mutants cer10 to cer20 was conducted to allow more efficient allelism tests with newly identified lines. From these five lines, we have identified three mutants defining novel genes that have been designated CER22, CER23, and CER24. Detailed stem and leaf chemistry has allowed us to place these novel mutants in specific steps of the cuticular wax biosynthetic pathway and to make hypotheses about the function of their gene products.


Subject(s)
Arabidopsis/genetics , Mutation , Arabidopsis/chemistry , Arabidopsis Proteins/genetics , Chromatography, Gas , Plant Leaves/chemistry , Plant Stems/chemistry , Plant Stems/ultrastructure
12.
Proc Natl Acad Sci U S A ; 100(14): 8571-6, 2003 Jul 08.
Article in English | MEDLINE | ID: mdl-12826617

ABSTRACT

The UNUSUAL FLORAL ORGANS (UFO) gene is required for multiple processes in the developing Arabidopsis flower, including the proper patterning and identity of both petals and stamens. The gene encodes an F-box-containing protein, UFO, which interacts physically and genetically with the Skp1 homolog, ASK1. In this report, we describe four ufo alleles characterized by the absence of petals, which uncover another role for UFO in promoting second whorl development. This UFO-dependent pathway is required regardless of the second whorl organ to be formed, arguing that it affects a basic process acting in parallel with those establishing organ identity. However, the pathway is dispensable in the absence of AGAMOUS (AG), a known inhibitor of petal development. In situ hybridization results argue that AG is not transcribed in the petal region, suggesting that it acts non-cell-autonomously to inhibit second whorl development in ufo mutants. These results are combined into a genetic model explaining early second whorl initiation/proliferation, in which UFO functions to inhibit an AG-dependent activity.


Subject(s)
AGAMOUS Protein, Arabidopsis/physiology , Arabidopsis Proteins/physiology , Arabidopsis/growth & development , Flowers/growth & development , Transcription Factors/physiology , Alleles , Arabidopsis/genetics , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Morphogenesis/genetics , Protein Interaction Mapping , Recombinant Fusion Proteins/metabolism , Saccharomyces cerevisiae/genetics , Sequence Deletion , Transcription Factors/chemistry , Transcription Factors/genetics , Transcription, Genetic , Two-Hybrid System Techniques
13.
Plant Physiol ; 130(3): 1506-15, 2002 Nov.
Article in English | MEDLINE | ID: mdl-12428015

ABSTRACT

Mutants defective in the biosynthesis or signaling of brassinosteroids (BRs), plant steroid hormones, display dwarfism. Loss-of-function mutants for the gene encoding the plasma membrane-located BR receptor BRI1 are resistant to exogenous application of BRs, and characterization of this protein has contributed significantly to the understanding of BR signaling. We have isolated two new BR-insensitive mutants (dwarf12-1D and dwf12-2D) after screening Arabidopsis ethyl methanesulfonate mutant populations. dwf12 mutants displayed the characteristic morphology of previously reported BR dwarfs including short stature, short round leaves, infertility, and abnormal de-etiolation. In addition, dwf12 mutants exhibited several unique phenotypes, including severe downward curling of the leaves. Genetic analysis indicates that the two mutations are semidominant in that heterozygous plants show a semidwarf phenotype whose height is intermediate between wild-type and homozygous mutant plants. Unlike BR biosynthetic mutants, dwf12 plants were not rescued by high doses of exogenously applied BRs. Like bri1 mutants, dwf12 plants accumulated castasterone and brassinolide, 43- and 15-fold higher, respectively, providing further evidence that DWF12 is a component of the BR signaling pathway that includes BRI1. Map-based cloning of the DWF12 gene revealed that DWF12 belongs to a member of the glycogen synthase kinase 3beta family. Unlike human glycogen synthase kinase 3beta, DWF12 lacks the conserved serine-9 residue in the auto-inhibitory N terminus. In addition, dwf12-1D and dwf12-2D encode changes in consecutive glutamate residues in a highly conserved TREE domain. Together with previous reports that both bin2 and ucu1 mutants contain mutations in this TREE domain, this provides evidence that the TREE domain is of critical importance for proper function of DWF12/BIN2/UCU1 in BR signal transduction pathways.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/enzymology , Genes, Dominant/genetics , Glycogen Synthase Kinase 3/genetics , Glycogen Synthase Kinase 3/metabolism , Amino Acid Sequence , Arabidopsis/genetics , Arabidopsis/growth & development , Arabidopsis Proteins/metabolism , Brassinosteroids , Cholestanols/chemistry , Cholestanols/metabolism , Glycogen Synthase Kinase 3 beta , Molecular Sequence Data , Mutation , Phenotype , Sequence Homology, Amino Acid , Steroids, Heterocyclic/chemistry , Steroids, Heterocyclic/metabolism
14.
Plant J ; 31(1): 1-12, 2002 Jul.
Article in English | MEDLINE | ID: mdl-12100478

ABSTRACT

Recent studies on jasmonic acid (JA) biosynthetic mutants have shown that jasmonates play essential roles in pollen maturation and dehiscence and wound-induced defence against biotic attacks. To better understand the biosynthetic mechanisms of this essential plant hormone, we isolated an Arabidopsis knock-out mutant defective in the JA biosynthetic gene CYP74A (allene oxide synthase, AOS) using reverse genetics screening methods. This enzyme catalyses dehydration of the hydroperoxide to an unstable allene oxide in the JA biosynthetic pathway. Endogenous JA levels, which increase 100-fold 1 h after wounding in wild-type plants, do not increase after wounding in the aos mutant. In addition, the mutant showed severe male sterility due to defects in anther and pollen development. The male-sterile phenotype was completely rescued by exogenous application of methyl jasmonate and by complementation with constitutive expression of the AOS gene. RT-PCR analysis showed that the induction of transcripts for vegetative storage protein and lipoxygenase genes, previously shown to be inducible by wound and jasmonate application in the wild-type, was absent in the aos mutant. In transgenic plants constitutively expressing AOS, wound-induced JA levels were 50-100% higher compared to wild-type plants. Taken together with JA deficiency in the aos mutant, our results show that AOS is critical for the biosynthesis of all biologically active jasmonates. Our results also suggest that AOS expression is limiting JA levels in wounded plants, but that the AOS hydroperoxide substrate levels, controlled by upstream enzymes (lipoxygenase and phospholipase), determine JA levels in unwounded plants.


Subject(s)
Arabidopsis/genetics , Arabidopsis/physiology , Cyclopentanes/metabolism , Intramolecular Oxidoreductases/genetics , Base Sequence , DNA, Plant/genetics , Gene Expression , Gene Targeting , Genes, Plant , Genetic Complementation Test , Intramolecular Oxidoreductases/physiology , Mutation , Oxylipins , Phenotype , Plants, Genetically Modified , Reproduction , Signal Transduction
15.
Genome Biol ; 3(6): RESEARCH0029, 2002.
Article in English | MEDLINE | ID: mdl-12093376

ABSTRACT

BACKGROUND: Annotation of eukaryotic genomes is a complex endeavor that requires the integration of evidence from multiple, often contradictory, sources. With the ever-increasing amount of genome sequence data now available, methods for accurate identification of large numbers of genes have become urgently needed. In an effort to create a set of very high-quality gene models, we used the sequence of 5,000 full-length gene transcripts from Arabidopsis to re-annotate its genome. We have mapped these transcripts to their exact chromosomal locations and, using alignment programs, have created gene models that provide a reference set for this organism. RESULTS: Approximately 35% of the transcripts indicated that previously annotated genes needed modification, and 5% of the transcripts represented newly discovered genes. We also discovered that multiple transcription initiation sites appear to be much more common than previously known, and we report numerous cases of alternative mRNA splicing. We include a comparison of different alignment software and an analysis of how the transcript data improved the previously published annotation. CONCLUSIONS: Our results demonstrate that sequencing of large numbers of full-length transcripts followed by computational mapping greatly improves identification of the complete exon structures of eukaryotic genes. In addition, we are able to find numerous introns in the untranslated regions of the genes.


Subject(s)
Arabidopsis/genetics , Genome, Plant , RNA, Messenger/genetics , Alternative Splicing/genetics , Computational Biology , Databases, Genetic , Exons/genetics , Genes, Plant/genetics , RNA Splicing/genetics , RNA, Messenger/classification , RNA, Plant/classification , RNA, Plant/genetics