Results 1 - 16 of 16
1.
PLoS One ; 6(6): e21524, 2011.
Article in English | MEDLINE | ID: mdl-21738689

ABSTRACT

Transcriptional regulation is an important mechanism underlying gene expression and has played a crucial role in evolution. The number, position and interactions between cis-elements and transcription factors (TFs) determine the expression pattern of a gene. To identify functionally relevant cis-elements in gene promoters, a phylogenetic shadowing approach with a lipase gene (LIP1) was used. As a proof of concept, in silico analyses of several Brassicaceae LIP1 promoters identified a highly conserved sequence (LIP1 element) that is sufficient to drive strong expression of a reporter gene in planta. A collection of ca. 1,200 Arabidopsis thaliana TF open reading frames (ORFs) was arrayed in a 96-well format (RR library) and a convenient mating based yeast one hybrid (Y1H) screening procedure was established. We constructed an episomal plasmid (pTUY1H) to clone the LIP1 element and used it as bait for Y1H screenings. A novel interaction with an HD-ZIP (AtML1) TF was identified and abolished by a 2 bp mutation in the LIP1 element. A role of this interaction in transcriptional regulation was confirmed in planta. In addition, we validated our strategy by reproducing the previously reported interaction between a MYB-CC (PHR1) TF, a central regulator of phosphate starvation responses, with a conserved promoter fragment (IPS1 element) containing its cognate binding sequence. Finally, we established that the LIP1 and IPS1 elements were differentially bound by HD-ZIP and MYB-CC family members in agreement with their genetic redundancy in planta. In conclusion, combining in silico analyses of orthologous gene promoters with Y1H screening of the RR library represents a powerful approach to decipher cis- and trans-regulatory codes.


Subject(s)
Arabidopsis Proteins/metabolism , Transcription Factors/metabolism , Arabidopsis Proteins/classification , Arabidopsis Proteins/genetics , Brassicaceae/genetics , Gene Expression Regulation, Plant/genetics , Gene Expression Regulation, Plant/physiology , Open Reading Frames/genetics , Phylogeny , Plants, Genetically Modified/genetics , Plants, Genetically Modified/metabolism , Promoter Regions, Genetic/genetics , Sulfurtransferases , Transcription Factors/classification , Transcription Factors/genetics , Two-Hybrid System Techniques
2.
Plant Methods ; 7: 8, 2011 Mar 29.
Article in English | MEDLINE | ID: mdl-21447150

ABSTRACT

BACKGROUND: In the contexts of genomics, post-genomics and systems biology approaches, data integration presents a major concern. Databases provide crucial solutions: they store, organize and allow information to be queried, they enhance the visibility of newly produced data by comparing them with previously published results, and facilitate the exploration and development of both existing hypotheses and new ideas. RESULTS: The FLAGdb++ information system was developed with the aim of using whole plant genomes as physical references in order to gather and merge available genomic data from in silico or experimental approaches. Available through a JAVA application, original interfaces and tools assist the functional study of plant genes by considering them in their specific context: chromosome, gene family, orthology group, co-expression cluster and functional network. FLAGdb++ is mainly dedicated to the exploration of large gene groups in order to decipher functional connections, to highlight shared or specific structural or functional features, and to facilitate translational tasks between plant species (Arabidopsis thaliana, Oryza sativa, Populus trichocarpa and Vitis vinifera). CONCLUSION: Combining original data with the output of experts and graphical displays that differ from classical plant genome browsers, FLAGdb++ presents a powerful complementary tool for exploring plant genomes and exploiting structural and functional resources, without the need for computer programming knowledge. First launched in 2002, a 15th version of FLAGdb++ is now available and comprises four model plant genomes and over eight million genomic features.

3.
Genome ; 53(9): 739-52, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20924423

ABSTRACT

Many transcription factor binding sites (TFBSs) involved in gene expression regulation are preferentially located relative to the transcription start site. This property is exploited in in silico prediction approaches, one of which involves studying the local overrepresentation of motifs using a sliding window to scan promoters with considerable accuracy. Nevertheless, the consequences of the choice of the sliding window size have never before been analysed. We propose an automatic adaptation of this size to each motif distribution profile. This approach allows a better characterization of the topological constraints of the motifs and the lists of genes containing them. Moreover, our approach allowed us to highlight a nonconstant frequency of occurrence of spurious motifs that could be counter-selected close to their functional area. Therefore, to improve the accuracy of in silico prediction of TFBSs and the sensitivity of the promoter cartography, we propose, in addition to automatic adaptation of window size, consideration of the nonconstant frequency of motifs in promoters.
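The sliding-window idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `motif_positions` and `window_enrichment` are hypothetical, each promoter is assumed to end exactly at the transcription start site, the window size is fixed rather than adapted per motif, and the expected count assumes a uniform background (which the abstract itself argues is a simplification).

```python
from collections import Counter


def motif_positions(promoters, motif):
    """Count motif start positions relative to the TSS across promoters.

    Assumption: every promoter string ends at the TSS, so a start index i
    in a promoter of length n maps to the negative offset i - n.
    """
    counts = Counter()
    for seq in promoters:
        n = len(seq)
        start = seq.find(motif)
        while start != -1:
            counts[start - n] += 1
            # Search again from the next base so overlapping hits are counted.
            start = seq.find(motif, start + 1)
    return counts


def window_enrichment(counts, length, window):
    """Return (window_start, observed / expected) for each sliding window.

    The expected count assumes motif starts are spread uniformly over the
    `length` promoter positions; this is a deliberate simplification.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    expected = total * window / length
    results = []
    for w_start in range(-length, -window + 1):
        observed = sum(counts.get(p, 0) for p in range(w_start, w_start + window))
        results.append((w_start, observed / expected))
    return results


if __name__ == "__main__":
    promoters = ["AAAATATAAA", "CCCCTATACC"]  # toy sequences, not real data
    counts = motif_positions(promoters, "TATA")
    for w_start, ratio in window_enrichment(counts, length=10, window=5):
        print(w_start, ratio)
```

Windows whose ratio is well above 1 mark positions where the motif is locally overrepresented; choosing `window` per motif is the step the abstract proposes to automate.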


Subject(s)
Arabidopsis/genetics , Promoter Regions, Genetic , Repetitive Sequences, Nucleic Acid , Transcription Factors/chemistry , Amino Acid Motifs , Base Sequence/genetics , Binding Sites/genetics , DNA, Plant/genetics , Gene Expression Regulation, Plant , Genes, Plant , Genome, Plant , Protein Structure, Tertiary , Regulatory Sequences, Nucleic Acid , Sequence Analysis, DNA , Transcription Initiation Site
4.
BMC Genomics ; 11: 166, 2010 Mar 12.
Article in English | MEDLINE | ID: mdl-20222994

ABSTRACT

BACKGROUND: The TATA-box and TATA-variants are regulatory elements involved in the formation of a transcription initiation complex. Both have been conserved throughout evolution in a restricted region close to the Transcription Start Site (TSS). However, less than half of the genes in model organisms studied so far have been found to contain either one of these elements. Indeed, different core-promoter elements are involved in the recruitment of the TATA-box-binding protein. Here we assessed the possibility of identifying novel functional motifs in plant genes, sharing the TATA-box topological constraints. RESULTS: We developed an ab-initio approach considering the preferential location of motifs relative to the TSS. We identified motifs observed at the TATA-box expected location and conserved in both Arabidopsis thaliana and Oryza sativa promoters. We identified TC-elements within non-TA-rich promoters 30 bases upstream of the TSS. As with the TATA-box and TATA-variant sequences, it was possible to construct a unique distance graph with the TC-element sequences. The structural and functional features of TC-element-containing genes were distinct from those of TATA-box- or TATA-variant-containing genes. Arabidopsis thaliana transcriptome analysis revealed that TATA-box-containing genes were generally those showing relatively high levels of expression and that TC-element-containing genes were generally those expressed in specific conditions. CONCLUSIONS: Our observations suggest that the TC-elements might constitute a class of novel regulatory elements participating in the complex modulation of gene expression in plants.


Subject(s)
Arabidopsis/genetics , Promoter Regions, Genetic , TATA Box , Transcription Initiation Site , Conserved Sequence , DNA, Plant/genetics , Gene Expression Regulation, Plant , Genes, Plant , Genome, Plant , Oryza/genetics , Sequence Analysis, DNA
5.
BMC Evol Biol ; 8: 291, 2008 Oct 24.
Article in English | MEDLINE | ID: mdl-18950478

ABSTRACT

BACKGROUND: The Wuschel related homeobox (WOX) family proteins are key regulators implicated in the determination of cell fate in plants by preventing cell differentiation. A recent WOX phylogeny, based on WOX homeodomains, showed that all of the Physcomitrella patens and Selaginella moellendorffii WOX proteins clustered into a single orthologous group. We hypothesized that members of this group might preferentially share a significant part of their function in phylogenetically distant organisms. Hence, we first validated the limits of the WOX13 orthologous group (WOX13 OG) using the occurrence of other clade specific signatures and conserved intron insertion sites. Secondly, a functional analysis using expression data and mutants was undertaken. RESULTS: The WOX13 OG contained the most conserved plant WOX proteins including the only WOX detected in the highly proliferating basal unicellular and photosynthetic organism Ostreococcus tauri. A large expansion of the WOX family was observed after the separation of mosses from other land plants and before monocots and dicots arose. In Arabidopsis thaliana, AtWOX13 was dynamically expressed during primary and lateral root initiation and development, in gynoecium and during embryo development. AtWOX13 appeared to affect the floral transition. An intriguing clade, represented by the functional AtWOX14 gene inside the WOX13 OG, was only found in the Brassicaceae. Compared to AtWOX13, the gene expression profile of AtWOX14 was restricted to the early stages of lateral root formation and specific to developing anthers. A mutational insertion upstream of the AtWOX14 homeodomain sequence led to abnormal root development, a delay in the floral transition and premature anther differentiation. CONCLUSION: Our data provide evidence in favor of the WOX13 OG as the clade containing the most conserved WOX genes and established a functional link to organ initiation and development in Arabidopsis, most likely by preventing premature differentiation. The future use of Ostreococcus tauri and Physcomitrella patens as biological models should allow us to obtain a better insight into the functional importance of WOX13 OG genes.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/growth & development , Arabidopsis/genetics , Conserved Sequence , Flowers/growth & development , Homeodomain Proteins/genetics , Plant Roots/growth & development , Amino Acid Motifs , Amino Acid Sequence , Arabidopsis/metabolism , Arabidopsis Proteins/chemistry , Eukaryota/genetics , Eukaryota/physiology , Flowers/metabolism , Gene Expression Profiling , Gene Expression Regulation, Plant , Gene Knockout Techniques , Gene Order , Genome, Plant/genetics , Homeodomain Proteins/chemistry , Molecular Sequence Data , Mutation , Phylogeny , Plant Roots/metabolism , Plants/classification , Plants/genetics , Promoter Regions, Genetic/genetics
6.
BMC Evol Biol ; 8: 280, 2008 Oct 10.
Article in English | MEDLINE | ID: mdl-18847470

ABSTRACT

BACKGROUND: Plant genomes contain a high proportion of duplicated genes as a result of numerous whole, segmental and local duplications. These duplications lead to the formation of gene families, which are the usual material for many evolutionary studies. However, all characterized genomes include single-copy (unique) genes that have not received much attention. Unlike gene duplication, gene loss is not an unspecific mechanism but is rather influenced by a functional selection. In this context, we have established and used stringent criteria in order to identify suitable sets of unique genes present in plant proteomes. Comparisons of unique genes in the green phylum were used to characterize the gene and protein features exhibited by both conserved and species-specific unique genes. RESULTS: We identified the unique genes within both A. thaliana and O. sativa genomes and classified them according to the number of homologs in the alternative species: none (U{1:0}), one (U{1:1}) or several (U{1:m}). Regardless of the species, all the genes in these groups present some conserved characteristics, such as small average protein size and abnormal intron number. In order to understand the origin and function of unique genes, we further characterized the U{1:1} gene pairs. The possible involvement of sequence convergence in the creation of U{1:1} pairs was discarded due to the frequent conservation of intron positions. Furthermore, an orthology relationship between the two members of each U{1:1} pair was strongly supported by a high conservation in the protein sizes and transcription levels. Within the promoter of the unique conserved genes, we found a number of TATA and TELO boxes that specifically differed from their mean number in the whole genome. Many unique genes have been conserved as unique through evolution from the green alga Ostreococcus lucimarinus to higher plants. Plant unique genes may also have homologs in bacteria and we showed a link between the targeting towards plastids of proteins encoded by plant nuclear unique genes and their homology with a bacterial protein. CONCLUSION: Many of the A. thaliana and O. sativa unique genes are conserved in plants for which the ancestor diverged at least 725 million years ago (MYA). Half of these genes are also present in other eukaryotic and/or prokaryotic species. Thus, our results indicate that (i) a strong negative selection pressure has conserved a number of genes as unique in genomes throughout evolution, (ii) most unique genes are subjected to a low divergence rate, (iii) they have some features observed in housekeeping genes but for most of them there is no functional annotation and (iv) they may have an ancient origin involving a possible gene transfer from ancestral chloroplasts or bacteria to the plant nucleus.
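The U{1:0} / U{1:1} / U{1:m} labels used in this abstract amount to a three-way classification on the number of homologs found in the other species. A minimal Python sketch follows; the function names and the gene identifiers in the usage example are hypothetical, and the stringent criteria the authors used to call a gene "unique" or to detect homologs are not reproduced here.

```python
def classify_unique_gene(homolog_count):
    """Map the number of homologs in the other species to a class label.

    0 homologs -> "U{1:0}", exactly 1 -> "U{1:1}", 2 or more -> "U{1:m}".
    """
    if homolog_count < 0:
        raise ValueError("homolog_count must be non-negative")
    if homolog_count == 0:
        return "U{1:0}"
    if homolog_count == 1:
        return "U{1:1}"
    return "U{1:m}"


def classify_all(homolog_counts):
    """Classify a mapping of gene id -> homolog count (ids are illustrative)."""
    return {gene: classify_unique_gene(n) for gene, n in homolog_counts.items()}


if __name__ == "__main__":
    # Placeholder identifiers and counts, not data from the study.
    example = {"geneA": 0, "geneB": 1, "geneC": 4}
    print(classify_all(example))
```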


Subject(s)
Arabidopsis/genetics , Evolution, Molecular , Genes, Plant , Oryza/genetics , Amino Acid Sequence , Conserved Sequence , Gene Duplication , Genome, Plant , Introns , Phylogeny , Promoter Regions, Genetic , Proteome/genetics , Selection, Genetic , Sequence Analysis, Protein , Structure-Activity Relationship , Transcription, Genetic
7.
Nucleic Acids Res ; 36(Database issue): D986-90, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17940091

ABSTRACT

CATdb is a free resource available at http://urgv.evry.inra.fr/CATdb that provides public access to a large collection of transcriptome data for Arabidopsis thaliana produced by a single Complete Arabidopsis Transcriptome Micro Array (CATMA) platform. CATMA probes consist of gene-specific sequence tags (GSTs) of 150-500 bp. The v2 version of CATMA contains 24 576 GST probes representing most of the predicted A. thaliana genes, and 615 probes tiling the chloroplastic and mitochondrial genomes. Data in CATdb are entirely processed with the same standardized protocol, from microarray printing to data analyses. CATdb contains the results of 53 projects including 1724 hybridized samples distributed between 13 different organs, 49 different developmental conditions, 45 mutants and 63 environmental conditions. All the data contained in CATdb can be downloaded from the web site and subsets of data can be sorted out and displayed by keywords, experiments, genes or lists of up to 100 genes. CATdb gives easy access to the complete description of experiments with a picture of the experiment design.


Subject(s)
Arabidopsis/genetics , Databases, Genetic , Gene Expression Profiling , Arabidopsis/metabolism , Internet , Oligonucleotide Array Sequence Analysis , Oligonucleotide Probes/chemistry , RNA, Messenger/analysis , Sequence Tagged Sites , User-Computer Interface
8.
BMC Genomics ; 8: 401, 2007 Nov 02.
Article in English | MEDLINE | ID: mdl-17980019

ABSTRACT

BACKGROUND: Since the completion of the Arabidopsis thaliana genome sequence, the Arabidopsis community and the annotation centers have been working on the improvement of gene annotation at the structural and functional levels. In this context, we have used the large CATMA resource on the Arabidopsis transcriptome to search for genes missed by different annotation processes. Probes on the CATMA microarrays are specific gene sequence tags (GSTs) based on the CDS models predicted by the Eugene software. Among the 24 576 CATMA v2 GSTs, 677 are in regions considered as intergenic by the TAIR annotation. We analyzed the cognate transcriptome data in the CATMA resource and carried out data-mining to characterize novel genes and improve gene models. RESULTS: The statistical analysis of the results of more than 500 hybridized samples distributed among 12 organs provides an experimental validation for 465 novel genes. The hybridization evidence was confirmed by RT-PCR approaches for 88% of the 465 novel genes. Comparisons with the current annotation show that these novel genes often encode small proteins, with an average size of 137 aa. Our approach has also led to the improvement of pre-existing gene models through both the extension of 16 CDS and the identification of 13 gene models erroneously composed of two merged CDS. CONCLUSION: This work is a noticeable step forward in the improvement of the Arabidopsis genome annotation. We increased the number of Arabidopsis validated genes by 465 novel transcribed genes to which we associated several functional annotations such as expression profiles, sequence conservation in plants, cognate transcripts and protein motifs.


Subject(s)
Arabidopsis/genetics , Data Interpretation, Statistical , Databases, Genetic , Gene Expression Profiling , Genes, Plant , Models, Genetic , Models, Biological
9.
Nature ; 449(7161): 463-7, 2007 Sep 27.
Article in English | MEDLINE | ID: mdl-17721507

ABSTRACT

The analysis of the first plant genomes provided unexpected evidence for genome duplication events in species that had previously been considered as true diploids on the basis of their genetics. These polyploidization events may have had important consequences in plant evolution, in particular for species radiation and adaptation and for the modulation of functional capacities. Here we report a high-quality draft of the genome sequence of grapevine (Vitis vinifera) obtained from a highly homozygous genotype. The draft sequence of the grapevine genome is the fourth one produced so far for flowering plants, the second for a woody species and the first for a fruit crop (cultivated for both fruit and beverage). Grapevine was selected because of its important place in the cultural heritage of humanity beginning during the Neolithic period. Several large expansions of gene families with roles in aromatic features are observed. The grapevine genome has not undergone recent genome duplication, thus enabling the discovery of ancestral traits and features of the genetic organization of flowering plants. This analysis reveals the contribution of three ancestral genomes to the grapevine haploid content. This ancestral arrangement is common to many dicotyledonous plants but is absent from the genome of rice, which is a monocotyledon. Furthermore, we explain the chronology of previously described whole-genome duplication events in the evolution of flowering plants.


Subject(s)
Evolution, Molecular , Genome, Plant/genetics , Polyploidy , Vitis/classification , Vitis/genetics , Arabidopsis/genetics , DNA, Intergenic/genetics , Exons/genetics , Genes, Plant/genetics , Introns/genetics , Karyotyping , MicroRNAs/genetics , Molecular Sequence Data , Oryza/genetics , Populus/genetics , RNA, Plant/genetics , RNA, Transfer/genetics , Sequence Analysis, DNA
10.
Plant Physiol ; 141(3): 825-39, 2006 Jul.
Article in English | MEDLINE | ID: mdl-16825340

ABSTRACT

In Arabidopsis (Arabidopsis thaliana) the 466 pentatricopeptide repeat (PPR) proteins are putative RNA-binding proteins with essential roles in organelles. Roughly half of the PPR proteins form the plant combinatorial and modular protein (PCMP) subfamily, which is land-plant specific. PCMPs exhibit a large and variable tandem repeat of a standard pattern of three PPR variant motifs. The association or not of this repeat with three non-PPR motifs at their C terminus defines four distinct classes of PCMPs. The highly structured arrangement of these motifs and the similar repartition of these arrangements in the four classes suggest precise relationships between motif organization and substrate specificity. This study is an attempt to reconstruct an evolutionary scenario of the PCMP family. We developed an innovative approach based on comparisons of the proteins at two levels: namely the succession of motifs along the protein and the amino acid sequence of the motifs. It enabled us to infer evolutionary relationships between proteins as well as between the inter- and intraprotein repeats. First, we observed a polarized elongation of the repeat from the C terminus toward the N-terminal region, suggesting local recombinations of motifs. Second, the most N-terminal PPR triple motif proved to evolve under different constraints than the remaining repeat. Altogether, the evidence indicates different evolution for the PPR region and the C-terminal one in PCMPs, which points to distinct functions for these regions. Moreover, local sequence homogeneity observed across PCMP classes may be due to interclass shuffling of motifs, or to deletions/insertions of non-PPR motifs at the C terminus.


Subject(s)
Arabidopsis/genetics , Evolution, Molecular , Multigene Family , RNA-Binding Proteins/genetics , Genes, Plant , Sequence Homology, Amino Acid
11.
Nucleic Acids Res ; 33(Database issue): D641-6, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608279

ABSTRACT

Genomic projects heavily depend on genome annotations and are limited by the current deficiencies in the published predictions of gene structure and function. It follows that improved annotation will allow better data mining of genomes, and more secure planning and design of experiments. The purpose of the GeneFarm project is to obtain homogeneous, reliable, documented and traceable annotations for Arabidopsis nuclear genes and gene products, and to enter them into an added-value database. This re-annotation project is being performed exhaustively on every member of each gene family. Performing a family-wide annotation makes the task easier and more efficient than a gene-by-gene approach since many features obtained for one gene can be extrapolated to some or all the other genes of a family. A complete annotation procedure based on the most efficient prediction tools available is being used by 16 partner laboratories, each contributing annotated families from its field of expertise. A database, named GeneFarm, and an associated user-friendly interface to query the annotations have been developed. More than 3000 genes distributed over 300 families have been annotated and are available at http://genoplante-info.infobiogen.fr/Genefarm/. Furthermore, collaboration with the Swiss Institute of Bioinformatics is underway to integrate the GeneFarm data into the protein knowledgebase Swiss-Prot.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Databases, Genetic , Genes, Plant , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/physiology , Philosophy , Systems Integration , User-Computer Interface
12.
Plant Cell ; 16(8): 2089-103, 2004 Aug.
Article in English | MEDLINE | ID: mdl-15269332

ABSTRACT

The complete sequence of the Arabidopsis thaliana genome revealed thousands of previously unsuspected genes, many of which cannot be ascribed even putative functions. One of the largest and most enigmatic gene families discovered in this way is characterized by tandem arrays of pentatricopeptide repeats (PPRs). We describe a detailed bioinformatic analysis of 441 members of the Arabidopsis PPR family plus genomic and genetic data on the expression (microarray data), localization (green fluorescent protein and red fluorescent protein fusions), and general function (insertion mutants and RNA binding assays) of many family members. The basic picture that arises from these studies is that PPR proteins play constitutive, often essential roles in mitochondria and chloroplasts, probably via binding to organellar transcripts. These results confirm, but massively extend, the very sparse observations previously obtained from detailed characterization of individual mutants in other organisms.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Genome, Plant , Organelles/physiology , Tandem Repeat Sequences , Amino Acid Motifs , Animals , Arabidopsis/cytology , Arabidopsis/metabolism , Computational Biology , DNA, Bacterial/genetics , Evolution, Molecular , Gene Expression Profiling , Gene Expression Regulation, Plant , Humans , Molecular Sequence Data , Multigene Family , Oligonucleotide Array Sequence Analysis , Phylogeny , Protein Structure, Tertiary , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism , Sequence Analysis, DNA , Sequence Homology, Amino Acid
13.
Nucleic Acids Res ; 32(Database issue): D347-50, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681431

ABSTRACT

FLAGdb++ is dedicated to the integration and visualization of data for high-throughput functional analysis of a fully sequenced genome, as illustrated for Arabidopsis. FLAGdb++ displays the predicted or experimental data in a position-dependent way and displays correlations and relationships between different features. FLAGdb++ provides for a given genome region, summarized characteristics of experimental materials like probe lengths, locations and specificities having an impact upon the confidence we will put in the experimental results. A selected subset of the available information is linked to a locus represented on an easy-to-interpret and memorable graphical display. Data are curated, processed and formatted before their integration into FLAGdb++. FLAGdb++ contains different options for easy back and forth navigation through many loci selected at the start of a session. It includes an original two-component visualization of the data, a genome-wide and a local view, which are permanently linked and display complementary information. Density curves along the chromosomes may be displayed in parallel for suggesting correlations between different structural and functional data. FLAGdb++ is fully accessible at http://genoplante-info.infobiogen.fr/FLAGdb/.


Subject(s)
Arabidopsis/genetics , Databases, Genetic , Genome, Plant , Chromosomes, Plant/genetics , Computational Biology , Genomics , Information Storage and Retrieval , Internet , User-Computer Interface
14.
Plant Biotechnol J ; 2(5): 401-15, 2004 Sep.
Article in English | MEDLINE | ID: mdl-17168887

ABSTRACT

The model genome of Arabidopsis thaliana contains a DEAD-box RNA helicase family (RH) of 58 members, i.e. almost twice as many as in the animal or yeast genomes. Transcript profiling using real-time quantitative polymerase chain reaction (PCR) has been obtained for 20 AtRHs from nine different organs. Two AtRHs exhibited plant-specific profiles associated with photosynthetic and sink organs. The other 18 AtRHs had the same transcript profile, and the levels of transcription of these 'housekeeping' AtRHs were under strict quantitative control over a large range of values. Transcript levels may be very different between the most recently duplicated genes. The master regulatory element in the definition of the transcript level is the simultaneous presence of a TATA-box and an intron in the 5' untranslated region (UTR). There is a positive and highly significant correlation between the size of the 5' UTR intron and the transcription level, as long as a characteristic TATA-box is present. Our work on the housekeeping AtRHs suggests a scenario for the evolution of duplicated genes, leading to both highly and poorly transcribed genes in the same terminal branch of the phylogenetic tree. The general evolutionary drive of the AtRH family, after duplication of a highly transcribed ancestral AtRH, was towards an alteration of the transcriptional activity of the divergent duplicates through successive events of suppression of the TATA-box and/or the 5' UTR intron.

15.
J Struct Funct Genomics ; 3(1-4): 111-6, 2003.
Article in English | MEDLINE | ID: mdl-12836690

ABSTRACT

Gene duplication is considered to be a source of genetic information for the creation of new functions. The Arabidopsis thaliana genome sequence revealed that a majority of plant genes belong to gene families. Regarding the problem of genes involved in the genesis of novel organs or functions during evolution, the reconstitution of the evolutionary history of gene families is of critical importance. A comparison of the intron/exon gene structure may provide clues for the understanding of the evolutionary mechanisms underlying the genesis of gene families. An extensive study of the A. thaliana genome showed that families of duplicated genes may be organized according to the number and/or density of introns and the diversity in gene structure. In this paper, we propose a genomic classification of several A. thaliana gene families based on introns in an evolutionary perspective.


Subject(s)
Evolution, Molecular , Introns , Multigene Family , Plants/genetics , Arabidopsis/genetics , Gene Duplication , Gene Transfer, Horizontal/genetics
16.
EMBO Rep ; 3(12): 1152-7, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12446565

ABSTRACT

A statistical analysis of 9000 flanking sequence tags characterizing transferred DNA (T-DNA) transformants in Arabidopsis sheds new light on T-DNA insertion by illegitimate recombination. T-DNA integration is favoured in plant DNA regions with an A-T-rich content. The formation of a short DNA duplex between the host DNA and the left end of the T-DNA sets the frame for the recombination. The sequence immediately downstream of the plant A-T-rich region is the master element for setting up the DNA duplex, and deletions into the left end of the integrated T-DNA depend on the location of a complementary sequence on the T-DNA. Recombination at the right end of the T-DNA with the host DNA involves another DNA duplex, 2-3 base pairs long, that preferentially includes a G close to the right end of the T-DNA.
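The notion of an "A-T-rich" region around an insertion site can be made concrete by computing the A+T fraction in sliding windows along a flanking sequence. The Python sketch below only illustrates that measure under assumed conventions; the function names, window size and step are hypothetical, and it does not reproduce the statistical analysis reported in the abstract.

```python
def at_fraction(seq):
    """Return the fraction of A or T bases in a DNA string (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "AT") / len(seq)


def at_profile(seq, window, step=1):
    """Return (start_index, A+T fraction) for each full window along seq."""
    return [
        (i, at_fraction(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    flank = "AATTGGCC"  # toy sequence, not a real flanking sequence tag
    for start, frac in at_profile(flank, window=4, step=2):
        print(start, frac)
```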


Subject(s)
Arabidopsis/genetics , DNA/metabolism , Plants, Genetically Modified , 3' Flanking Region , 5' Flanking Region , Arabidopsis/metabolism , Gene Transfer Techniques , Genome, Plant , Sequence Homology