Results 1 - 20 of 23
1.
NAR Genom Bioinform ; 4(1): lqac013, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35211671

ABSTRACT

We introduce a new framework for genome analyses based on parsing an annotated genome assembly into distinct interval loci (iLoci), available as open-source software as part of the AEGeAn Toolkit (https://github.com/BrendelGroup/AEGeAn). We demonstrate that iLoci provide an alternative coordinate system that is robust to changes in assembly and annotation versions and facilitates granular quality control of genome data. We discuss how statistics computed on iLoci reflect various characteristics of genome content and organization and illustrate how these statistics can be used to establish a baseline for assessment of the completeness and accuracy of the data. We also introduce a well-defined measure of relative genome compactness and compute other iLocus statistics that reveal genome-wide characteristics of gene arrangements in the whole genome context. Given the fast pace of assembly/annotation updates, our AEGeAn Toolkit fills a niche in computational genomics based on deriving persistent and species-specific genome statistics. Gene structure model-centric iLoci provide a precisely defined coordinate system that can be used to store assembly/annotation updates that reflect either stable or changed assessments. Large-scale application of the approach revealed species- and clade-specific genome organization in precisely defined computational terms, promising intriguing forays into the forces of shaping genome structure as more and more genome assemblies are being deposited.
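The idea of parsing an annotated assembly into distinct interval loci can be illustrated with a minimal sketch. It is not the AEGeAn implementation: it assumes a simplified model in which overlapping gene intervals are merged into one "genic" locus and the gaps between them become "intergenic" loci. The function name `partition_into_loci` and the two labels are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end)


def partition_into_loci(seq_len: int, genes: List[Interval]) -> List[Tuple[int, int, str]]:
    """Split a sequence of length seq_len into genic and intergenic loci.

    Simplifying assumption: overlapping gene intervals are merged into a
    single genic locus; everything else is intergenic.
    """
    # Merge overlapping gene intervals after sorting by start coordinate.
    merged: List[List[int]] = []
    for start, end in sorted(genes):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    # Walk along the sequence, emitting the gaps and the merged gene blocks.
    loci: List[Tuple[int, int, str]] = []
    pos = 1
    for start, end in merged:
        if start > pos:
            loci.append((pos, start - 1, "intergenic"))
        loci.append((start, end, "genic"))
        pos = end + 1
    if pos <= seq_len:
        loci.append((pos, seq_len, "intergenic"))
    return loci


loci = partition_into_loci(1000, [(100, 300), (250, 400), (700, 800)])
```

Because every position of the sequence falls into exactly one locus, per-locus statistics (length, gene count) can be compared across annotation versions, which is the property the abstract emphasizes.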

2.
Mol Ecol Resour ; 22(4): 1656-1674, 2022 May.
Article in English | MEDLINE | ID: mdl-34861105

ABSTRACT

DNA methylation is a common epigenetic signalling tool and an important biological process which is widely studied in a large array of species. The presence, level and function of DNA methylation vary greatly across species. In some insects, DNA methylation systems are minimal, and overall methylation rates tend to be low in all studied insect species. Low methylation levels probed by whole-genome bisulphite sequencing require great care with respect to data quality control and interpretation. Here, we introduce BWASP/R, a complete workflow that allows efficient, scalable and entirely reproducible analyses of raw DNA methylation sequencing data. Consistent application of quality control filters and analysis parameters provides fair comparisons among different studies and an integrated view of all experiments on one species. We describe the capabilities of the BWASP/R workflow by re-analysing several publicly available social insect WGBS data sets, comprising 70 samples and cumulatively 147 replicates from four different species. We show that the CpG methylome comprises only about 1.5% of CpG sites in the honeybee genome and that the cumulative data are consistent with genetic signatures of site accessibility and physiological control of methylation levels.


Subject(s)
DNA Methylation , Epigenomics , Animals , CpG Islands/genetics , Epigenomics/methods , High-Throughput Nucleotide Sequencing/methods , Insecta/genetics , Sequence Analysis, DNA/methods , Whole Genome Sequencing/methods
3.
NAR Genom Bioinform ; 3(2): lqab051, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34250478

ABSTRACT

Heterogeneity in transcription initiation has important consequences for transcript stability and translation, and shifts in transcription start site (TSS) usage are prevalent in various developmental, metabolic, and disease contexts. Accordingly, numerous methods for global TSS profiling have been developed, including most recently Survey of TRanscription Initiation at Promoter Elements with high-throughput sequencing (STRIPE-seq), a method to profile transcription start sites (TSSs) on a genome-wide scale with significant cost and time savings compared to previous methods. In anticipation of more widespread adoption of STRIPE-seq and related methods for construction of promoter atlases and studies of differential gene expression, we built TSRexploreR, an R package for end-to-end analysis of TSS mapping data. TSRexploreR provides functions for TSS and transcription start region (TSR) detection, normalization, correlation, visualization, and differential TSS/TSR analyses. TSRexploreR is highly interoperable, accepting the data structures of TSS and TSR sets generated by several existing tools for processing and alignment of TSS mapping data, such as CAGEr for Cap Analysis of Gene Expression (CAGE) data. Lastly, TSRexploreR implements a novel approach for the detection of shifts in TSS distribution.
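Detection of transcription start regions (TSRs) from individual TSS positions can be sketched as a distance-based clustering step. The following is an illustrative Python sketch only; TSRexploreR itself is an R package and its functions are not reproduced here. The function name `cluster_tss` and the `max_gap` threshold are assumptions made for this example.

```python
def cluster_tss(positions, max_gap=25):
    """Group TSS positions from one strand into clusters (candidate TSRs).

    Neighbouring positions at most max_gap bases apart join the same
    cluster. Returns (start, end, n_tss) for each cluster.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)  # extend the current cluster
        else:
            clusters.append([pos])    # start a new cluster
    return [(c[0], c[-1], len(c)) for c in clusters]


tsrs = cluster_tss([100, 105, 110, 400, 410, 900], max_gap=25)
```

A real analysis would also weight each position by its read count and apply a threshold before clustering; this sketch omits both.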

4.
Plant Commun ; 2(3): 100164, 2021 05 10.
Article in English | MEDLINE | ID: mdl-34027391

ABSTRACT

Many plant disease resistance (R) genes function specifically in reaction to the presence of cognate effectors from a pathogen. Xanthomonas oryzae pathovar oryzae (Xoo) uses transcription activator-like effectors (TALes) to target specific rice genes for expression, thereby promoting host susceptibility to bacterial blight. Here, we report the molecular characterization of Xa7, the cognate R gene to the TALes AvrXa7 and PthXo3, which target the rice major susceptibility gene SWEET14. Xa7 was mapped to a unique 74-kb region. Gene expression analysis of the region revealed a candidate gene that contained a putative AvrXa7 effector binding element (EBE) in its promoter and encoded a 113-amino-acid peptide of unknown function. Genome editing at the Xa7 locus rendered the plants susceptible to avrXa7-carrying Xoo strains. Both AvrXa7 and PthXo3 activated a GUS reporter gene fused with the EBE-containing Xa7 promoter in Nicotiana benthamiana. The EBE of Xa7 is a close mimic of the EBE of SWEET14 for TALe-induced disease susceptibility. Ectopic expression of Xa7 triggers cell death in N. benthamiana. Xa7 is prevalent in indica rice accessions from 3000 rice genomes. Xa7 appears to be an adaptation that protects against pathogen exploitation of SWEET14 and disease susceptibility.


Subject(s)
Gene Expression Regulation, Plant , Genes, vpr , Oryza/genetics , Plant Diseases/microbiology , Plant Proteins/genetics , Xanthomonas/physiology , Amino Acid Sequence , Base Sequence , Disease Resistance/genetics , Oryza/metabolism , Oryza/microbiology , Plant Breeding , Plant Proteins/chemistry , Plant Proteins/metabolism , Sequence Alignment , Xanthomonas/genetics
5.
Genome Res ; 30(6): 910-923, 2020 06.
Article in English | MEDLINE | ID: mdl-32660958

ABSTRACT

Accurate mapping of transcription start sites (TSSs) is key for understanding transcriptional regulation. However, current protocols for genome-wide TSS profiling are laborious and/or expensive. We present Survey of TRanscription Initiation at Promoter Elements with high-throughput sequencing (STRIPE-seq), a simple, rapid, and cost-effective protocol for sequencing capped RNA 5' ends from as little as 50 ng total RNA. Including depletion of uncapped RNA and reaction cleanups, a STRIPE-seq library can be constructed in about 5 h. We show application of STRIPE-seq to TSS profiling in yeast and human cells and show that it can also be effectively used for quantification of transcript levels and analysis of differential gene expression. In conjunction with our ready-to-use computational workflows, STRIPE-seq is a straightforward, efficient means by which to probe the landscape of transcriptional initiation.


Subject(s)
Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA , Transcription Initiation, Genetic , Transcriptome , Cluster Analysis , Computational Biology/methods , Gene Expression Profiling/methods , Gene Ontology , High-Throughput Nucleotide Sequencing/methods , Humans , Promoter Regions, Genetic , Sequence Analysis, RNA/methods , Transcription Initiation Site , Yeasts/genetics
6.
BMC Bioinformatics ; 20(1): 371, 2019 Jul 02.
Article in English | MEDLINE | ID: mdl-31266441

ABSTRACT

BACKGROUND: The falling cost of next-generation sequencing technology has allowed deep sequencing across related species and of individuals within species. Whole genome assemblies from these data remain highly time- and resource-consuming computational tasks, particularly if best solutions are sought using different assembly strategies and parameter sets. However, in many cases, the underlying research questions are not genome-wide but rather target specific genes or sets of genes. We describe a novel assembly tool, SRAssembler, that efficiently assembles only contigs containing potential homologs of a gene or protein query, thus enabling gene-specific genome studies over large numbers of short read samples. RESULTS: We demonstrate the functionality of SRAssembler with examples largely drawn from plant genomics. The workflow implements a recursive strategy by which relevant reads are successively pulled from the input sets based on overlapping significant matches, resulting in virtual chromosome walking. The typical workflow behavior is illustrated with assembly of simulated reads. Applications to real data show that SRAssembler produces homologous contigs of equivalent quality to whole genome assemblies. Settings can be chosen to not only assemble presumed orthologs but also paralogous gene loci in distinct contigs. A key application is assembly of the same locus in many individuals from population genome data, which provides assessment of structural variation beyond what can be inferred from read mapping to a reference genome alone. SRAssembler can be used on modest computing resources or used in parallel on high performance computing clusters (most easily by invoking a dedicated Singularity image). CONCLUSIONS: SRAssembler offers an efficient tool to complement whole genome assembly software. It can be used to solve gene-specific research questions based on large genomic read samples from multiple sources and would be an expedient choice when whole genome assembly from the reads is either not feasible, too costly, or unnecessary. The program can also aid decision making on the depth of sequencing in an ongoing novel genome sequencing project or with respect to ultimate whole genome assembly strategies.
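The recursive strategy described above, in which reads are pulled in successive rounds based on overlap with what has already been collected, can be sketched roughly as follows. This is a toy illustration and not SRAssembler's code: it substitutes shared k-mers for the significant alignment matches the abstract mentions, and it skips the contig assembly performed between rounds. The names `recruit_reads`, `k`, and `max_rounds` are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit_reads(seed, reads, k=8, max_rounds=10):
    """Iteratively collect reads that share a k-mer with the seed or with
    any read recruited in an earlier round ("virtual chromosome walking")."""
    known = kmers(seed, k)
    selected = []
    remaining = list(reads)
    for _ in range(max_rounds):
        new = [r for r in remaining if kmers(r, k) & known]
        if not new:
            break  # no further reads overlap the current set
        selected.extend(new)
        remaining = [r for r in remaining if r not in new]
        for r in new:
            known |= kmers(r, k)  # newly recruited reads extend the walk
    return selected


seed = "AAAACCCC"
reads = ["TATATATA", "GGGGTTTT", "CCCCGGGG"]
walk = recruit_reads(seed, reads, k=4)
```

In this example the second recruited read shares no k-mer with the seed itself; it is reached only through the read recruited in the first round, which is the walking behaviour the abstract describes.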


Subject(s)
Genomics/methods , Software , Arabidopsis/genetics , Genetic Loci , Genome, Plant , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA
7.
Mol Ecol ; 28(8): 1975-1993, 2019 04.
Article in English | MEDLINE | ID: mdl-30809873

ABSTRACT

Social insects provide systems for studying epigenetic regulation of phenotypes, particularly with respect to differentiation of reproductive and worker castes, which typically arise from a common genetic background. The role of gene expression in caste specialization has been extensively studied, but the role of DNA methylation remains controversial. Here, we perform well replicated, integrated analyses of DNA methylation and gene expression in brains of an ant (Formica exsecta) with distinct female castes using traditional approaches (tests of differential methylation) combined with a novel approach (analysis of co-expression and co-methylation networks). We found differences in expression and methylation profiles between workers and queens at different life stages, as well as some overlap between DNA methylation and expression at the functional level. Large portions of the transcriptome and methylome are organized into "modules" of genes, some significantly associated with phenotypic traits of castes and developmental stages. Several gene co-expression modules are preserved in co-methylation networks, consistent with possible regulation of caste-specific gene expression by DNA methylation. Surprisingly, brain co-expression modules were highly preserved when compared with a previous study that examined whole-body co-expression patterns in 16 ant species, suggesting that these modules are evolutionarily conserved and for specific functions in various tissues. Altogether, these results suggest that DNA methylation participates in regulation of caste specialization and age-related physiological changes in social insects.


Subject(s)
Ants/genetics , Behavior, Animal , DNA Methylation/genetics , Epigenesis, Genetic , Animals , Ants/growth & development , Brain/growth & development , Brain/metabolism , Female , Gene Expression Regulation, Developmental/genetics , Male , Phenotype , Reproduction/genetics , Transcriptome , Wasps/genetics
8.
Methods Mol Biol ; 1858: 99-116, 2019.
Article in English | MEDLINE | ID: mdl-30414114

ABSTRACT

Application of Transcription Start Site (TSS) profiling technologies, coupled with large-scale next-generation sequencing (NGS), has yielded valuable insights into the location, structure, and activity of promoters across diverse metazoan model systems. In insects, TSS profiling has been used to characterize the promoter architecture of Drosophila melanogaster (Hoskins et al., Genome Res 21(2):182-192, 2011) and subsequently was employed to reveal widespread transposon-driven alternative promoter usage in the fruit fly (Batut et al., Genome Res 23:169-180, 2012). In this chapter we discuss the computational analysis of the experimental data derived from one of the TSS profiling methods, RAMPAGE (RNA Annotation and Mapping of Promoters for Analysis of Gene Expression), which can be used for the precise, quantitative identification of promoters in insect genomes. We demonstrate this using the software tools GoRAMPAGE (Brendel and Raborn, GoRAMPAGE-A workflow for promoter detection by 5'-read mapping. https://github.com/BrendelGroup/GoRAMPAGE , 2016) and TSRchitect (Raborn and Brendel, TSRchitect: promoter identification from large-scale TSS profiling data. R Bioconductor package version 1.8.0 [Online]. Available: http://bioconductor.org/packages/release/bioc/html/TSRchitect.html , 2017), providing detailed instructions with the aim of taking the user from raw reads to processed results.


Subject(s)
Computational Biology/methods , Drosophila melanogaster/genetics , Genome, Insect , Molecular Sequence Annotation/methods , Promoter Regions, Genetic , Sequence Analysis, DNA/methods , Software , Animals , High-Throughput Nucleotide Sequencing/methods , Transcription Initiation Site
9.
New Phytol ; 220(3): 659-660, 2018 11.
Article in English | MEDLINE | ID: mdl-30324739
10.
Genetics ; 204(2): 593-612, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27585846

ABSTRACT

Large-scale transcription start site (TSS) profiling produces a high-resolution, quantitative picture of transcription initiation and core promoter locations within a genome. However, application of TSS profiling to date has largely been restricted to a small set of prominent model systems. We sought to characterize the cis-regulatory landscape of the water flea Daphnia pulex, an emerging model arthropod that reproduces both asexually (via parthenogenesis) and sexually (via meiosis). We performed Cap Analysis of Gene Expression (CAGE) with RNA isolated from D. pulex within three developmental states: sexual females, asexual females, and males. Identified TSSs were utilized to generate a "Daphnia Promoter Atlas," i.e., a catalog of active promoters across the surveyed states. Analysis of the distribution of promoters revealed evidence for widespread alternative promoter usage in D. pulex, in addition to a prominent fraction of compactly-arranged promoters in divergent orientations. We carried out de novo motif discovery using CAGE-defined TSSs and identified eight candidate core promoter motifs; this collection includes canonical promoter elements (e.g., TATA and Initiator) in addition to others lacking obvious orthologs. A comparison of promoter activities found evidence for considerable state-specific differential gene expression between states. Our work represents the first global definition of transcription initiation and promoter architecture in crustaceans. The Daphnia Promoter Atlas presented here provides a valuable resource for comparative study of cis-regulatory regions in metazoans, as well as for investigations into the circuitries that underpin meiosis and parthenogenesis.


Subject(s)
Daphnia/genetics , Meiosis/genetics , Promoter Regions, Genetic , Transcription, Genetic , Animals , Daphnia/growth & development , Female , Gene Expression Regulation, Developmental , Male , Parthenogenesis/genetics , Sex Characteristics , Transcription Initiation Site
11.
Plant Cell ; 28(4): 840-54, 2016 04.
Article in English | MEDLINE | ID: mdl-27020957

ABSTRACT

Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today's pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant's Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching.


Subject(s)
Software , Computational Biology , Genome, Plant/genetics , Workflow
12.
Mol Ecol ; 25(8): 1769-84, 2016 04.
Article in English | MEDLINE | ID: mdl-26859767

ABSTRACT

Comparative genomics of social insects has been intensely pursued in recent years with the goal of providing insights into the evolution of social behaviour and its underlying genomic and epigenomic basis. However, the comparative approach has been hampered by a paucity of data on some of the most informative social forms (e.g. incipiently and primitively social) and taxa (especially members of the wasp family Vespidae) for studying social evolution. Here, we provide a draft genome of the primitively eusocial model insect Polistes dominula, accompanied by analysis of caste-related transcriptome and methylome sequence data for adult queens and workers. Polistes dominula possesses a fairly typical hymenopteran genome, but shows very low genomewide GC content and some evidence of reduced genome size. We found numerous caste-related differences in gene expression, with evidence that both conserved and novel genes are related to caste differences. Most strikingly, these -omics data reveal a major reduction in one of the major epigenetic mechanisms that has been previously suggested to be important for caste differences in social insects: DNA methylation. Along with a conspicuous loss of a key gene associated with environmentally responsive DNA methylation (the de novo DNA methyltransferase Dnmt3), these wasps have greatly reduced genomewide methylation to almost zero. In addition to providing a valuable resource for comparative analysis of social insect evolution, our integrative -omics data for this important behavioural and evolutionary model system call into question the general importance of DNA methylation in caste differences and evolution in social insects.


Subject(s)
DNA Methylation , Genome, Insect , Social Behavior , Transcriptome , Wasps/genetics , Animals , Behavior, Animal , Female , Male
13.
Mol Biol Evol ; 31(3): 605-13, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24356560

ABSTRACT

The high frequency of alternative splicing among the serine/arginine-rich (SR) family of proteins in plants has been linked to important roles in gene regulation during development and in response to environmental stress. In this article, we have searched and manually annotated all the SR proteins in the genomes of maize and sorghum. The experimental validation of gene structure by reverse transcription-polymerase chain reaction (RT-PCR) analysis revealed, with few exceptions, that SR genes produced multiple isoforms of transcripts by alternative splicing. Despite sharing high structural similarity and conserved positions of the introns, the profile of alternative splicing diverged significantly between maize and sorghum for the vast majority of SR genes. These include many transcript isoforms discovered by RT-PCR and not represented in extant expressed sequence tag (EST) collection. However, we report the occurrence of various maize and sorghum SR mRNA isoforms that display evolutionary conservation of splicing events with their homologous SR genes in Arabidopsis and moss. Our data also indicate an important role of both 5' and 3' untranslated regions in the regulation of SR gene expression. These observations have potentially important implications for the processes of evolution and adaptation of plants to land.


Subject(s)
Alternative Splicing/genetics , Conserved Sequence/genetics , Gene Expression Profiling , Gene Expression Regulation, Plant , Nuclear Proteins/genetics , Plant Proteins/genetics , RNA-Binding Proteins/genetics , 3' Untranslated Regions/genetics , Amino Acid Sequence , Arabidopsis/genetics , Bryopsida/genetics , Evolution, Molecular , Exons/genetics , Genetic Variation , Introns/genetics , Molecular Sequence Data , Nuclear Proteins/chemistry , Plant Proteins/chemistry , Plant Proteins/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA-Binding Proteins/chemistry , Sequence Homology, Nucleic Acid , Serine-Arginine Splicing Factors , Sorghum/genetics , Zea mays/genetics
14.
PLoS One ; 8(10): e77846, 2013.
Article in English | MEDLINE | ID: mdl-24204993

ABSTRACT

The STING (stimulator of interferon genes) protein can bind cyclic dinucleotides to activate the production of type I interferons and inflammatory cytokines. The cyclic dinucleotides can be bacterial second messengers c-di-GMP and c-di-AMP, 3'5'-3'5' cyclic GMP-AMP (3'3' cGAMP) produced by Vibrio cholerae and metazoan second messenger 2'5'-3'5' cyclic GMP-AMP (2'3' cGAMP). Analysis of single nucleotide polymorphism (SNP) data from the 1000 Genomes Project revealed that R71H-G230A-R293Q (HAQ) occurs in 20.4%, R232H in 13.7%, G230A-R293Q (AQ) in 5.2%, and R293Q in 1.5% of the human population. In the absence of exogenous ligands, the R232H, R293Q and AQ SNPs had only a modest effect on the stimulation of IFN-β and NF-κB promoter activities in HEK293T cells, while HAQ had significantly lower intrinsic activity. The decrease was primarily due to the R71H substitution. The SNPs also affected the response to the cyclic dinucleotides. In the presence of c-di-GMP, the R232H variant partially decreased the ability to activate IFN-β signaling, while it was defective for the response to c-di-AMP and 3'3' cGAMP. The R293Q variant dramatically decreased the stimulatory response to all bacterial ligands. Surprisingly, the AQ and HAQ variants maintained partial abilities to activate IFN-β signaling in the presence of ligands, due primarily to the G230A substitution. Biochemical analysis revealed that the recombinant G230A protein could affect the conformation of the C-terminal domain of STING and the binding to c-di-GMP. Comparison of the G230A structure with that of WT revealed that the conformation of the lid region that clamps onto the c-di-GMP was significantly altered. These results suggest that hSTING variation can affect innate immune signaling and that the common HAQ haplotype expresses a STING protein with reduced intrinsic signaling activity that retains the ability to respond to bacterial cyclic dinucleotides.


Subject(s)
Cyclic GMP/analogs & derivatives , Immunity, Innate/genetics , Interferon Type I/metabolism , Membrane Proteins/genetics , Polymorphism, Single Nucleotide/genetics , Amino Acid Sequence , Blotting, Western , Calorimetry, Differential Scanning , Cyclic GMP/pharmacology , HEK293 Cells , Humans , Interferon Regulatory Factor-3/metabolism , Luciferases/metabolism , Membrane Proteins/immunology , Molecular Sequence Data , NF-kappa B/genetics , NF-kappa B/metabolism , Phosphorylation , Phylogeny , Polymerase Chain Reaction , Promoter Regions, Genetic/genetics , Protein Conformation , Sequence Homology, Amino Acid , Signal Transduction
15.
BMC Bioinformatics ; 13: 187, 2012 Aug 01.
Article in English | MEDLINE | ID: mdl-22852583

ABSTRACT

BACKGROUND: Accurate gene structure annotation is a fundamental but somewhat elusive goal of genome projects, as witnessed by the fact that (model) genomes typically undergo several cycles of re-annotation. In many cases, it is not only different versions of annotations that need to be compared but also different sources of annotation of the same genome, derived from distinct gene prediction workflows. Such comparisons are of interest to annotation providers, prediction software developers, and end-users, who all need to assess what is common and what is different among distinct annotation sources. We developed ParsEval, a software application for pairwise comparison of sets of gene structure annotations. ParsEval calculates several statistics that highlight the similarities and differences between the two sets of annotations provided. These statistics are presented in an aggregate summary report, with additional details provided as individual reports specific to non-overlapping, gene-model-centric genomic loci. Genome browser styled graphics embedded in these reports help visualize the genomic context of the annotations. Output from ParsEval is both easily read and parsed, enabling systematic identification of problematic gene models for subsequent focused analysis. RESULTS: ParsEval is capable of analyzing annotations for large eukaryotic genomes on typical desktop or laptop hardware. In comparison to existing methods, ParsEval exhibits a considerable performance improvement, both in terms of runtime and memory consumption. Reports from ParsEval can provide relevant biological insights into the gene structure annotations being compared. CONCLUSIONS: Implemented in C, ParsEval provides the quickest and most feature-rich solution for genome annotation comparison to date. The source code is freely available (under an ISC license) at http://parseval.sourceforge.net/.


Subject(s)
Computational Biology/methods , Molecular Sequence Annotation/methods , Software , Databases, Genetic , Genomics/methods , Humans
16.
Nucleic Acids Res ; 40(Web Server issue): W117-22, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22693217

ABSTRACT

Transcription activator-like (TAL) effectors are repeat-containing proteins used by plant pathogenic bacteria to manipulate host gene expression. Repeats are polymorphic and individually specify single nucleotides in the DNA target, with some degeneracy. A TAL effector-nucleotide binding code that links repeat type to specified nucleotide enables prediction of genomic binding sites for TAL effectors and customization of TAL effectors for use in DNA targeting, in particular as custom transcription factors for engineered gene regulation and as site-specific nucleases for genome editing. We have developed a suite of web-based tools called TAL Effector-Nucleotide Targeter 2.0 (TALE-NT 2.0; https://boglab.plp.iastate.edu/) that enables design of custom TAL effector repeat arrays for desired targets and prediction of TAL effector binding sites, ranked by likelihood, in a genome, promoterome or other sequence of interest. Search parameters can be set by the user to work with any TAL effector or TAL effector nuclease architecture. Applications range from designing highly specific DNA targeting tools and identifying potential off-target sites to predicting effector targets important in plant disease.
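The binding code that links repeat type to nucleotide can be illustrated with a small sketch. It is not the TALE-NT 2.0 scoring method, which ranks candidate sites by likelihood; here each repeat is reduced to its repeat-variable diresidue (RVD) and a site counts as a hit only if every position matches. The table uses the four most commonly cited RVD-nucleotide associations (NI:A, HD:C, NG:T, NN:G or A) and ignores degeneracy beyond that. The name `find_sites` is hypothetical.

```python
# Simplified RVD-to-nucleotide associations (assumption: strict matching,
# no weighting of the known degeneracy).
RVD_CODE = {"NI": "A", "HD": "C", "NG": "T", "NN": "GA"}


def find_sites(rvds, sequence):
    """Return 0-based start positions where every RVD in the array
    matches the corresponding nucleotide of the sequence."""
    seq = sequence.upper()
    n = len(rvds)
    hits = []
    for i in range(len(seq) - n + 1):
        if all(seq[i + j] in RVD_CODE[rvd] for j, rvd in enumerate(rvds)):
            hits.append(i)
    return hits


hits = find_sites(["NI", "HD", "NG", "NN"], "TTACTGGACTA")
```

Strict matching keeps the sketch short; a ranked search would instead score each window by per-position match probabilities and report the best-scoring sites.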


Subject(s)
DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , Software , Trans-Activators/chemistry , Trans-Activators/metabolism , Algorithms , Binding Sites , DNA/chemistry , DNA/metabolism , Internet , Protein Engineering , Repetitive Sequences, Amino Acid , Sequence Analysis, DNA , User-Computer Interface
17.
J Bacteriol ; 193(19): 5450-64, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21784931

ABSTRACT

Xanthomonas is a large genus of bacteria that collectively cause disease on more than 300 plant species. The broad host range of the genus contrasts with stringent host and tissue specificity for individual species and pathovars. Whole-genome sequences of Xanthomonas campestris pv. raphani strain 756C and X. oryzae pv. oryzicola strain BLS256, pathogens that infect the mesophyll tissue of the leading models for plant biology, Arabidopsis thaliana and rice, respectively, were determined and provided insight into the genetic determinants of host and tissue specificity. Comparisons were made with genomes of closely related strains that infect the vascular tissue of the same hosts and across a larger collection of complete Xanthomonas genomes. The results suggest a model in which complex sets of adaptations at the level of gene content account for host specificity and subtler adaptations at the level of amino acid or noncoding regulatory nucleotide sequence determine tissue specificity.


Subject(s)
Genome, Bacterial/genetics , Xanthomonas/genetics , Arabidopsis/microbiology , Molecular Sequence Data , Oryza/microbiology , Xanthomonas/physiology
18.
Nucleic Acids Res ; 39(Web Server issue): W528-32, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21546552

ABSTRACT

The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet.


Subject(s)
Genomics/methods , Software , Databases, Genetic , Internet , Systems Integration , Workflow
19.
Plant Cell ; 22(6): 1667-85, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20581308

ABSTRACT

The maize (Zea mays) transposable element Dissociation (Ds) was mobilized for large-scale genome mutagenesis and to study its endogenous biology. Starting from a single donor locus on chromosome 10, over 1500 elements were distributed throughout the genome and positioned on the maize physical map. Genetic strategies to enrich for both local and unlinked insertions were used to distribute Ds insertions. Global, regional, and local insertion site trends were examined. We show that Ds transposed to both linked and unlinked sites and displayed a nonuniform distribution on the genetic map around the donor r1-sc:m3 locus. Comparison of Ds and Mutator insertions reveals distinct target preferences, which provide functional complementarity of the two elements for gene tagging in maize. In particular, Ds displays a stronger preference for insertions within exons and introns, whereas Mutator insertions are more enriched in promoters and 5'-untranslated regions. Ds has no strong target site consensus sequence, but we identified properties of the DNA molecule inherent to its local structure that may influence Ds target site selection. We discuss the utility of Ds for forward and reverse genetics in maize and provide evidence that genes within a 2- to 3-centimorgan region flanking Ds insertions will serve as optimal targets for regional mutagenesis.


Subject(s)
DNA Transposable Elements , Genome, Plant , Zea mays/genetics , Chromosome Mapping , Chromosomes, Plant , DNA, Plant/genetics , Mutagenesis, Insertional , Sequence Analysis, DNA
20.
Brief Bioinform ; 10(6): 631-44, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19933210

ABSTRACT

The residence of spliceosomal introns within protein-coding genes can fluctuate over time, with genes gaining, losing or conserving introns in a complex process that is not entirely understood. One approach for studying intron evolution is to compare introns with respect to position and type within closely related genes. Here, we describe new, freely available software called Common Introns Within Orthologous Genes (CIWOG), available at http://ciwog.gdcb.iastate.edu/, which detects common introns in protein-coding genes based on position and sequence conservation in the corresponding protein alignments. CIWOG provides dynamic web displays that facilitate detailed intron studies within orthologous genes. User-supplied options control how introns are clustered into sets of common introns. CIWOG also identifies special classes of introns, in particular those with GC- or U12-type donor sites, which enables analyses of these introns in relation to their counterparts in the other genes in orthologous groups. The software is demonstrated with application to a comprehensive study of eight plant transcriptomes. Three specific examples are discussed: intron class conversion from GT- to GC-donor-type introns in monocots, plant U12-type intron conservation and a global analysis of intron evolution across the eight plant species.


Subject(s)
Algorithms , Chromosome Mapping/methods , DNA, Plant/genetics , Genome, Plant/genetics , Internet , Introns/genetics , Software