1.
BMC Genomics ; 8: 330, 2007 Sep 19.
Article in English | MEDLINE | ID: mdl-17880721

ABSTRACT

BACKGROUND: Soybean, Glycine max (L.) Merr., is a well-documented paleopolyploid. What remains relatively undercharacterized is the level of sequence identity in retained homeologous regions of the genome. Recently, the Department of Energy Joint Genome Institute and United States Department of Agriculture jointly announced the sequencing of the soybean genome. One initial concern is the extent to which sequence identity in homeologous regions would affect whole-genome shotgun sequence assembly. RESULTS: Seventeen BACs representing approximately 2.03 Mb were sequenced as representative potential homeologous regions from the soybean genome. Genetic mapping of each BAC shows that 11 of the 20 chromosomes are represented. Sequence comparisons between homeologous BACs show that the soybean genome is a mosaic of retained paleopolyploid regions. Some regions appear to be highly conserved while other regions have diverged significantly. Large-scale "batch" reassembly of all 17 BACs combined showed that even the most homeologous BACs with upwards of 95% sequence identity resolve into their respective homeologous sequences. Potential assembly errors were generated by tandemly duplicated pentatricopeptide repeat-containing genes and long simple sequence repeats. Analysis of a whole-genome shotgun assembly of 80,000 randomly chosen JGI-DOE sequence traces reveals some new soybean-specific repeat sequences. CONCLUSION: This analysis investigated both the structure of the paleopolyploid soybean genome and the potential effects retained homeology will have on assembling the whole-genome shotgun sequence. Based upon these results, homeologous regions similar to those characterized here will not cause major assembly issues.
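
The abstract reports sequence identity between homeologous BACs (upwards of 95%) without stating how the figure was computed. As a rough illustration only, the sketch below uses one common definition: the share of matching bases among aligned columns in which neither sequence has a gap. The function name `percent_identity`, the gap character `-`, and the choice to ignore gapped columns are assumptions for illustration, not the authors' method.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs are rows of a pairwise alignment, so they must have equal
    length. '-' marks a gap. This is one of several definitions in use;
    others count gapped columns in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a base in both sequences
    matches = 0   # compared columns where the bases agree
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0  # nothing to compare
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: nine ungapped columns, eight of them identical.
    print(percent_identity("ACGT-ACGTA", "ACGTTACGTT"))
```

Because the result depends on how gaps are treated, an identity value is only comparable across studies when the definition is stated alongside it.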


Subject(s)
Genes, Duplicate/genetics , Genome, Plant/genetics , Glycine max/genetics , Physical Chromosome Mapping/methods , Polyploidy , Repetitive Sequences, Nucleic Acid , Sequence Analysis, DNA/methods , Base Sequence/genetics , Chromosomes, Artificial, Bacterial/genetics , Chromosomes, Plant/genetics , Evolution, Molecular , Genetic Markers , Microsatellite Repeats , Phylogeny , Polymorphism, Genetic/genetics , Software , Species Specificity , Synteny/genetics
2.
Genome Biol ; 7(11): R111, 2006.
Article in English | MEDLINE | ID: mdl-17116260

ABSTRACT

The eXtensible Genome Data Broker (xGDB) provides a software infrastructure consisting of integrated tools for the storage, display, and analysis of genome features in their genomic context. Common features include gene structure annotations, spliced alignments, mapping of repetitive sequence, and microarray probes, but the software supports inclusion of any property that can be associated with a genomic location. The xGDB distribution and user support utilities are available online at the xGDB project website, http://xgdb.sourceforge.net/.
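
The idea of attaching arbitrary properties to a genomic location can be sketched as a small data structure. The code below is a hypothetical illustration and does not reflect xGDB's actual schema or API: the `Feature` class, its field names, the 1-based inclusive coordinates, and `features_in_region` are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A hypothetical genome feature: a location plus free-form properties."""
    seqid: str   # chromosome or contig name
    start: int   # 1-based, inclusive (an assumed convention)
    end: int     # 1-based, inclusive
    kind: str    # e.g. "gene", "repeat", "probe"
    attributes: dict = field(default_factory=dict)


def features_in_region(features, seqid, start, end):
    """Return features on `seqid` that overlap the interval [start, end]."""
    return [
        f for f in features
        if f.seqid == seqid and f.start <= end and f.end >= start
    ]


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    catalog = [
        Feature("chr1", 100, 500, "gene", {"name": "geneA"}),
        Feature("chr1", 450, 470, "probe", {"array": "demo"}),
        Feature("chr1", 900, 1200, "repeat"),
        Feature("chr2", 100, 500, "gene", {"name": "geneB"}),
    ]
    for hit in features_in_region(catalog, "chr1", 460, 950):
        print(hit.kind, hit.start, hit.end)
```

A linear scan like this is adequate only for small feature sets; a genome-scale system would more likely rely on database indexes or an interval tree.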


Subject(s)
Computational Biology/methods , Genome/genetics , Genomics/methods , Software , Arabidopsis/genetics , Base Sequence , Databases, Genetic , Genome, Plant/genetics , Molecular Sequence Data , RNA, Messenger/genetics , Zea mays/genetics
3.
Genetics ; 174(2): 1017-28, 2006 Oct.
Article in English | MEDLINE | ID: mdl-16888343

ABSTRACT

The paleopolyploid soybean genome was investigated by sequencing homeologous BAC clones anchored by duplicate N-hydroxycinnamoyl/benzoyltransferase (HCBT) genes. The homeologous BACs were genetically mapped to linkage groups C1 and C2. Annotation of the 173,747- and 98,760-bp BACs showed that gene conservation in both order and orientation is high between homeologous regions with only a single gene insertion/deletion and local tandem duplications differing between the regions. The nucleotide sequence conservation extends into intergenic regions as well, probably due to conserved regulatory sequences. Most of the homeologs appear to have a role in either transcription/DNA binding or cellular signaling, suggesting a potential preference for retention of duplicate genes with these functions. Reverse transcriptase-PCR analysis of homeologs showed that in the tissues sampled, most homeologs have not diverged greatly in their transcription profiles. However, four cases of changes in transcription were identified, primarily in the HCBT gene cluster. Because a mapped locus corresponds to a soybean cyst nematode (SCN) QTL, the potential role of HCBT genes in response to SCN is discussed. These results are the first sequence-based analysis of homeologous BACs in soybean, a diploidized paleopolyploid.


Subject(s)
Chromosomes, Artificial, Bacterial/genetics , Conserved Sequence , Glycine max/genetics , Sequence Homology , Transcription, Genetic , Base Sequence , Chromosome Mapping , Molecular Sequence Data
4.
Genome Biol ; 7(7): R58, 2006.
Article in English | MEDLINE | ID: mdl-16859520

ABSTRACT

Your Gene structure Annotation Tool for Eukaryotes (yrGATE) provides an Annotation Tool and Community Utilities for worldwide web-based community genome and gene annotation. Annotators can evaluate gene structure evidence derived from multiple sources to create gene structure annotations. Administrators regulate the acceptance of annotations into published gene sets. yrGATE is designed to facilitate rapid and accurate annotation of emerging genomes as well as to confirm, refine, or correct currently published annotations. yrGATE is highly portable and supports different standard input and output formats. The yrGATE software and usage cases are available at http://www.plantgdb.org/prj/yrGATE.


Subject(s)
Databases, Genetic , Eukaryotic Cells , Internet , DNA, Complementary/genetics , Exons , Expressed Sequence Tags , Genes, Plant , Plants/genetics
5.
Plant Physiol ; 139(2): 610-8, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16219921

ABSTRACT

PlantGDB (http://www.plantgdb.org/) is a database of plant molecular sequences. Expressed sequence tag (EST) sequences are assembled into contigs that represent tentative unique genes. EST contigs are functionally annotated with information derived from known protein sequences that are highly similar to the putative translation products. Tentative Gene Ontology terms are assigned to match those of the similar sequences identified. Genome survey sequences are assembled similarly. The resulting genome survey sequence contigs are matched to ESTs and conserved protein homologs to identify putative full-length open reading frame-containing genes, which are subsequently provisionally classified according to established gene family designations. For Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa), the exon-intron boundaries for gene structures are annotated by spliced alignment of ESTs and full-length cDNAs to their respective complete genome sequences. Unique genome browsers have been developed to present all available EST and cDNA evidence for current transcript models (for Arabidopsis, see the AtGDB site at http://www.plantgdb.org/AtGDB/; for rice, see the OsGDB site at http://www.plantgdb.org/OsGDB/). In addition, a number of bioinformatic tools have been integrated at PlantGDB that enable researchers to carry out sequence analyses on-site using both their own data and data residing within the database.


Subject(s)
Databases, Genetic , Genome, Plant , Plants/genetics , Computational Biology , DNA, Plant/genetics , Expressed Sequence Tags , Genomics , Repetitive Sequences, Nucleic Acid , Software
6.
Trends Plant Sci ; 10(1): 9-14, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15642518

ABSTRACT

Uncertainty and inconsistency of gene structure annotation remain limitations on research in the genome era, frustrating both biologists and bioinformaticians, who have to sort out annotation errors for their genes of interest or to generate trustworthy datasets for algorithmic development. It is unrealistic to hope for better software solutions in the near future that would solve all the problems. The issue is all the more urgent with more species being sequenced and analyzed by comparative genomics - erroneous annotations could easily propagate, whereas correct annotations in one species will greatly facilitate annotation of novel genomes. We propose a dynamic, economically feasible solution to the annotation predicament: broad-based, web-technology-enabled community annotation, a prototype of which is now in use for Arabidopsis.


Subject(s)
Databases, Factual , Genome, Plant , Arabidopsis/genetics , Base Sequence , Chromosome Mapping , Molecular Sequence Data , Sequence Alignment
7.
Nucleic Acids Res ; 32(Database issue): D354-9, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681433

ABSTRACT

PlantGDB (http://www.plantgdb.org/) is a database of molecular sequence data for all plant species with significant sequencing efforts. The database organizes EST sequences into contigs that represent tentative unique genes. Contigs are annotated and, whenever possible, linked to their respective genomic DNA. Genome sequence fragments are assembled similarly. The goal of the PlantGDB web site is to establish the basis for identifying sets of genes common to all plants or specific to particular species by integrating a number of bioinformatics tools that facilitate gene prediction and cross-species comparisons. For species with large-scale genome sequencing efforts, PlantGDB provides genome browsing capabilities that integrate all available EST and cDNA evidence for current gene models (for Arabidopsis thaliana, see the AtGDB site at http://www.plantgdb.org/AtGDB/).


Subject(s)
Computational Biology , Databases, Genetic , Genes, Plant , Genome, Plant , Expressed Sequence Tags , Genomics , Information Storage and Retrieval , Internet
8.
Plant Physiol ; 132(2): 469-84, 2003 Jun.
Article in English | MEDLINE | ID: mdl-12805580

ABSTRACT

Expressed sequence tags (ESTs) currently encompass more entries in the public databases than any other form of sequence data. Thus, EST data sets provide a vast resource for gene identification and expression profiling. We have mapped the complete set of 176,915 publicly available Arabidopsis EST sequences onto the Arabidopsis genome using GeneSeqer, a spliced alignment program incorporating sequence similarity and splice site scoring. About 96% of the available ESTs could be properly aligned with a genomic locus, with the remaining ESTs deriving from organelle genomes and non-Arabidopsis sources or displaying insufficient sequence quality for alignment. The mapping provides verified sets of EST clusters for evaluation of EST clustering programs. Analysis of the spliced alignments suggests corrections to current gene structure annotation and provides examples of alternative and non-canonical pre-mRNA splicing. All results of this study were parsed into a database and are accessible via a flexible Web interface at http://www.plantgdb.org/AtGDB/.
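
To put the alignment rate in absolute terms, the short calculation below converts the reported figures into approximate counts. The abstract gives only "about 96%", so the results are estimates rather than reported values. The constant and function names are illustrative.

```python
TOTAL_ESTS = 176_915      # total Arabidopsis ESTs, as stated in the abstract
ALIGNED_FRACTION = 0.96   # "about 96%": an approximate figure


def approximate_split(total: int, fraction: float) -> tuple[int, int]:
    """Estimate (aligned, unaligned) counts from a total and a rough fraction."""
    aligned = round(total * fraction)
    return aligned, total - aligned


if __name__ == "__main__":
    aligned, unaligned = approximate_split(TOTAL_ESTS, ALIGNED_FRACTION)
    print(f"roughly {aligned:,} ESTs aligned, roughly {unaligned:,} did not")
```

By this estimate, on the order of 170,000 ESTs aligned to a genomic locus and about 7,000 did not. The latter group corresponds to the organelle-derived, non-Arabidopsis, or low-quality sequences the abstract describes.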


Subject(s)
Arabidopsis/genetics , Genome, Plant , Alternative Splicing , Base Sequence , Chromosome Mapping , Evolution, Molecular , Expressed Sequence Tags , Molecular Sequence Data , Multigene Family , Sequence Alignment , Sequence Homology, Nucleic Acid
9.
Nucleic Acids Res ; 31(13): 3597-600, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824374

ABSTRACT

The GeneSeqer@PlantGDB Web server (http://www.plantgdb.org/cgi-bin/GeneSeqer.cgi) provides a gene structure prediction tool tailored for applications to plant genomic sequences. Predictions are based on spliced alignment with source-native ESTs and full-length cDNAs or non-native probes derived from putative homologous genes. The tool is illustrated with applications to refinement of current gene structure annotation and de novo annotation of draft genomic sequences. The service should facilitate expert annotation as a community effort by providing convenient access to all public plant sequences via the PlantGDB database, a simple four-step protocol for spliced alignment and visually appealing displays of the predicted gene structures in addition to detailed sequence alignments.


Subject(s)
Genome, Plant , Sequence Analysis, DNA/methods , Software , Arabidopsis/genetics , Computer Graphics , DNA, Complementary/chemistry , DNA, Plant/chemistry , Databases, Genetic , Expressed Sequence Tags , Gene Components , Internet , Plant Proteins/chemistry , Sequence Alignment , Sequence Analysis, Protein , Transcription, Genetic