Search | VHL Regional Portal

1.

JBrowse: a dynamic web platform for genome visualization and analysis.

Buels, Robert; Yao, Eric; Diesh, Colin M; Hayes, Richard D; Munoz-Torres, Monica; Helt, Gregg; Goodstein, David M; Elsik, Christine G; Lewis, Suzanna E; Stein, Lincoln; Holmes, Ian H.

Genome Biol ; 17: 66, 2016 Apr 12.

Article in English | MEDLINE | ID: mdl-27072794

ABSTRACT

BACKGROUND: JBrowse is a fast and full-featured genome browser built with JavaScript and HTML5. It is easily embedded into websites or apps but can also be served as a standalone web page. RESULTS: Overall improvements to speed and scalability are accompanied by specific enhancements that support complex interactive queries on large track sets. Analysis functions can readily be added using the plugin framework; most visual aspects of tracks can also be customized, along with clicks, mouseovers, menus, and popup boxes. JBrowse can also be used to browse local annotation files offline and to generate high-resolution figures for publication. CONCLUSIONS: JBrowse is a mature web application suitable for genome visualization and analysis.

Subject(s)

Genomics/methods , Databases, Genetic , Genome , User-Computer Interface , Web Browser

2.

Web Apollo: a web-based genomic annotation editing platform.

Lee, Eduardo; Helt, Gregg A; Reese, Justin T; Munoz-Torres, Monica C; Childers, Chris P; Buels, Robert M; Stein, Lincoln; Holmes, Ian H; Elsik, Christine G; Lewis, Suzanna E.

Genome Biol ; 14(8): R93, 2013 Aug 30.

Article in English | MEDLINE | ID: mdl-24000942

ABSTRACT

Web Apollo is the first instantaneous, collaborative genomic annotation editor available on the web. One of the natural consequences following from current advances in sequencing technology is that there are more and more researchers sequencing new genomes. These researchers require tools to describe the functional features of their newly sequenced genomes. With Web Apollo researchers can use any of the common browsers (for example, Chrome or Firefox) to jointly analyze and precisely describe the features of a genome in real time, whether they are in the same room or working from opposite sides of the world.

Subject(s)

Chromosome Mapping/statistics & numerical data , Genome , Molecular Sequence Annotation/statistics & numerical data , Software , Animals , Birds , Cattle , Databases, Genetic , Genomics , Insecta , Internet , Plants , Sequence Analysis, DNA

3.

Genoviz Software Development Kit: Java tool kit for building genomics visualization applications.

Helt, Gregg A; Nicol, John W; Erwin, Ed; Blossom, Eric; Blanchard, Steven G; Chervitz, Stephen A; Harmon, Cyrus; Loraine, Ann E.

BMC Bioinformatics ; 10: 266, 2009 Aug 25.

Article in English | MEDLINE | ID: mdl-19706180

ABSTRACT

BACKGROUND: Visualization software can expose previously undiscovered patterns in genomic data and advance biological science. RESULTS: The Genoviz Software Development Kit (SDK) is an open source, Java-based framework designed for rapid assembly of visualization software applications for genomics. The Genoviz SDK framework provides a mechanism for incorporating adaptive, dynamic zooming into applications, a desirable feature of genome viewers. Visualization capabilities of the Genoviz SDK include automated layout of features along genetic or genomic axes; support for user interactions with graphical elements (Glyphs) in a map; a variety of Glyph sub-classes that promote experimentation with new ways of representing data in graphical formats; and support for adaptive, semantic zooming, whereby objects change their appearance depending on zoom level and zooming rate adapts to the current scale. Freely available demonstration and production quality applications, including the Integrated Genome Browser, illustrate Genoviz SDK capabilities. CONCLUSION: Separation between graphics components and genomic data models makes it easy for developers to add visualization capability to pre-existing applications or build new applications using third-party data models. Source code, documentation, sample applications, and tutorials are available at http://genoviz.sourceforge.net/.

Subject(s)

Genomics/methods , Image Interpretation, Computer-Assisted/methods , Programming Languages , Software , Computer Graphics , Databases, Factual , Information Storage and Retrieval/methods , User-Computer Interface

4.

The Integrated Genome Browser: free software for distribution and exploration of genome-scale datasets.

Nicol, John W; Helt, Gregg A; Blanchard, Steven G; Raja, Archana; Loraine, Ann E.

Bioinformatics ; 25(20): 2730-1, 2009 Oct 15.

Article in English | MEDLINE | ID: mdl-19654113

ABSTRACT

Experimental techniques that survey an entire genome demand flexible, highly interactive visualization tools that can display new data alongside foundation datasets, such as reference gene annotations. The Integrated Genome Browser (IGB) aims to meet this need. IGB is an open source, desktop graphical display tool implemented in Java that supports real-time zooming and panning through a genome; layout of genomic features and datasets in moveable, adjustable tiers; incremental or genome-scale data loading from remote web servers or local files; and dynamic manipulation of quantitative data via genome graphs. AVAILABILITY: The application and source code are available from http://igb.bioviz.org and http://genoviz.sourceforge.net.

Subject(s)

Computational Biology/methods , Genome , Genomics/methods , Information Storage and Retrieval/methods , Software , Animals , Databases, Genetic , Humans

5.

RNA maps reveal new RNA classes and a possible function for pervasive transcription.

Kapranov, Philipp; Cheng, Jill; Dike, Sujit; Nix, David A; Duttagupta, Radharani; Willingham, Aarron T; Stadler, Peter F; Hertel, Jana; Hackermüller, Jörg; Hofacker, Ivo L; Bell, Ian; Cheung, Evelyn; Drenkow, Jorg; Dumais, Erica; Patel, Sandeep; Helt, Gregg; Ganesh, Madhavan; Ghosh, Srinka; Piccolboni, Antonio; Sementchenko, Victor; Tammana, Hari; Gingeras, Thomas R.

Science ; 316(5830): 1484-8, 2007 Jun 08.

Article in English | MEDLINE | ID: mdl-17510325

ABSTRACT

Significant fractions of eukaryotic genomes give rise to RNA, much of which is unannotated and has reduced protein-coding potential. The genomic origins and the associations of human nuclear and cytosolic polyadenylated RNAs longer than 200 nucleotides (nt) and whole-cell RNAs less than 200 nt were investigated in this genome-wide study. Subcellular addresses for nucleotides present in detected RNAs were assigned, and their potential processing into short RNAs was investigated. Taken together, these observations suggest a novel role for some unannotated RNAs as primary transcripts for the production of short RNAs. Three potentially functional classes of RNAs have been identified, two of which are syntenically conserved and correlate with the expression state of protein-coding genes. These data support a highly interleaved organization of the human transcriptome.

Subject(s)

Genome, Human , RNA Precursors/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA/genetics , Transcription, Genetic , Animals , Cell Line, Tumor , Cell Nucleus/metabolism , Cytosol/metabolism , Exons , Gene Expression , Genome , HeLa Cells , Humans , Mice , Promoter Regions, Genetic , RNA/metabolism , RNA Precursors/metabolism , Synteny , Terminator Regions, Genetic

6.

Examples of the complex architecture of the human transcriptome revealed by RACE and high-density tiling arrays.

Kapranov, Philipp; Drenkow, Jorg; Cheng, Jill; Long, Jeffrey; Helt, Gregg; Dike, Sujit; Gingeras, Thomas R.

Genome Res ; 15(7): 987-97, 2005 Jul.

Article in English | MEDLINE | ID: mdl-15998911

ABSTRACT

Recently, we mapped the sites of transcription across approximately 30% of the human genome and elucidated the structures of several hundred novel transcripts. In this report, we describe a novel combination of techniques including the rapid amplification of cDNA ends (RACE) and tiling array technologies that was used to further characterize transcripts in the human transcriptome. This technical approach allows for several important pieces of information to be gathered about each array-detected transcribed region, including strand of origin, start and termination positions, and the exonic structures of spliced and unspliced coding and noncoding RNAs. In this report, the structures of transcripts from 14 transcribed loci, representing both known genes and unannotated transcripts taken from the several hundred randomly selected unannotated transcripts described in our previous work are represented as examples of the complex organization of the human transcriptome. As a consequence of this complexity, it is not unusual that a single base pair can be part of an intricate network of multiple isoforms of overlapping sense and antisense transcripts, the majority of which are unannotated. Some of these transcripts follow the canonical splicing rules, whereas others combine the exons of different genes or represent other types of noncanonical transcripts. These results have important implications concerning the correlation of genotypes to phenotypes, the regulation of complex interlaced transcriptional patterns, and the definition of a gene.

Subject(s)

Nucleic Acid Amplification Techniques , Oligonucleotide Array Sequence Analysis , Transcription, Genetic , Cell Line , Gene Expression Profiling , Humans , Jurkat Cells , Models, Genetic , Molecular Sequence Data , Nucleic Acid Amplification Techniques/methods , Oligonucleotide Array Sequence Analysis/methods , Protein Isoforms/genetics , Tumor Cells, Cultured

7.

Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution.

Cheng, Jill; Kapranov, Philipp; Drenkow, Jorg; Dike, Sujit; Brubaker, Shane; Patel, Sandeep; Long, Jeffrey; Stern, David; Tammana, Hari; Helt, Gregg; Sementchenko, Victor; Piccolboni, Antonio; Bekiranov, Stefan; Bailey, Dione K; Ganesh, Madhavan; Ghosh, Srinka; Bell, Ian; Gerhard, Daniela S; Gingeras, Thomas R.

Science ; 308(5725): 1149-54, 2005 May 20.

Article in English | MEDLINE | ID: mdl-15790807

ABSTRACT

Sites of transcription of polyadenylated and nonpolyadenylated RNAs for 10 human chromosomes were mapped at 5-base pair resolution in eight cell lines. Unannotated, nonpolyadenylated transcripts comprise the major proportion of the transcriptional output of the human genome. Of all transcribed sequences, 19.4, 43.7, and 36.9% were observed to be polyadenylated, nonpolyadenylated, and bimorphic, respectively. Half of all transcribed sequences are found only in the nucleus and for the most part are unannotated. Overall, the transcribed portions of the human genome are predominantly composed of interlaced networks of both poly A+ and poly A- annotated transcripts and unannotated transcripts of unknown function. This organization has important implications for interpreting genotype-phenotype associations, regulation of gene expression, and the definition of a gene.

Subject(s)

Chromosomes, Human/genetics , Genome, Human , RNA, Messenger/analysis , Transcription, Genetic , Cell Line , Cell Line, Tumor , Cell Nucleus/metabolism , Chromosomes, Human, Pair 13/genetics , Chromosomes, Human, Pair 14/genetics , Chromosomes, Human, Pair 19/genetics , Chromosomes, Human, Pair 20/genetics , Chromosomes, Human, Pair 21/genetics , Chromosomes, Human, Pair 22/genetics , Chromosomes, Human, Pair 6/genetics , Chromosomes, Human, Pair 7/genetics , Chromosomes, Human, X/genetics , Chromosomes, Human, Y/genetics , Computational Biology , Cytosol/metabolism , DNA, Complementary , DNA, Intergenic , Exons , Female , Humans , Introns , Male , Molecular Sequence Data , Nucleic Acid Amplification Techniques , Oligonucleotide Array Sequence Analysis , Physical Chromosome Mapping , RNA Splicing

8.

Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22.

Kampa, Dione; Cheng, Jill; Kapranov, Philipp; Yamanaka, Mark; Brubaker, Shane; Cawley, Simon; Drenkow, Jorg; Piccolboni, Antonio; Bekiranov, Stefan; Helt, Gregg; Tammana, Hari; Gingeras, Thomas R.

Genome Res ; 14(3): 331-42, 2004 Mar.

Article in English | MEDLINE | ID: mdl-14993201

ABSTRACT

In this report, we have achieved a richer view of the transcriptome for Chromosomes 21 and 22 by using high-density oligonucleotide arrays on cytosolic poly(A)+ RNA. Conservatively, only 31.4% of the observed transcribed nucleotides correspond to well-annotated genes, whereas an additional 4.8% and 14.7% correspond to mRNAs and ESTs, respectively. Approximately 85% of the known exons were detected, and up to 21% of known genes have only a single isoform based on exon-skipping alternative expression. Overall, the expression of the well-characterized exons falls predominately into two categories, uniquely or ubiquitously expressed with an identifiable proportion of antisense transcripts. The remaining observed transcription (49.0%) was outside of any known annotation. These novel transcripts appear to be more cell-line-specific and have lower and less variation in expression than the well-characterized genes. Novel transcripts were further characterized based on their distance to annotations, transcript size, coding capacity, and identification as antisense to intronic sequences. By RT-PCR, 126 novel transcripts were independently verified, resulting in a 65% verification rate. These observations strongly support the argument for a re-evaluation of the total number of human genes and an alternative term for "gene" to encompass these growing, novel classes of RNA transcripts in the human genome.

Subject(s)

Chromosomes, Human, Pair 21/genetics , Chromosomes, Human, Pair 22/genetics , RNA/genetics , Transcription, Genetic/genetics , Cell Line , Cell Line, Tumor , Chromosome Mapping/methods , DNA, Neoplasm/genetics , Gene Expression Profiling/methods , Genes/genetics , Genes, Neoplasm/genetics , Humans , Jurkat Cells/chemistry , Jurkat Cells/metabolism , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotide Probes/genetics , RNA, Messenger/genetics

9.

Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs.

Cawley, Simon; Bekiranov, Stefan; Ng, Huck H; Kapranov, Philipp; Sekinger, Edward A; Kampa, Dione; Piccolboni, Antonio; Sementchenko, Victor; Cheng, Jill; Williams, Alan J; Wheeler, Raymond; Wong, Brant; Drenkow, Jorg; Yamanaka, Mark; Patel, Sandeep; Brubaker, Shane; Tammana, Hari; Helt, Gregg; Struhl, Kevin; Gingeras, Thomas R.

Cell ; 116(4): 499-509, 2004 Feb 20.

Article in English | MEDLINE | ID: mdl-14980218

ABSTRACT

Using high-density oligonucleotide arrays representing essentially all nonrepetitive sequences on human chromosomes 21 and 22, we map the binding sites in vivo for three DNA binding transcription factors, Sp1, cMyc, and p53, in an unbiased manner. This mapping reveals an unexpectedly large number of transcription factor binding site (TFBS) regions, with a minimal estimate of 12,000 for Sp1, 25,000 for cMyc, and 1600 for p53 when extrapolated to the full genome. Only 22% of these TFBS regions are located at the 5' termini of protein-coding genes while 36% lie within or immediately 3' to well-characterized genes and are significantly correlated with noncoding RNAs. A significant number of these noncoding RNAs are regulated in response to retinoic acid, and overlapping pairs of protein-coding and noncoding RNAs are often coregulated. Thus, the human genome contains roughly comparable numbers of protein-coding and noncoding genes that are bound by common transcription factors and regulated by common environmental signals.

Subject(s)

Chromosomes, Human, Pair 21 , Chromosomes, Human, Pair 22 , Transcription Factors/metabolism , Amino Acid Motifs , Binding Sites , Cell Line , Chromatin/metabolism , Chromosome Mapping , CpG Islands , Exons , Expressed Sequence Tags , Genome, Human , Humans , Jurkat Cells , Models, Genetic , Polymerase Chain Reaction , Precipitin Tests , Promoter Regions, Genetic , Protein Binding , RNA/chemistry , RNA/metabolism , RNA, Messenger/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Tretinoin/metabolism

10.

Exploring alternative transcript structure in the human genome using blocks and InterPro.

Loraine, Ann E; Helt, Gregg A; Cline, Melissa S; Siani-Rose, Michael A.

J Bioinform Comput Biol ; 1(2): 289-306, 2003 Jul.

Article in English | MEDLINE | ID: mdl-15290774

ABSTRACT

Understanding how alternative splicing affects gene function is an important challenge facing modern-day molecular biology. Using homology-based, protein sequence analysis methods, it should be possible to investigate how transcript diversity impacts protein function. To test this, high-quality exon-intron structures were deduced for over 8000 human genes, including over 1300 (17 percent) that produce multiple transcript variants. A data mining technique (DiffMotif) was developed to identify genes in which transcript variation coincides with changes in conserved motifs between variants. Applying this method, we found that 30 percent of the multi-variant genes in our test set exhibited a differential profile of conserved InterPro and/or BLOCKS motifs across different mRNA variants. To investigate these, a visualization tool (ProtAnnot) that displays amino acid motifs in the context of genomic sequence was developed. Using this tool, genes revealed by the DiffMotif method were analyzed, and when possible, hypotheses regarding the potential role of alternative transcript structure in modulating gene function were developed. Examples of these, including: MEOX1, a homeobox-containing protein; AIRE, involved in auto-immune disease; PLAT, tissue type plasminogen activator; and CD79b, a component of the B-cell receptor complex, are presented. These results demonstrate that amino acid motif databases like BLOCKS and InterPro are useful tools for investigating how alternative transcript structure affects gene function.

Subject(s)

Alternative Splicing/genetics , Chromosome Mapping/methods , Databases, Protein , Genome, Human , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Transcription Factors/genetics , Algorithms , Amino Acid Motifs/genetics , Conserved Sequence , Gene Expression Regulation/genetics , Genetic Variation , Humans , Proteins/chemistry , Proteins/genetics , Structure-Activity Relationship

11.

Visualizing the genome: techniques for presenting human genome data and annotations.

Loraine, Ann E; Helt, Gregg A.

BMC Bioinformatics ; 3: 19, 2002 Jul 30.

Article in English | MEDLINE | ID: mdl-12149135

ABSTRACT

BACKGROUND: In order to take full advantage of the newly available public human genome sequence data and associated annotations, biologists require visualization tools ("genome browsers") that can accommodate the high frequency of alternative splicing in human genes and other complexities. RESULTS: In this article, we describe visualization techniques for presenting human genomic sequence data and annotations in an interactive, graphical format. These techniques include: one-dimensional, semantic zooming to show sequence data alongside gene structures; color-coding exons to indicate frame of translation; adjustable, moveable tiers to permit easier inspection of a genomic scene; and display of protein annotations alongside gene structures to show how alternative splicing impacts protein structure and function. These techniques are illustrated using examples from two genome browser applications: the Neomorphic GeneViewer annotation tool and ProtAnnot, a prototype viewer which shows protein annotations in the context of genomic sequence. CONCLUSION: By presenting techniques for visualizing genomic data, we hope to provide interested software developers with a guide to what features are most likely to meet the needs of biologists as they seek to make sense of the rapidly expanding body of public genomic data and annotations.

Subject(s)

Computational Biology/methods , Computational Biology/statistics & numerical data , Computer Graphics/trends , Database Management Systems/trends , Genome, Human , Alternative Splicing/genetics , Base Sequence/genetics , Base Sequence/physiology , Chromosome Mapping/methods , Databases, Genetic/classification , Databases, Genetic/trends , Exons/genetics , Exons/physiology , Genes/genetics , Genes/physiology , Human Genome Project , Humans , Introns/genetics , Introns/physiology , Protein Isoforms/classification , Protein Isoforms/genetics , Protein Isoforms/physiology , Proteins/classification , Proteins/genetics , Proteins/physiology , Sequence Analysis, DNA

12.

Protein-based analysis of alternative splicing in the human genome.

Loraine, Ann E; Helt, Gregg A; Cline, Melissa S; Siani-Rose, Michael A.

Proc IEEE Comput Soc Bioinform Conf ; 1: 118-24, 2002.

Article in English | MEDLINE | ID: mdl-15838129

ABSTRACT

Understanding the functional significance of alternative splicing and other mechanisms that generate RNA transcript diversity is an important challenge facing modern-day molecular biology. Using homology-based, protein sequence analysis methods, it should be possible to investigate how transcript diversity impacts protein structure and function. To test this, a data mining technique ("DiffHit") was developed to identify and catalog genes producing protein isoforms which exhibit distinct profiles of conserved protein motifs. We found that out of a test set of over 1,300 alternatively spliced genes with solved genomic structure, over 30% exhibited a differential profile of conserved InterPro and/or Blocks protein motifs across distinct isoforms. These results suggest that motif databases such as Blocks and InterPro are potentially useful tools for investigating how alternative transcript structure affects gene function.

Subject(s)

Alternative Splicing/genetics , Databases, Protein , Genome, Human , Information Storage and Retrieval/methods , Proteome/genetics , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Algorithms , Database Management Systems , Evolution, Molecular , Gene Expression Profiling/methods , Humans , Protein Isoforms/chemistry , Protein Isoforms/genetics , Proteome/chemistry , Sequence Homology, Amino Acid , Transcription, Genetic/genetics

13.

Visualization techniques for genomic data.

Loraine, Ann E; Helt, Gregg A.

Proc IEEE Comput Soc Bioinform Conf ; 1: 321-6, 2002.

Article in English | MEDLINE | ID: mdl-15838148

ABSTRACT

In order to take full advantage of the newly available public human genome sequence data and associated annotations, biologists require visualization tools that can accommodate the high frequency of alternative splicing in human genes and other complexities. In this article, we describe techniques for presenting human genomic sequence data and annotations in an interactive, graphical format, with the aim of providing developers with a guide to what features are most likely to meet biologists' needs. These techniques include: one-dimensional semantic zooming to show sequence data alongside gene structures; moveable, adjustable tiers; visual encoding of translation frame to show how alternative transcript structure affects encoded proteins; and display of protein domains in the context of genomic sequence to show how alternative splicing impacts protein structure and function.

Subject(s)

Chromosome Mapping/methods , Computer Graphics , Database Management Systems , Information Storage and Retrieval/methods , Sequence Analysis, DNA/methods , User-Computer Interface , Algorithms , Genome , Software

