Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 2 de 2
Filter
Add more filters










Database
Language
Publication year range
1.
bioRxiv ; 2023 May 16.
Article in English | MEDLINE | ID: mdl-37292896

ABSTRACT

The majority of mammalian genes encode multiple transcript isoforms that result from differential promoter use, changes in exonic splicing, and alternative 3' end choice. Detecting and quantifying transcript isoforms across tissues, cell types, and species has been extremely challenging because transcripts are much longer than the short reads normally used for RNA-seq. By contrast, long-read RNA-seq (LR-RNA-seq) gives the complete structure of most transcripts. We sequenced 264 LR-RNA-seq PacBio libraries totaling over 1 billion circular consensus reads (CCS) for 81 unique human and mouse samples. We detect at least one full-length transcript from 87.7% of annotated human protein coding genes and a total of 200,000 full-length transcripts, 40% of which have novel exon junction chains. To capture and compute on the three sources of transcript structure diversity, we introduce a gene and transcript annotation framework that uses triplets representing the transcript start site, exon junction chain, and transcript end site of each transcript. Using triplets in a simplex representation demonstrates how promoter selection, splice pattern, and 3' processing are deployed across human tissues, with nearly half of multi-transcript protein coding genes showing a clear bias toward one of the three diversity mechanisms. Evaluated across samples, the predominantly expressed transcript changes for 74% of protein coding genes. In evolution, the human and mouse transcriptomes are globally similar in types of transcript structure diversity, yet among individual orthologous gene pairs, more than half (57.8%) show substantial differences in mechanism of diversification in matching tissues. This initial large-scale survey of human and mouse long-read transcriptomes provides a foundation for further analyses of alternative transcript usage, and is complemented by short-read and microRNA data on the same samples and by epigenome data elsewhere in the ENCODE4 collection.

2.
Nucleic Acids Res ; 35(Database issue): D468-71, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17142221

ABSTRACT

The recent explosion in protein data generated from both directed small-scale studies and large-scale proteomics efforts has greatly expanded the quantity of available protein information and has prompted the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) to enhance the depth and accessibility of protein annotations. In particular, we have expanded ongoing efforts to improve the integration of experimental information and sequence-based predictions and have redesigned the protein information web pages. A key feature of this redesign is the development of a GBrowse-derived interactive Proteome Browser customized to improve the visualization of sequence-based protein information. This Proteome Browser has enabled SGD to unify the display of hidden Markov model (HMM) domains, protein family HMMs, motifs, transmembrane regions, signal peptides, hydropathy plots and profile hits using several popular prediction algorithms. In addition, a physico-chemical properties page has been introduced to provide easy access to basic protein information. Improvements to the layout of the Protein Information page and integration of the Proteome Browser will facilitate the ongoing expansion of sequence-specific experimental information captured in SGD, including post-translational modifications and other user-defined annotations. Finally, SGD continues to improve upon the availability of genetic and physical interaction data in an ongoing collaboration with BioGRID by providing direct access to more than 82,000 manually-curated interactions.


Subject(s)
Databases, Protein , Proteomics , Saccharomyces cerevisiae Proteins/chemistry , Computer Graphics , Genome, Fungal , Internet , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics , Sequence Analysis, Protein , User-Computer Interface
SELECTION OF CITATIONS
SEARCH DETAIL
...