1.
Nucleic Acids Res ; 37(Database issue): D755-61, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18996895

ABSTRACT

The UCSC Genome Browser Database (GBD, http://genome.ucsc.edu) is a publicly available collection of genome assembly sequence data and integrated annotations for a large number of organisms, including extensive comparative-genomic resources. In the past year, 13 new genome assemblies have been added, including two important primate species, orangutan and marmoset, bringing the total to 46 assemblies for 24 different vertebrates and 39 assemblies for 22 different invertebrate animals. The GBD datasets may be viewed graphically with the UCSC Genome Browser, which uses a coordinate-based display system allowing users to juxtapose a wide variety of data. These data include all mRNAs from GenBank mapped to all organisms, RefSeq alignments, gene predictions, regulatory elements, gene expression data, repeats, SNPs and other variation data, as well as pairwise and multiple-genome alignments. A variety of other bioinformatics tools are also provided, including BLAT, the Table Browser, the Gene Sorter, the Proteome Browser, VisiGene and Genome Graphs.


Subject(s)
Databases, Nucleic Acid , Genomics , Animals , Chromosome Mapping , Computer Graphics , Gene Expression , Genetic Variation , Humans , RNA, Messenger/chemistry , Software , User-Computer Interface
2.
Nucleic Acids Res ; 36(Database issue): D773-9, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18086701

ABSTRACT

The University of California, Santa Cruz, Genome Browser Database (GBD) provides integrated sequence and annotation data for a large collection of vertebrate and model organism genomes. Seventeen new assemblies have been added to the database in the past year, for a total coverage of 19 vertebrate and 21 invertebrate species as of September 2007. For each assembly, the GBD contains a collection of annotation data aligned to the genomic sequence. Highlights of this year's additions include a 28-species human-based vertebrate conservation annotation, an enhanced UCSC Genes set, and more human variation, MGC, and ENCODE data. The database is optimized for fast interactive performance with a set of web-based tools that may be used to view, manipulate, filter and download the annotation data. New toolset features include the Genome Graphs tool for displaying genome-wide data sets, session saving and sharing, better custom track management, expanded Genome Browser configuration options and a Genome Browser wiki site. The downloadable GBD data, the companion Genome Browser toolset and links to documentation and related information can be found at: http://genome.ucsc.edu/.


Subject(s)
Databases, Nucleic Acid , Genomics , Animals , Computer Graphics , Genetic Variation , Humans , Internet , Invertebrates/genetics , Sequence Alignment , User-Computer Interface , Vertebrates/genetics
3.
Nucleic Acids Res ; 35(Database issue): D668-73, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17142222

ABSTRACT

The University of California, Santa Cruz Genome Browser Database contains, as of September 2006, sequence and annotation data for the genomes of 13 vertebrate and 19 invertebrate species. The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up to a full chromosome and includes assembly data, genes and gene predictions, mRNA and EST alignments, and comparative genomics, regulation, expression and variation data. The database is optimized for fast interactive performance with web tools that provide powerful visualization and querying capabilities for mining the data. In the past year, 22 new assemblies and several new sets of human variation annotation have been released. New features include VisiGene, a fully integrated in situ hybridization image browser; phyloGif, for drawing evolutionary tree diagrams; a redesigned Custom Track feature; an expanded SNP annotation track; and many new display options. The Genome Browser, other tools, downloadable data files and links to documentation and other information can be found at http://genome.ucsc.edu/.


Subject(s)
Databases, Genetic , Genomics , Animals , Base Sequence , Cattle , Computer Graphics , Conserved Sequence , Genome, Human , Humans , Internet , Linkage Disequilibrium , Mice , Open Reading Frames , Polymorphism, Single Nucleotide , Rats , Regulatory Sequences, Nucleic Acid , User-Computer Interface
4.
Nucleic Acids Res ; 34(Database issue): D590-8, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381938

ABSTRACT

The University of California Santa Cruz Genome Browser Database (GBD) contains sequence and annotation data for the genomes of about a dozen vertebrate species and several major model organisms. Genome annotations typically include assembly data, sequence composition, genes and gene predictions, mRNA and expressed sequence tag evidence, comparative genomics, regulation, expression and variation data. The database is optimized to support fast interactive performance with web tools that provide powerful visualization and querying capabilities for mining the data. The Genome Browser displays a wide variety of annotations at all scales from single nucleotide level up to a full chromosome. The Table Browser provides direct access to the database tables and sequence data, enabling complex queries on genome-wide datasets. The Proteome Browser graphically displays protein properties. The Gene Sorter allows filtering and comparison of genes by several metrics including expression data and several gene properties. BLAT and In Silico PCR search for sequences in entire genomes in seconds. These tools are highly integrated and provide many hyperlinks to other databases and websites. The GBD, browsing tools, downloadable data files and links to documentation and other information can be found at http://genome.ucsc.edu/.


Subject(s)
Databases, Genetic , Genomics , Amino Acid Sequence , Animals , California , Computer Graphics , Dogs , Gene Expression , Genes , Humans , Internet , Mice , Polymorphism, Single Nucleotide , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Proteomics , Rats , Sequence Alignment , Software , User-Computer Interface
5.
Nucleic Acids Res ; 31(1): 51-4, 2003 Jan 01.
Article in English | MEDLINE | ID: mdl-12519945

ABSTRACT

The University of California Santa Cruz (UCSC) Genome Browser Database is an up-to-date source for genome sequence data integrated with a large collection of related annotations. The database is optimized to support fast interactive performance with the web-based UCSC Genome Browser, a tool built on top of the database for rapid visualization and querying of the data at many levels. The annotations for a given genome are displayed in the browser as a series of tracks aligned with the genomic sequence. Sequence data and annotations may also be viewed in a text-based tabular format or downloaded as tab-delimited flat files. The Genome Browser Database, browsing tools and downloadable data files can all be found on the UCSC Genome Bioinformatics website (http://genome.ucsc.edu), which also contains links to documentation and related technical information.


Subject(s)
Databases, Genetic , Genome, Human , Genomics , Animals , California , Database Management Systems , Humans , Information Storage and Retrieval , Mice
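
The abstract above describes annotations as tracks aligned to genomic coordinates and available as tab-delimited flat files. As a rough illustration only, the sketch below filters such a file for features overlapping a viewing window. The four-column layout (chromosome, start, end, name) with 0-based half-open coordinates is an assumption modeled on BED-style files, and the function name `features_in_window` and the example rows are hypothetical, not taken from the article.

```python
import csv
import io


def features_in_window(tsv_text, chrom, win_start, win_end):
    """Return (name, start, end) for rows overlapping [win_start, win_end).

    Assumes tab-delimited rows of: chrom, start, end, name,
    with 0-based half-open coordinates (an assumed, BED-like layout).
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        r_chrom, start, end, name = row[0], int(row[1]), int(row[2]), row[3]
        # Two half-open intervals overlap when each starts before the other ends.
        if r_chrom == chrom and start < win_end and end > win_start:
            hits.append((name, start, end))
    return hits


# Hypothetical annotation rows, for illustration only.
EXAMPLE = (
    "chr1\t100\t200\tgeneA\n"
    "chr1\t150\t400\tgeneB\n"
    "chr2\t100\t200\tgeneC\n"
)

if __name__ == "__main__":
    print(features_in_window(EXAMPLE, "chr1", 180, 300))
```

A production browser would index features rather than scan every row; the linear scan here only shows the coordinate-overlap test.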
8.
Pac Symp Biocomput ; : 263-74, 2001.
Article in English | MEDLINE | ID: mdl-11262946

ABSTRACT

Computer-aided sequence analysis is a critical aspect of current biological research. Sequence information from the genome sequencing projects fills databases so quickly that humans cannot examine it all. Hence there is a heavy reliance on computer algorithms to point out the few important nuggets for human examination. Sequence search algorithms range from simple to complex, as does the representation of the biological data. Typically, though, simple algorithms are used on the simplest of data representations because of the large computational demands of anything more complex. This leads to missed hits because the simple search techniques are often not sufficiently sensitive. Here we describe the implementation of several sensitive sequence analysis algorithms on the Kestrel parallel processor, a single-instruction multiple-data (SIMD) processor developed and built at UCSC. Performance of the Smith-Waterman and Hidden Markov Model algorithms, with both Viterbi and Expectation Maximization methods, ranges from 6 to 20 times faster than standard computers.


Subject(s)
Algorithms , Computers , Sequence Analysis/statistics & numerical data , Databases, Factual , Markov Chains
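
For readers unfamiliar with the Smith-Waterman algorithm named in the abstract above, the following is a minimal serial sketch of its local-alignment scoring recurrence. It is not the Kestrel SIMD implementation described in the article; the linear gap penalty and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming matrix, since only the score is needed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment start anywhere (local alignment).
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Each cell depends only on its upper, left, and upper-left neighbors, so cells along an anti-diagonal are independent of one another; that independence is what makes the recurrence a natural fit for parallel hardware in general.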
9.
Proteins ; Suppl 5: 86-91, 2001.
Article in English | MEDLINE | ID: mdl-11835485

ABSTRACT

This article presents results of blind predictions submitted to the CASP4 protein structure prediction experiment. We made two sets of predictions: one using the fully automated SAM-T99 server and one using the improved SAM-T2K method with human intervention. Both methods use iterative hidden Markov model-based methods for constructing protein family profiles, using only sequence information. Although the SAM-T99 method is purely sequence based, the SAM-T2K method uses the predicted secondary structure of the target sequence and the known secondary structure of the templates to improve fold recognition and alignment. In this article, we try to determine what aspects of the SAM-T2K method were responsible for its significantly better performance in the CASP4 experiment in the hopes of producing a better automatic prediction server. The use of secondary structure prediction seems to be the most valuable single improvement, though the combined total of various human interventions is probably at least as important.


Subject(s)
DNA-Binding Proteins , Models, Molecular , Protein Conformation , Adenosine Triphosphatases/chemistry , Bacterial Proteins/chemistry , Computer Simulation , Endodeoxyribonucleases/chemistry , Escherichia coli Proteins/chemistry , Lyases/chemistry , MutS DNA Mismatch-Binding Protein , Neural Networks, Computer , Protein Structure, Tertiary , Repressor Proteins/chemistry , Research Design , Sequence Alignment , Sequence Analysis, Protein
10.
J Comput Biol ; 7(1-2): 95-114, 2000.
Article in English | MEDLINE | ID: mdl-10890390

ABSTRACT

A new method for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a variant of support vector machines using a new kernel function. The kernel function is derived from a generative statistical model for a protein family, in this case a hidden Markov model. This general approach of combining generative models like HMMs with discriminative methods such as support vector machines may have applications in other areas of biosequence analysis as well.


Subject(s)
Proteins/genetics , Sequence Alignment/statistics & numerical data , Sequence Analysis, Protein/statistics & numerical data , Biometry , Databases, Factual , GTP-Binding Proteins/genetics , Markov Chains , Models, Statistical
11.
Proteins ; Suppl 3: 121-5, 1999.
Article in English | MEDLINE | ID: mdl-10526360

ABSTRACT

This paper presents results of blind predictions submitted to the CASP3 protein structure prediction experiment. We made predictions using the SAM-T98 method, an iterative hidden Markov model-based method for constructing protein family profiles. The method is purely sequence-based, using no structural information, and yet was able to predict structures as well as all but five of the structure-based methods in CASP3.


Subject(s)
Proteins/chemistry , Algorithms , Amino Acid Sequence , Markov Chains , Molecular Sequence Data , Protein Structure, Secondary , Sequence Alignment
12.
Article in English | MEDLINE | ID: mdl-10786297

ABSTRACT

A new method, called the Fisher kernel method, for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a variant of support vector machines using a new kernel function. The kernel function is derived from a hidden Markov model. The general approach of combining generative models like HMMs with discriminative methods such as support vector machines may have applications in other areas of biosequence analysis as well.


Subject(s)
Proteins/chemistry , Sequence Analysis, Protein/methods , Databases, Factual , GTP-Binding Proteins/chemistry , Markov Chains , Models, Statistical , Protein Structure, Tertiary , Reproducibility of Results , Software
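
The Fisher kernel idea in the abstract above can be sketched in a few lines: each sequence is mapped to the gradient of its log-likelihood under a generative model, and the kernel compares those gradient vectors. To stay short, the sketch below swaps the article's hidden Markov model for a much simpler independent-residue model with softmax parameters, and it uses a plain dot product (identity matrix in place of the Fisher information). Both simplifications, and the names `fisher_score` and `fisher_kernel`, are assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids, one letter each.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def fisher_score(seq, probs):
    """Gradient of log P(seq) with respect to the softmax logits.

    For an independent-residue model with p = softmax(theta),
    d log P / d theta_k = count_k - len(seq) * p_k.
    """
    counts = Counter(seq)
    n = len(seq)
    return [counts.get(aa, 0) - n * probs[aa] for aa in ALPHABET]


def fisher_kernel(x, y, probs):
    """Dot product of Fisher scores (identity approximation to the metric)."""
    ux, uy = fisher_score(x, probs), fisher_score(y, probs)
    return sum(a * b for a, b in zip(ux, uy))


# Uniform residue probabilities as a stand-in for a trained model.
UNIFORM = {aa: 1.0 / len(ALPHABET) for aa in ALPHABET}

if __name__ == "__main__":
    print(fisher_kernel("AAAA", "AAAA", UNIFORM))
    print(fisher_kernel("AAAA", "CCCC", UNIFORM))
```

The resulting kernel values could be passed to any support vector machine implementation that accepts a precomputed kernel matrix; with a trained HMM, the gradient would instead be taken with respect to the HMM's emission and transition parameters.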