Results 1 - 13 of 13
1.
Nucleic Acids Res ; 50(D1): D988-D995, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34791404

ABSTRACT

Ensembl (https://www.ensembl.org) is unique in its flexible infrastructure for access to genomic data and annotation. It has been designed to efficiently deliver annotation at scale for all eukaryotic life, and it also provides deep comprehensive annotation for key species. Genomes representing a greater diversity of species are increasingly being sequenced. In response, we have focussed our recent efforts on expediting the annotation of new assemblies. Here, we report the release of the greatest annual number of newly annotated genomes in the history of Ensembl via our dedicated Ensembl Rapid Release platform (http://rapid.ensembl.org). We have also developed a new method to generate comparative analyses at scale for these assemblies and, for the first time, we have annotated non-vertebrate eukaryotes. Meanwhile, we continually improve, extend and update the annotation for our high-value reference vertebrate genomes and report the details here. We have a range of specific software tools for specific tasks, such as the Ensembl Variant Effect Predictor (VEP) and the newly developed interface for the Variant Recoder. All Ensembl data, software and tools are freely available for download and are accessible programmatically.


Subject(s)
Databases, Genetic , Genome/genetics , Molecular Sequence Annotation , Software , Animals , Computational Biology/classification , Humans
2.
Nucleic Acids Res ; 49(D1): D884-D891, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33137190

ABSTRACT

The Ensembl project (https://www.ensembl.org) annotates genomes and disseminates genomic data for vertebrate species. We create detailed and comprehensive annotation of gene structures, regulatory elements and variants, and enable comparative genomics by inferring the evolutionary history of genes and genomes. Our integrated genomic data are made available in a variety of ways, including genome browsers, search interfaces, specialist tools such as the Ensembl Variant Effect Predictor, download files and programmatic interfaces. Here, we present recent Ensembl developments including two new website portals. Ensembl Rapid Release (http://rapid.ensembl.org) is designed to provide core tools and services for genomes as soon as possible and has been deployed to support large biodiversity sequencing projects. Our SARS-CoV-2 genome browser (https://covid-19.ensembl.org) integrates our own annotation with publicly available genomic data from numerous sources to facilitate the use of genomics in the international scientific response to the COVID-19 pandemic. We also report on other updates to our annotation resources, tools and services. All Ensembl data and software are freely available without restriction.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid , Genomics/methods , SARS-CoV-2/genetics , Vertebrates/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Humans , Internet , Molecular Sequence Annotation/methods , Pandemics , Vertebrates/classification
3.
Nucleic Acids Res ; 48(D1): D682-D688, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691826

ABSTRACT

Ensembl (https://www.ensembl.org) is a system for generating and distributing genome annotation such as genes, variation, regulation and comparative genomics across the vertebrate subphylum and key model organisms. The Ensembl annotation pipeline is capable of integrating experimental and reference data from multiple providers into a single integrated resource. Here, we present 94 newly annotated and re-annotated genomes, bringing the total number of genomes offered by Ensembl to 227. This represents the single largest expansion of the resource since its inception. We also detail our continued efforts to improve human annotation, developments in our epigenome analysis and display, a new tool for imputing causal genes from genome-wide association studies and visualisation of variation within a 3D protein model. Finally, we present information on our new website. Both software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license), with data updates made available four times a year.


Subject(s)
Computational Biology/methods , Databases, Genetic , Epigenome , Molecular Sequence Annotation , Algorithms , Animals , Computer Graphics , Databases, Protein , Genetic Variation , Genome-Wide Association Study , Genomics , Histones/metabolism , Humans , Imaging, Three-Dimensional , Internet , Ligands , Search Engine , Software , Species Specificity , Transcriptome , User-Computer Interface , Web Browser
4.
Nucleic Acids Res ; 47(D1): D745-D751, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407521

ABSTRACT

The Ensembl project (https://www.ensembl.org) makes key genomic data sets available to the entire scientific community without restrictions. Ensembl seeks to be a fundamental resource driving scientific progress by creating, maintaining and updating reference genome annotation and comparative genomics resources. This year we describe our new and expanded gene, variant and comparative annotation capabilities, which led to a 50% increase in the number of vertebrate genomes we support. We have also doubled the number of available human variants and added regulatory regions for many mouse cell types and developmental stages. Our data sets and tools are available via the Ensembl website as well as through a RESTful web service, a Perl application programming interface and as data files for download.


Subject(s)
Databases, Genetic , Genome/genetics , Genomics , Vertebrates/genetics , Animals , Computational Biology/trends , Humans , Mice , Molecular Sequence Annotation , Software
5.
Nucleic Acids Res ; 46(D1): D754-D761, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29155950

ABSTRACT

The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.org). Over time, we have expanded in multiple directions. First, our resources describe multiple fields of genomics, in particular gene annotation, comparative genomics, genetics and epigenomics. Second, we cover a growing number of genome assemblies; Ensembl Release 90 contains exactly 100. Third, our databases feed simultaneously into an array of services designed around different use cases, ranging from quick browsing to genome-wide bioinformatic analysis. We present here the latest developments of the Ensembl project, with a focus on managing an increasing number of assemblies, supporting efforts in genome interpretation and improving our browser.


Subject(s)
Databases, Genetic , Datasets as Topic , Genome , Information Dissemination , Animals , Epigenomics , Genome, Human , Genome-Wide Association Study , Genomics , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation , Vertebrates/genetics , Web Browser
6.
Nucleic Acids Res ; 45(D1): D635-D642, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899575

ABSTRACT

Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensure uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license.


Subject(s)
Computational Biology/methods , Databases, Genetic , Genomics/methods , Search Engine , Software , Web Browser , Animals , Data Mining , Evolution, Molecular , Gene Expression Regulation , Genetic Variation , Genome, Human , Humans , Molecular Sequence Annotation , Species Specificity , Vertebrates
7.
Nucleic Acids Res ; 44(D1): D710-6, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26687719

ABSTRACT

The Ensembl project (http://www.ensembl.org) is a system for genome annotation, analysis, storage and dissemination designed to facilitate the access of genomic annotation from chordates and key model organisms. It provides access to data from 87 species across our main and early access Pre! websites. This year we introduced three newly annotated species and released numerous updates across our supported species with a concentration on data for the latest genome assemblies of human, mouse, zebrafish and rat. We also provided two data updates for the previous human assembly, GRCh37, through a dedicated website (http://grch37.ensembl.org). Our tools, in particular the VEP, have been improved significantly through integration of additional third party data. REST is now capable of larger-scale analysis and our regulatory data BioMart can deliver faster results. The website is now capable of displaying long-range interactions such as those found in cis-regulated datasets. Finally we have launched a website optimized for mobile devices providing views of genes, variants and phenotypes. Our data is made available without restriction and all code is available from our GitHub organization site (http://github.com/Ensembl) under an Apache 2.0 license.


Subject(s)
Databases, Genetic , Genomics , Molecular Sequence Annotation , Animals , Genes , Genetic Variation , Humans , Internet , Mice , Proteins/genetics , Rats , Regulatory Sequences, Nucleic Acid , Software
8.
Genome Biol ; 16: 56, 2015 Mar 24.
Article in English | MEDLINE | ID: mdl-25887522

ABSTRACT

Most genomic variants associated with phenotypic traits or disease do not fall within gene coding regions, but in regulatory regions, rendering their interpretation difficult. We collected public data on epigenetic marks and transcription factor binding in human cell types and used it to construct an intuitive summary of regulatory regions in the human genome. We verified it against independent assays for sensitivity. The Ensembl Regulatory Build will be progressively enriched when more data is made available. It is freely available on the Ensembl browser, from the Ensembl Regulation MySQL database server and in a dedicated track hub.


Subject(s)
Databases, Genetic , Genomics , Software , Transcription Factors/genetics , Computational Biology , Epigenesis, Genetic/genetics , Humans , Internet , User-Computer Interface
9.
Nucleic Acids Res ; 43(Database issue): D662-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25352552

ABSTRACT

Ensembl (http://www.ensembl.org) is a genomic interpretation system providing the most up-to-date annotations, querying tools and access methods for chordates and key model organisms. This year we released updated annotation (gene models, comparative genomics, regulatory regions and variation) on the new human assembly, GRCh38, although we continue to support researchers using the GRCh37.p13 assembly through a dedicated site (http://grch37.ensembl.org). Our Regulatory Build has been revamped to identify regulatory regions of interest and to efficiently highlight their activity across disparate epigenetic data sets. A number of new interfaces allow users to perform large-scale comparisons of their data against our annotations. The REST server (http://rest.ensembl.org), which allows programs written in any language to query our databases, has moved to a full service alongside our upgraded website tools. Our online Variant Effect Predictor tool has been updated to process more variants and calculate summary statistics. Lastly, the WiggleTools package enables users to summarize large collections of data sets and view them as single tracks in Ensembl. The Ensembl code base itself is more accessible: it is now hosted on our GitHub organization page (https://github.com/Ensembl) under an Apache 2.0 open source license.
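As an illustration of the language-independent access that the abstract attributes to the REST server, the following is a minimal Python sketch, not taken from the paper. It assumes the server's public `lookup/symbol/:species/:symbol` endpoint and the current `https://rest.ensembl.org` address; the function names `lookup_symbol_url` and `lookup_symbol` are hypothetical and chosen only for this example.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed base address of the public Ensembl REST server.
REST_SERVER = "https://rest.ensembl.org"


def lookup_symbol_url(species, symbol, server=REST_SERVER):
    """Build the URL for a gene-symbol lookup (illustrative helper)."""
    return f"{server}/lookup/symbol/{quote(species)}/{quote(symbol)}"


def lookup_symbol(species, symbol, timeout=30):
    """Fetch the JSON record for a gene symbol; requires network access."""
    request = urllib.request.Request(
        lookup_symbol_url(species, symbol),
        # Ask the server for a JSON response.
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `lookup_symbol("homo_sapiens", "BRCA2")` would issue a single HTTP GET and return the decoded JSON as a dictionary. The same request could be made from any language with an HTTP client, which is the point the abstract makes.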


Subject(s)
Databases, Nucleic Acid , Genomics , Animals , Epigenesis, Genetic , Genetic Variation , Genome, Human , Humans , Internet , Mice , Molecular Sequence Annotation , Regulatory Sequences, Nucleic Acid , Software
10.
Nucleic Acids Res ; 42(Database issue): D749-55, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24316576

ABSTRACT

Ensembl (http://www.ensembl.org) creates tools and data resources to facilitate genomic analysis in chordate species with an emphasis on human, major vertebrate model organisms and farm animals. Over the past year we have increased the number of species that we support to 77 and expanded our genome browser with a new scrollable overview and improved variation and phenotype views. We also report updates to our core datasets and improvements to our gene homology relationships from the addition of new species. Our REST service has been extended with additional support for comparative genomics and ontology information. Finally, we provide updated information about our methods for data access and resources for user training.


Subject(s)
Databases, Genetic , Genomics , Animals , Chordata/genetics , Genetic Variation , Humans , Internet , Mice , Molecular Sequence Annotation , Phenotype , Rats
11.
Bioinformatics ; 30(7): 1008-9, 2014 Apr 01.
Article in English | MEDLINE | ID: mdl-24363377

ABSTRACT

MOTIVATION: Using high-throughput sequencing, researchers are now generating hundreds of whole-genome assays to measure various features such as transcription factor binding, histone marks, DNA methylation or RNA transcription. Displaying so much data generally leads to a confusing accumulation of plots. We describe here a multithreaded library that computes statistics on large numbers of datasets (Wiggle, BigWig, Bed, BigBed and BAM), generating statistical summaries within minutes with limited memory requirements, whether on the whole genome or on selected regions. AVAILABILITY AND IMPLEMENTATION: The code is freely available under Apache 2.0 license at www.github.com/Ensembl/Wiggletools


Subject(s)
Genome , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Genomic Library , Internet , Software
12.
Nucleic Acids Res ; 41(Database issue): D48-55, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203987

ABSTRACT

The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidence-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools and programmatic interfaces.


Subject(s)
Databases, Genetic , Genomics , Animals , Gene Expression Regulation , Genetic Variation , Humans , Internet , Mice , Molecular Sequence Annotation , Rats , Software , Zebrafish/genetics
13.
Nucleic Acids Res ; 39(Database issue): D705-11, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21081561

ABSTRACT

Binary subcomplexes in proteins database (BISC) is a new protein-protein interaction (PPI) database linking up the two communities most active in their characterization: structural biology and functional genomics researchers. The BISC resource offers users (i) a structural perspective and related information about binary subcomplexes (i.e. physical direct interactions between proteins) that are either structurally characterized or modellable entries in the main functional genomics PPI databases BioGRID, IntAct and HPRD; (ii) selected web services to further investigate the validity of postulated PPI by inspection of their hypothetical modelled interfaces. Among other uses we envision that this resource can help identify possible false positive PPI in current database records. BISC is freely available at http://bisc.cse.ucsc.edu.


Subject(s)
Databases, Protein , Multiprotein Complexes/chemistry , Databases, Genetic , Genomics , Models, Molecular , Multiprotein Complexes/genetics , Protein Interaction Mapping