Search | VHL Regional Portal

Results 1 - 10 of 10

Expression Atlas update: insights from sequencing data at both bulk and single cell level.

George, Nancy; Fexova, Silvie; Fuentes, Alfonso Munoz; Madrigal, Pedro; Bi, Yalan; Iqbal, Haider; Kumbham, Upendra; Nolte, Nadja Francesca; Zhao, Lingyun; Thanki, Anil S; Yu, Iris D; Marugan Calles, Jose C; Erdos, Karoly; Vilmovsky, Liora; Kurri, Sandeep R; Vathrakokoili-Pournara, Anna; Osumi-Sutherland, David; Prakash, Ananth; Wang, Shengbo; Tello-Ruiz, Marcela K; Kumari, Sunita; Ware, Doreen; Goutte-Gattat, Damien; Hu, Yanhui; Brown, Nick; Perrimon, Norbert; Vizcaíno, Juan Antonio; Burdett, Tony; Teichmann, Sarah; Brazma, Alvis; Papatheodorou, Irene.

Nucleic Acids Res ; 52(D1): D107-D114, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37992296

ABSTRACT

Expression Atlas (www.ebi.ac.uk/gxa) and its newest counterpart the Single Cell Expression Atlas (www.ebi.ac.uk/gxa/sc) are EMBL-EBI's knowledgebases for gene and protein expression and localisation in bulk and at single cell level. These resources aim to allow users to investigate their expression in normal tissue (baseline) or in response to perturbations such as disease or changes to genotype (differential) across multiple species. Users are invited to search for genes or metadata terms across species or biological conditions in a standardised consistent interface. Alongside these data, new features in Single Cell Expression Atlas allow users to query metadata through our new cell type wheel search. At the experiment level data can be explored through two types of dimensionality reduction plots, t-distributed Stochastic Neighbor Embedding (tSNE) and Uniform Manifold Approximation and Projection (UMAP), overlaid with either clustering or metadata information to assist users' understanding. Data are also visualised as marker gene heatmaps identifying genes that help confer cluster identity. For some data, additional visualisations are available as interactive cell level anatomograms and cell type gene expression heatmaps.

Subject(s)

Databases, Genetic , Gene Expression Profiling , Proteomics , Genotype , Metadata , Single-Cell Analysis , Internet , Humans , Animals

Convergent Loss of an EDS1/PAD4 Signaling Pathway in Several Plant Lineages Reveals Coevolved Components of Plant Immunity and Drought Response.

Baggs, Erin L; Monroe, J Grey; Thanki, Anil S; O'Grady, Ruby; Schudoma, Christian; Haerty, Wilfried; Krasileva, Ksenia V.

Plant Cell ; 32(7): 2158-2177, 2020 07.

Article in English | MEDLINE | ID: mdl-32409319

ABSTRACT

Plant innate immunity relies on nucleotide binding leucine-rich repeat receptors (NLRs) that recognize pathogen-derived molecules and activate downstream signaling pathways. We analyzed the variation in NLR gene copy number and identified plants with a low number of NLR genes relative to sister species. We specifically focused on four plants from two distinct lineages, one monocot lineage (Alismatales) and one eudicot lineage (Lentibulariaceae). In these lineages, the loss of NLR genes coincides with loss of the well-known downstream immune signaling complex ENHANCED DISEASE SUSCEPTIBILITY 1 (EDS1)/PHYTOALEXIN DEFICIENT 4 (PAD4). We expanded our analysis across whole proteomes and found that other characterized immune genes were absent only in Lentibulariaceae and Alismatales. Additionally, we identified genes of unknown function that were convergently lost together with EDS1/PAD4 in five plant species. Gene expression analyses in Arabidopsis (Arabidopsis thaliana) and Oryza sativa revealed that several homologs of the candidates are differentially expressed during pathogen infection, drought, and abscisic acid treatment. Our analysis provides evolutionary evidence for the rewiring of plant immunity in some plant lineages, as well as the coevolution of the EDS1/PAD4 pathway and drought responses.

Subject(s)

Alismatales/genetics , NLR Proteins/genetics , Plant Immunity/genetics , Plant Proteins/genetics , Alismatales/immunology , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Carboxylic Ester Hydrolases/genetics , DNA-Binding Proteins/genetics , Disease Resistance/genetics , Disease Resistance/immunology , Droughts , Evolution, Molecular , Gene Dosage , Gene Expression Regulation, Plant , Magnoliopsida/genetics , Magnoliopsida/immunology , Oryza/genetics , Phylogeny , Signal Transduction , Synteny

Aequatus: an open-source homology browser.

Thanki, Anil S; Soranzo, Nicola; Herrero, Javier; Haerty, Wilfried; Davey, Robert P.

Gigascience ; 7(11), 2018 11 01.

Article in English | MEDLINE | ID: mdl-30395211

ABSTRACT

Background: Phylogenetic information inferred from the study of homologous genes helps us to understand the evolution of genes and gene families, including the identification of ancestral gene duplication events as well as regions under positive or purifying selection within lineages. Gene family and orthogroup characterization enables the identification of syntenic blocks, which can then be visualized with various tools. Unfortunately, currently available tools display only an overview of syntenic regions as a whole, limited to the gene level, and none provide further details about structural changes within genes, such as the conservation of ancestral exon boundaries amongst multiple genomes. Findings: We present Aequatus, an open-source web-based tool that provides an in-depth view of gene structure across gene families, with various options to render and filter visualizations. It relies on precalculated alignment and gene feature information typically held in, but not limited to, the Ensembl Compara and Core databases. We also offer Aequatus.js, a reusable JavaScript module that fulfills the visualization aspects of Aequatus, available within the Galaxy web platform as a visualization plug-in, which can be used to visualize gene trees generated by the GeneSeqToFamily workflow.

Subject(s)

Computational Biology/methods , Genome/genetics , Genomics/methods , Software , Information Storage and Retrieval/methods , Internet , Phylogeny , Proteins/classification , Proteins/genetics , Reproducibility of Results , Sequence Alignment/methods

ViCTree: an automated framework for taxonomic classification from protein sequences.

Modha, Sejal; Thanki, Anil S; Cotmore, Susan F; Davison, Andrew J; Hughes, Joseph.

Bioinformatics ; 34(13): 2195-2200, 2018 07 01.

Article in English | MEDLINE | ID: mdl-29474519

ABSTRACT

Motivation: The increasing rate of submission of genetic sequences into public databases is providing a growing resource for classifying the organisms that these sequences represent. To aid viral classification, we have developed ViCTree, which automatically integrates the relevant sets of sequences in NCBI GenBank and transforms them into an interactive maximum likelihood phylogenetic tree that can be updated automatically. ViCTree incorporates ViCTreeView, which is a JavaScript-based visualization tool that enables the tree to be explored interactively in the context of pairwise distance data. Results: To demonstrate utility, ViCTree was applied to subfamily Densovirinae of family Parvoviridae. This led to the identification of six new species of insect virus. Availability and implementation: ViCTree is open-source and can be run on any Linux- or Unix-based computer or cluster. A tutorial, the documentation and the source code are available under a GPL3 license, and can be accessed at http://bioinformatics.cvr.ac.uk/victree_web/. Supplementary information: Supplementary data are available at Bioinformatics online.

Subject(s)

Phylogeny , Sequence Analysis, Protein/methods , Software , Viral Proteins/metabolism , Viruses/genetics , Amino Acid Sequence , Databases, Factual , Viral Proteins/chemistry , Viral Proteins/genetics , Viruses/metabolism

GeneSeqToFamily: a Galaxy workflow to find gene families based on the Ensembl Compara GeneTrees pipeline.

Thanki, Anil S; Soranzo, Nicola; Haerty, Wilfried; Davey, Robert P.

Gigascience ; 7(3): 1-10, 2018 03 01.

Article in English | MEDLINE | ID: mdl-29425291

ABSTRACT

Background: Gene duplication is a major factor contributing to evolutionary novelty, and the contraction or expansion of gene families has often been associated with morphological, physiological, and environmental adaptations. The study of homologous genes helps us to understand the evolution of gene families. It plays a vital role in finding ancestral gene duplication events as well as identifying genes that have diverged from a common ancestor under positive selection. There are various tools available, such as MSOAR, OrthoMCL, and HomoloGene, to identify gene families and visualize syntenic information between species, providing an overview of syntenic regions evolution at the family level. Unfortunately, none of them provide information about structural changes within genes, such as the conservation of ancestral exon boundaries among multiple genomes. The Ensembl GeneTrees computational pipeline generates gene trees based on coding sequences, provides details about exon conservation, and is used in the Ensembl Compara project to discover gene families. Findings: A certain amount of expertise is required to configure and run the Ensembl Compara GeneTrees pipeline via command line. Therefore, we converted this pipeline into a Galaxy workflow, called GeneSeqToFamily, and provided additional functionality. This workflow uses existing tools from the Galaxy ToolShed, as well as providing additional wrappers and tools that are required to run the workflow. Conclusions: GeneSeqToFamily represents the Ensembl GeneTrees pipeline as a set of interconnected Galaxy tools, so they can be run interactively within the Galaxy's user-friendly workflow environment while still providing the flexibility to tailor the analysis by changing configurations and tools if necessary. Additional tools allow users to subsequently visualize the gene families produced by the workflow, using the Aequatus.js interactive tool, which has been developed as part of the Aequatus software project.

Subject(s)

Computational Biology , Genome/genetics , Phylogeny , Software , Algorithms , User-Computer Interface , Workflow

transPLANT Resources for Triticeae Genomic Data.

Spannagl, Manuel; Alaux, Michael; Lange, Matthias; Bolser, Daniel M; Bader, Kai C; Letellier, Thomas; Kimmel, Erik; Flores, Raphael; Pommier, Cyril; Kerhornou, Arnaud; Walts, Brandon; Nussbaumer, Thomas; Grabmuller, Christoph; Chen, Jinbo; Colmsee, Christian; Beier, Sebastian; Mascher, Martin; Schmutzer, Thomas; Arend, Daniel; Thanki, Anil; Ramirez-Gonzalez, Ricardo; Ayling, Martin; Ayling, Sarah; Caccamo, Mario; Mayer, Klaus F X; Scholz, Uwe; Steinbach, Delphine; Quesneville, Hadi; Kersey, Paul J.

Plant Genome ; 9(1), 2016 03.

Article in English | MEDLINE | ID: mdl-27898761

ABSTRACT

The genome sequences of many important Triticeae species, including bread wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), remained uncharacterized for a long time because of their high repeat content, large sizes, and polyploidy. As a result of improvements in sequencing technologies and novel analysis strategies, several of these have recently been deciphered. These efforts have generated new insights into Triticeae biology and genome organization and have important implications for downstream usage by breeders, experimental biologists, and comparative genomicists. transPLANT is an EU-funded project aimed at constructing hardware, software, and data infrastructure for genome-scale research in the life sciences. Since the Triticeae data are intrinsically complex, heterogenous, and distributed, the transPLANT consortium has undertaken efforts to develop common data formats and tools that enable the exchange and integration of data from distributed resources. Here we present an overview of the individual Triticeae genome resources hosted by transPLANT partners, introduce the objectives of transPLANT, and outline common developments and interfaces supporting integrated data access.

Subject(s)

Genome, Plant , Genomics/methods , Poaceae/genetics , Evolution, Molecular , Hordeum/genetics , Polyploidy , Triticum/genetics

Anatomy of BioJS, an open source community for the life sciences.

Yachdav, Guy; Goldberg, Tatyana; Wilzbach, Sebastian; Dao, David; Shih, Iris; Choudhary, Saket; Crouch, Steve; Franz, Max; García, Alexander; García, Leyla J; Grüning, Björn A; Inupakutika, Devasena; Sillitoe, Ian; Thanki, Anil S; Vieira, Bruno; Villaveces, José M; Schneider, Maria V; Lewis, Suzanna; Pettifer, Steve; Rost, Burkhard; Corpas, Manuel.

Elife ; 4, 2015 Jul 08.

Article in English | MEDLINE | ID: mdl-26153621

ABSTRACT

BioJS is an open source software project that develops visualization tools for different types of biological data. Here we report on the factors that influenced the growth of the BioJS user and developer community, and outline our strategy for building on this growth. The lessons we have learned on BioJS may also be relevant to other open source software projects.

Subject(s)

Biological Science Disciplines/methods , Computational Biology/methods , Software

BioJS: an open source standard for biological visualisation - its status in 2014.

Corpas, Manuel; Jimenez, Rafael; Carbon, Seth J; García, Alex; Garcia, Leyla; Goldberg, Tatyana; Gomez, John; Kalderimis, Alexis; Lewis, Suzanna E; Mulvany, Ian; Pawlik, Aleksandra; Rowland, Francis; Salazar, Gustavo; Schreiber, Fabian; Sillitoe, Ian; Spooner, William H; Thanki, Anil S; Villaveces, José M; Yachdav, Guy; Hermjakob, Henning.

F1000Res ; 3: 55, 2014.

Article in English | MEDLINE | ID: mdl-25075290

ABSTRACT

BioJS is a community-based standard and repository of functional components to represent biological information on the web. The development of BioJS has been prompted by the growing need for bioinformatics visualisation tools to be easily shared, reused and discovered. Its modular architecture makes it easy for users to find a specific functionality without needing to know how it has been built, while components can be extended or created for implementing new functionality. The BioJS community of developers currently provides a range of functionality that is open access and freely available. A registry has been set up that categorises and provides installation instructions and testing facilities at http://www.ebi.ac.uk/tools/biojs/. The source code for all components is available for ready use at https://github.com/biojs/biojs.

wigExplorer, a BioJS component to visualise wig data.

Thanki, Anil S; Jimenez, Rafael C; Kaithakottil, Gemy G; Corpas, Manuel; Davey, Robert P.

F1000Res ; 3: 53, 2014.

Article in English | MEDLINE | ID: mdl-27781080

ABSTRACT

wigExplorer is a BioJS component whose main purpose is to provide a platform for visualisation of wig-formatted data. Wig files are extensively used by genome browsers such as the UCSC Genome Browser. wigExplorer follows the BioJS standard specification, requiring a simple configuration and installation. wigExplorer provides an easy way to navigate the visible region of the canvas and allows interaction with other components via predefined events. Availability: http://github.com/biojs/biojs; http://dx.doi.org/10.5281/zenodo.7721.

StatsDB: platform-agnostic storage and understanding of next generation sequencing run metrics.

Ramirez-Gonzalez, Ricardo H; Leggett, Richard M; Waite, Darren; Thanki, Anil; Drou, Nizar; Caccamo, Mario; Davey, Robert.

F1000Res ; 2: 248, 2013.

Article in English | MEDLINE | ID: mdl-24627795

ABSTRACT

Modern sequencing platforms generate enormous quantities of data in ever-decreasing amounts of time. Additionally, techniques such as multiplex sequencing allow one run to contain hundreds of different samples. With such data comes a significant challenge to understand its quality and to understand how the quality and yield are changing across instruments and over time. As well as the desire to understand historical data, sequencing centres often have a duty to provide clear summaries of individual run performance to collaborators or customers. We present StatsDB, an open-source software package for storage and analysis of next generation sequencing run metrics. The system has been designed for incorporation into a primary analysis pipeline, either at the programmatic level or via integration into existing user interfaces. Statistics are stored in an SQL database and APIs provide the ability to store and access the data while abstracting the underlying database design. This abstraction allows simpler, wider querying across multiple fields than is possible by the manual steps and calculation required to dissect individual reports, e.g. "provide metrics about nucleotide bias in libraries using adaptor barcode X, across all runs on sequencer A, within the last month". The software is supplied with modules for storage of statistics from FastQC, a commonly used tool for analysis of sequence reads, but the open nature of the database schema means it can be easily adapted to other tools. Currently at The Genome Analysis Centre (TGAC), reports are accessed through our LIMS system or through a standalone GUI tool, but the API and supplied examples make it easy to develop custom reports and to interface with other packages.
