A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni.

Protasio, Anna V; Tsai, Isheng J; Babbage, Anne; Nichol, Sarah; Hunt, Martin; Aslett, Martin A; De Silva, Nishadi; Velarde, Giles S; Anderson, Tim J C; Clark, Richard C; Davidson, Claire; Dillon, Gary P; Holroyd, Nancy E; LoVerde, Philip T; Lloyd, Christine; McQuillan, Jacquelline; Oliveira, Guilherme; Otto, Thomas D; Parker-Manuel, Sophia J; Quail, Michael A; Wilson, R Alan; Zerlotini, Adhemar; Dunne, David W; Berriman, Matthew.

PLoS Negl Trop Dis ; 6(1): e1455, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22253936

ABSTRACT

Schistosomiasis is one of the most prevalent parasitic diseases, affecting millions of people in developing countries. Amongst the human-infective species, Schistosoma mansoni is also the most commonly used in the laboratory and here we present the systematic improvement of its draft genome. We used Sanger capillary and deep-coverage Illumina sequencing from clonal worms to upgrade the highly fragmented draft 380 Mb genome to one with only 885 scaffolds and more than 81% of the bases organised into chromosomes. We have also used transcriptome sequencing (RNA-seq) from four time points in the parasite's life cycle to refine gene predictions and profile their expression. More than 45% of predicted genes have been extensively modified and the total number has been reduced from 11,807 to 10,852. Using the new version of the genome, we identified trans-splicing events occurring in at least 11% of genes and identified clear cases where it is used to resolve polycistronic transcripts. We have produced a high-resolution map of temporal changes in expression for 9,535 genes, covering an unprecedented dynamic range for this organism. All of these data have been consolidated into a searchable format within the GeneDB (www.genedb.org) and SchistoDB (www.schistodb.net) databases. With further transcriptional profiling and genome sequencing increasingly accessible, the upgraded genome will form a fundamental dataset to underpin further advances in schistosome research.

Subject(s)

Genome, Helminth , Schistosoma mansoni/genetics , Transcriptome , Animals , DNA, Helminth/chemistry , DNA, Helminth/genetics , Molecular Sequence Data , RNA, Helminth/genetics , Sequence Analysis, DNA

GeneDB--an annotation database for pathogens.

Logan-Klumpler, Flora J; De Silva, Nishadi; Boehme, Ulrike; Rogers, Matthew B; Velarde, Giles; McQuillan, Jacqueline A; Carver, Tim; Aslett, Martin; Olsen, Christian; Subramanian, Sandhya; Phan, Isabelle; Farris, Carol; Mitra, Siddhartha; Ramasamy, Gowthaman; Wang, Haiming; Tivey, Adrian; Jackson, Andrew; Houston, Robin; Parkhill, Julian; Holden, Matthew; Harb, Omar S; Brunk, Brian P; Myler, Peter J; Roos, David; Carrington, Mark; Smith, Deborah F; Hertz-Fowler, Christiane; Berriman, Matthew.

Nucleic Acids Res ; 40(Database issue): D98-108, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22116062

ABSTRACT

GeneDB (http://www.genedb.org) is a genome database for prokaryotic and eukaryotic pathogens and closely related organisms. The resource provides a portal to genome sequence and annotation data, which is primarily generated by the Pathogen Genomics group at the Wellcome Trust Sanger Institute. It combines data from completed and ongoing genome projects with curated annotation, which is readily accessible from a web based resource. The development of the database in recent years has focused on providing database-driven annotation tools and pipelines, as well as catering for increasingly frequent assembly updates. The website has been significantly redesigned to take advantage of current web technologies, and improve usability. The current release stores 41 data sets, of which 17 are manually curated and maintained by biologists, who review and incorporate data from the scientific literature, as well as other sources. GeneDB is primarily a production and annotation database for the genomes of predominantly pathogenic organisms.

Subject(s)

Databases, Genetic , Genomics , Molecular Sequence Annotation , Animals , Arthropods/genetics , Genome, Bacterial , Genome, Helminth , Genome, Protozoan , Internet , Vocabulary, Controlled

TriTrypDB: a functional genomic resource for the Trypanosomatidae.

Aslett, Martin; Aurrecoechea, Cristina; Berriman, Matthew; Brestelli, John; Brunk, Brian P; Carrington, Mark; Depledge, Daniel P; Fischer, Steve; Gajria, Bindu; Gao, Xin; Gardner, Malcolm J; Gingle, Alan; Grant, Greg; Harb, Omar S; Heiges, Mark; Hertz-Fowler, Christiane; Houston, Robin; Innamorato, Frank; Iodice, John; Kissinger, Jessica C; Kraemer, Eileen; Li, Wei; Logan, Flora J; Miller, John A; Mitra, Siddhartha; Myler, Peter J; Nayak, Vishal; Pennington, Cary; Phan, Isabelle; Pinney, Deborah F; Ramasamy, Gowthaman; Rogers, Matthew B; Roos, David S; Ross, Chris; Sivam, Dhileep; Smith, Deborah F; Srinivasamoorthy, Ganesh; Stoeckert, Christian J; Subramanian, Sandhya; Thibodeau, Ryan; Tivey, Adrian; Treatman, Charles; Velarde, Giles; Wang, Haiming.

Nucleic Acids Res ; 38(Database issue): D457-62, 2010 Jan.

Article in English | MEDLINE | ID: mdl-19843604

ABSTRACT

TriTrypDB (http://tritrypdb.org) is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. 'User Comments' may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate.

Subject(s)

Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Leishmania/genetics , Trypanosoma/genetics , Animals , Computational Biology/trends , Databases, Protein , Genome, Protozoan , Information Storage and Retrieval/methods , Internet , Protein Structure, Tertiary , Protozoan Proteins/genetics , Software , User-Computer Interface

Performing statistical analyses on quantitative data in Taverna workflows: an example using R and maxdBrowse to identify differentially-expressed genes from microarray data.

Li, Peter; Castrillo, Juan I; Velarde, Giles; Wassink, Ingo; Soiland-Reyes, Stian; Owen, Stuart; Withers, David; Oinn, Tom; Pocock, Matthew R; Goble, Carole A; Oliver, Stephen G; Kell, Douglas B.

BMC Bioinformatics ; 9: 334, 2008 Aug 07.

Article in English | MEDLINE | ID: mdl-18687127

ABSTRACT

BACKGROUND: There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the use of computational tools employed for the analysis of such data. For example, computational tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools. RESULTS: Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially-expressed genes from microarray data followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially-expressed genes using the Taverna RShell processor which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV formatted data within the Taverna workbench. 
CONCLUSION: Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining the use of scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists wanting to use tools such as R without having to learn the corresponding programming language to analyse their own data.

Subject(s)

Data Interpretation, Statistical , Gene Expression Profiling/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Software , Databases, Genetic , Information Storage and Retrieval , Programming Languages

MeMo: a hybrid SQL/XML approach to metabolomic data management for functional genomics.

Spasic, Irena; Dunn, Warwick B; Velarde, Giles; Tseng, Andy; Jenkins, Helen; Hardy, Nigel; Oliver, Stephen G; Kell, Douglas B.

BMC Bioinformatics ; 7: 281, 2006 Jun 05.

Article in English | MEDLINE | ID: mdl-16753052

ABSTRACT

BACKGROUND: The genome sequencing projects have shown our limited knowledge regarding gene function, e.g. S. cerevisiae has 5-6,000 genes of which nearly 1,000 have an uncertain function. Their gross influence on the behaviour of the cell can be observed using large-scale metabolomic studies. The metabolomic data produced need to be structured and annotated in a machine-usable form to facilitate the exploration of the hidden links between the genes and their functions. DESCRIPTION: MeMo is a formal model for representing metabolomic data and the associated metadata. Two predominant platforms (SQL and XML) are used to encode the model. MeMo has been implemented as a relational database using a hybrid approach combining the advantages of the two technologies. It represents a practical solution for handling the sheer volume and complexity of the metabolomic data effectively and efficiently. The MeMo model and the associated software are available at http://dbkgroup.org/memo/. CONCLUSION: The maturity of relational database technology is used to support efficient data processing. The scalability and self-descriptiveness of XML are used to simplify the relational schema and facilitate the extensibility of the model necessitated by the creation of new experimental techniques. Special consideration is given to data integration issues as part of the systems biology agenda. MeMo has been physically integrated and cross-linked to related metabolomic and genomic databases. Semantic integration with other relevant databases has been supported through ontological annotation. Compatibility with other data formats is supported by automatic conversion.

Subject(s)

Database Management Systems , Genomics/methods , Information Storage and Retrieval/methods , Metabolism/physiology , Models, Biological , Proteome/metabolism , Signal Transduction/physiology , Computer Simulation , Systems Integration , User-Computer Interface

maxdLoad2 and maxdBrowse: standards-compliant tools for microarray experimental annotation, data management and dissemination.

Hancock, David; Wilson, Michael; Velarde, Giles; Morrison, Norman; Hayes, Andrew; Hulme, Helen; Wood, A Joseph; Nashar, Karim; Kell, Douglas B; Brass, Andy.

BMC Bioinformatics ; 6: 264, 2005 Nov 03.

Article in English | MEDLINE | ID: mdl-16269077

ABSTRACT

BACKGROUND: maxdLoad2 is a relational database schema and Java application for microarray experimental annotation and storage. It is compliant with all standards for microarray meta-data capture, including the specification of what data should be recorded, extensive use of standard ontologies and support for data exchange formats. The output from maxdLoad2 is of a form acceptable for submission to the ArrayExpress microarray repository at the European Bioinformatics Institute. maxdBrowse is a PHP web-application that makes contents of maxdLoad2 databases accessible via web-browser, the command-line and web-service environments. It thus acts as both a dissemination and data-mining tool. RESULTS: maxdLoad2 presents an easy-to-use interface to an underlying relational database and provides a full complement of facilities for browsing, searching and editing. There is a tree-based visualization of data connectivity and the ability to explore the links between any pair of data elements, irrespective of how many intermediate links lie between them. Its principal novel features are: the flexibility of the meta-data that can be captured, the tools provided for importing data from spreadsheets and other tabular representations, the tools provided for the automatic creation of structured documents, and the ability to browse and access the data via web and web-services interfaces. Within maxdLoad2 it is very straightforward to customise the meta-data that is being captured or change the definitions of the meta-data. These meta-data definitions are stored within the database itself, allowing client software to connect properly to a modified database without having to be specially configured. The meta-data definitions (configuration file) can also be centralized, allowing changes made in response to revisions of standards or terminologies to be propagated to clients without user intervention. maxdBrowse is hosted on a web-server and presents multiple interfaces to the contents of maxd databases. maxdBrowse emulates many of the browse and search features available in the maxdLoad2 application via a web-browser. This allows users who are not familiar with maxdLoad2 to browse and export microarray data from the database for their own analysis. The same browse and search features are also available via command-line and SOAP server interfaces. This both enables scripting of data export for use embedded in data repositories and analysis environments, and allows access to the maxd databases via web-service architectures. CONCLUSION: maxdLoad2 http://www.bioinf.man.ac.uk/microarray/maxd/ and maxdBrowse http://dbk.ch.umist.ac.uk/maxdBrowse are portable and compatible with all common operating systems and major database servers. They provide a powerful, flexible package for annotation of microarray experiments and a convenient dissemination environment. They are available for download and open sourced under the Artistic License.

Subject(s)

Data Interpretation, Statistical , Information Dissemination/methods , Microarray Analysis/instrumentation , Software , Internet , Microarray Analysis/methods , User-Computer Interface

Formaldehyde dehydrogenase preparations from Methylococcus capsulatus (Bath) comprise methanol dehydrogenase and methylene tetrahydromethanopterin dehydrogenase.

Adeosun, Ekundayo K; Smith, Thomas J; Hoberg, Anne-Mette; Velarde, Giles; Ford, Robert; Dalton, Howard.

Microbiology (Reading) ; 150(Pt 3): 707-713, 2004 Mar.

Article in English | MEDLINE | ID: mdl-14993320

ABSTRACT

In methylotrophic bacteria, formaldehyde is an important but potentially toxic metabolic intermediate that can be assimilated into biomass or oxidized to yield energy. Previously reported was the purification of an NAD(P)(+)-dependent formaldehyde dehydrogenase (FDH) from the obligate methane-oxidizing methylotroph Methylococcus capsulatus (Bath), presumably important in formaldehyde oxidation, which required a heat-stable factor (known as the modifin) for FDH activity. Here, the major protein component of this FDH preparation was shown by biophysical techniques to comprise subunits of 64 and 8 kDa in an alpha(2)beta(2) arrangement. N-terminal sequencing of the subunits of FDH, together with enzymological characterization, showed that the alpha(2)beta(2) tetramer was a quinoprotein methanol dehydrogenase of the type found in other methylotrophs. The FDH preparations were shown to contain a highly active NAD(P)(+)-dependent methylene tetrahydromethanopterin dehydrogenase that was the probable source of the NAD(P)(+)-dependent formaldehyde oxidation activity. These results support previous findings that methylotrophs possess multiple pathways for formaldehyde dissimilation.

Subject(s)

Alcohol Oxidoreductases/isolation & purification , Aldehyde Oxidoreductases/isolation & purification , Methylococcus capsulatus/enzymology , Oxidoreductases Acting on CH-NH Group Donors/isolation & purification , Alcohol Oxidoreductases/genetics , Alcohol Oxidoreductases/metabolism , Aldehyde Oxidoreductases/genetics , Aldehyde Oxidoreductases/metabolism , Amino Acid Sequence , Macromolecular Substances , Methylococcus capsulatus/genetics , Molecular Sequence Data , Molecular Weight , Multienzyme Complexes/genetics , Multienzyme Complexes/isolation & purification , Multienzyme Complexes/metabolism , Oxidoreductases Acting on CH-NH Group Donors/genetics , Oxidoreductases Acting on CH-NH Group Donors/metabolism , Protein Subunits , Sequence Homology, Amino Acid

3D structure of the skeletal muscle dihydropyridine receptor.

Wang, Ming-Chuan; Velarde, Giles; Ford, Robert C; Berrow, Nicholas S; Dolphin, Annette C; Kitmitto, Ashraf.

J Mol Biol ; 323(1): 85-98, 2002 Oct 11.

Article in English | MEDLINE | ID: mdl-12368101

ABSTRACT

The dihydropyridine receptors (DHPR) are L-type voltage-gated calcium channels that regulate the flux of calcium ions across the cell membrane. Here we present the three-dimensional (3D) structure at approximately 27 Å resolution of purified skeletal muscle DHPR, as determined by electron microscopy and single particle analysis. Both biochemical and 3D structural data indicate that DHPR is dimeric. DHPR dimers are composed of two arch-shaped monomers approximately 210 Å across and approximately 75 Å thick, that interact very tightly at each end of the arch. The roughly toroidal structure of the two monomers encloses a cylindrical space of approximately 80 Å diameter, which is then closed on each side by two dome-shaped protein densities reaching over from each monomer arch. The dome-shaped domains have a length of approximately 80-90 Å and a maximum height of approximately 45 Å. Small orifices punctuate their exterior surface. The 3D structure disclosed here may have important implications for the understanding of DHPR Ca(2+) channel function. We also propose a model for its in vivo interactions with the calcium release channel at the junctional sarcoplasmic reticulum.

Subject(s)

Calcium Channels, L-Type/chemistry , Muscle, Skeletal/chemistry , Calcium Channels, L-Type/isolation & purification , Calcium Channels, L-Type/ultrastructure , Electrophoresis, Polyacrylamide Gel , Microscopy, Electron , Protein Conformation , Structure-Activity Relationship
