Search | VHL Regional Portal

Improved de novo peptide sequencing using LC retention time information.

Frank, Yves; Hruz, Tomas; Tschager, Thomas; Venzin, Valentin.

Algorithms Mol Biol ; 13: 14, 2018.

Article in English | MEDLINE | ID: mdl-30181767

ABSTRACT

BACKGROUND: Liquid chromatography combined with tandem mass spectrometry is an important tool in proteomics for peptide identification. Liquid chromatography temporally separates the peptides in a sample. The peptides that elute one after another are analyzed via tandem mass spectrometry by measuring the mass-to-charge ratio of a peptide and its fragments. De novo peptide sequencing is the problem of reconstructing the amino acid sequences of a peptide from this measurement data. Past de novo sequencing algorithms solely consider the mass spectrum of the fragments for reconstructing a sequence. RESULTS: We propose to additionally exploit the information obtained from liquid chromatography. We study the problem of computing a sequence that is not only in accordance with the experimental mass spectrum, but also with the chromatographic retention time. We consider three models for predicting the retention time and develop algorithms for de novo sequencing for each model. CONCLUSIONS: Based on an evaluation for two prediction models on experimental data from synthesized peptides we conclude that the identification rates are improved by exploiting the chromatographic information. In our evaluation, we compare our algorithms using the retention time information with algorithms using the same scoring model, but not the retention time.
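The idea of checking a candidate sequence against the retention time as well as the mass can be sketched on a toy scale. The sketch below is an illustration only, not the algorithms of the paper: it enumerates candidates by brute force, matches only the total peptide mass rather than a fragment spectrum, and assumes a simple additive retention-time model. The residue masses are standard monoisotopic values; the retention-time coefficients in RT_COEFF and all function names are hypothetical.

```python
from itertools import product

# Standard monoisotopic residue masses (Da) for a small alphabet.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203,
                "V": 99.06841, "L": 113.08406}
WATER = 18.01056  # mass of H2O added to the residue sum

# Hypothetical per-residue retention-time coefficients (arbitrary units).
RT_COEFF = {"G": 0.0, "A": 1.0, "S": -0.5, "V": 4.0, "L": 8.0}


def peptide_mass(seq):
    """Mass of a peptide as the sum of its residue masses plus water."""
    return sum(RESIDUE_MASS[a] for a in seq) + WATER


def predicted_rt(seq):
    """Assumed additive model: retention time is the sum of coefficients."""
    return sum(RT_COEFF[a] for a in seq)


def candidates(target_mass, length, mass_tol=0.01):
    """Brute-force list of sequences of the given length matching the mass."""
    alphabet = sorted(RESIDUE_MASS)
    return ["".join(p) for p in product(alphabet, repeat=length)
            if abs(peptide_mass(p) - target_mass) <= mass_tol]


def filter_by_rt(cands, observed_rt, rt_tol):
    """Keep only candidates whose predicted retention time fits the observation."""
    return [s for s in cands if abs(predicted_rt(s) - observed_rt) <= rt_tol]


mass_only = candidates(peptide_mass("GL"), 2)
with_rt = filter_by_rt(mass_only, observed_rt=8.0, rt_tol=1.0)
print(mass_only, with_rt)
```

In this toy case GL and AV have the same mass, so the mass alone leaves four candidates; the assumed retention-time model removes AV and VA. An additive model cannot distinguish GL from LG, which is one reason order-dependent prediction models are of interest.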

ExpressionData - A public resource of high quality curated datasets representing gene expression across anatomy, development and experimental conditions.

Zimmermann, Philip; Bleuler, Stefan; Laule, Oliver; Martin, Florian; Ivanov, Nikolai V; Campanoni, Prisca; Oishi, Karen; Lugon-Moulin, Nicolas; Wyss, Markus; Hruz, Tomas; Gruissem, Wilhelm.

BioData Min ; 7: 18, 2014.

Article in English | MEDLINE | ID: mdl-25228922

ABSTRACT

Reference datasets are often used to compare, interpret or validate experimental data and analytical methods. In the field of gene expression, several reference datasets have been published. Typically, they consist of individual baseline or spike-in experiments carried out in a single laboratory and representing a particular set of conditions. Here, we describe a new type of standardized datasets representative for the spatial and temporal dimensions of gene expression. They result from integrating expression data from a large number of globally normalized and quality controlled public experiments. Expression data is aggregated by anatomical part or stage of development to yield a representative transcriptome for each category. For example, we created a genome-wide expression dataset representing the FDA tissue panel across 35 tissue types. The proposed datasets were created for human and several model organisms and are publicly available at http://www.expressiondata.org.

Global regulatory architecture of human, mouse and rat tissue transcriptomes.

Prasad, Ajay; Kumar, Suchitra Suresh; Dessimoz, Christophe; Bleuler, Stefan; Laule, Oliver; Hruz, Tomas; Gruissem, Wilhelm; Zimmermann, Philip.

BMC Genomics ; 14: 716, 2013 Oct 20.

Article in English | MEDLINE | ID: mdl-24138449

ABSTRACT

BACKGROUND: Predicting molecular responses in human by extrapolating results from model organisms requires a precise understanding of the architecture and regulation of biological mechanisms across species. RESULTS: Here, we present a large-scale comparative analysis of organ and tissue transcriptomes involving the three mammalian species human, mouse and rat. To this end, we created a unique, highly standardized compendium of tissue expression. Representative tissue specific datasets were aggregated from more than 33,900 Affymetrix expression microarrays. For each organism, we created two expression datasets covering over 55 distinct tissue types with curated data from two independent microarray platforms. Principal component analysis (PCA) revealed that the tissue-specific architecture of transcriptomes is highly conserved between human, mouse and rat. Moreover, tissues with related biological function clustered tightly together, even if the underlying data originated from different labs and experimental settings. Overall, the expression variance caused by tissue type was approximately 10 times higher than the variance caused by perturbations or diseases, except for a subset of cancers and chemicals. Pairs of gene orthologs exhibited higher expression correlation between mouse and rat than with human. Finally, we show evidence that tissue expression profiles, if combined with sequence similarity, can improve the correct assignment of functionally related homologs across species. CONCLUSION: The results demonstrate that tissue-specific regulation is the main determinant of transcriptome composition and is highly conserved across mammalian species.

Subject(s)

Transcriptome , Animals , Cluster Analysis , Genome , Genome, Human , Humans , Mice , Multigene Family , Oligonucleotide Array Sequence Analysis , Principal Component Analysis , Rats , Species Specificity

A multilevel gamma-clustering layout algorithm for visualization of biological networks.

Hruz, Tomas; Wyss, Markus; Lucas, Christoph; Laule, Oliver; von Rohr, Peter; Zimmermann, Philip; Bleuler, Stefan.

Adv Bioinformatics ; 2013: 920325, 2013.

Article in English | MEDLINE | ID: mdl-23864855

ABSTRACT

Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has a potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation which allows keeping the overview over complex graphs with dense subgraphs.

RefGenes: identification of reliable and condition specific reference genes for RT-qPCR data normalization.

Hruz, Tomas; Wyss, Markus; Docquier, Mylene; Pfaffl, Michael W; Masanetz, Sabine; Borghi, Lorenzo; Verbrugghe, Phebe; Kalaydjieva, Luba; Bleuler, Stefan; Laule, Oliver; Descombes, Patrick; Gruissem, Wilhelm; Zimmermann, Philip.

BMC Genomics ; 12: 156, 2011 Mar 21.

Article in English | MEDLINE | ID: mdl-21418615

ABSTRACT

BACKGROUND: RT-qPCR is a sensitive and increasingly used method for gene expression quantification. To normalize RT-qPCR measurements between samples, most laboratories use endogenous reference genes as internal controls. There is increasing evidence, however, that the expression of commonly used reference genes can vary significantly in certain contexts. RESULTS: Using the Genevestigator database of normalized and well-annotated microarray experiments, we describe the expression stability characteristics of the transcriptomes of several organisms. The results show that a) no genes are universally stable, b) most commonly used reference genes yield very high transcript abundances as compared to the entire transcriptome, and c) for each biological context a subset of stable genes exists that has smaller variance than commonly used reference genes or genes that were selected for their stability across all conditions. CONCLUSION: We therefore propose the normalization of RT-qPCR data using reference genes that are specifically chosen for the conditions under study. RefGenes is a community tool developed for that purpose. Validation RT-qPCR experiments across several organisms showed that the candidates proposed by RefGenes generally outperformed commonly used reference genes. RefGenes is available within Genevestigator at http://www.genevestigator.com.

Subject(s)

Gene Expression Profiling/methods , Reverse Transcriptase Polymerase Chain Reaction/standards , Software , Algorithms , Animals , Arabidopsis/genetics , Cattle , Computational Biology/methods , Databases, Genetic , Female , Gene Expression Profiling/standards , Humans , Mice , Oligonucleotide Array Sequence Analysis , Reference Standards , Reverse Transcriptase Polymerase Chain Reaction/methods , Swine , User-Computer Interface
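Point (c) of the RefGenes abstract above, that the most stable genes depend on the biological context, can be illustrated with a small ranking sketch. This is not the RefGenes implementation: the expression values, gene and sample names, and the function most_stable_genes are all hypothetical, and stability is assumed here to mean a small standard deviation of log-scale expression across the selected samples.

```python
from statistics import pstdev

# Hypothetical log2 expression values, indexed by gene and then by sample.
EXPRESSION = {
    "geneA": {"leaf1": 10.0, "leaf2": 10.1, "root1": 13.0, "root2": 7.0},
    "geneB": {"leaf1": 9.0, "leaf2": 10.0, "root1": 9.5, "root2": 9.5},
    "geneC": {"leaf1": 5.0, "leaf2": 8.0, "root1": 6.0, "root2": 11.0},
}


def most_stable_genes(expression, samples, top_n=2):
    """Rank genes by standard deviation over the chosen samples (smallest first)."""
    scored = []
    for gene, values in expression.items():
        subset = [values[s] for s in samples]
        scored.append((pstdev(subset), gene))
    scored.sort()
    return [gene for _, gene in scored[:top_n]]


leaf_only = most_stable_genes(EXPRESSION, ["leaf1", "leaf2"], top_n=1)
all_samples = most_stable_genes(
    EXPRESSION, ["leaf1", "leaf2", "root1", "root2"], top_n=1)
print(leaf_only, all_samples)
```

With these made-up values, geneB is the most stable gene across all samples, but geneA is more stable when only the leaf samples are considered, which mirrors the argument for choosing reference genes per condition.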

Higher-order distributions and nongrowing complex networks without multiple connections.

Hruz, Tomas; Natora, Michal; Agrawal, Madhuresh.

Phys Rev E Stat Nonlin Soft Matter Phys ; 77(4 Pt 2): 046101, 2008 Apr.

Article in English | MEDLINE | ID: mdl-18517684

ABSTRACT

We study stochastic processes that generate nongrowing complex networks without self-loops and multiple edges (simple graphs). The work concentrates on understanding and formulation of constraints which keep the rewiring stochastic processes within the class of simple graphs. To formulate these constraints a different concept of wedge distribution (paths of length 2) is introduced and its relation to degree-degree correlation is studied. The analysis shows that the constraints, together with edge selection rules, do not even allow the formulation of a closed master equation in the general case. We also introduce a particular stochastic process which does not contain edge selection rules, but which, we believe, can provide some insight into the complexities of simple graphs.
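The constraint of staying within the class of simple graphs can be made concrete with a small sketch. The move used below is the standard degree-preserving double-edge swap, chosen only for illustration; it is not necessarily the rewiring process analyzed in the paper. The function names and the uniform random choice of edge pairs are assumptions.

```python
import random


def norm(u, v):
    """Store an undirected edge as a sorted tuple."""
    return (u, v) if u < v else (v, u)


def try_swap(edges, e1, e2):
    """Replace (a,b),(c,d) by (a,d),(c,b) if the graph stays simple.

    Returns True if the swap was applied, False if it was rejected.
    """
    if e1 == e2:
        return False
    a, b = e1
    c, d = e2
    if a == d or c == b:  # the move would create a self-loop
        return False
    new1, new2 = norm(a, d), norm(c, b)
    if new1 in edges or new2 in edges:  # the move would create a multiple edge
        return False
    edges.difference_update({e1, e2})
    edges.update({new1, new2})
    return True


def rewire(edges, steps, seed=0):
    """Apply a number of random swap attempts to a copy of the edge set."""
    rng = random.Random(seed)
    edges = set(edges)
    for _ in range(steps):
        e1, e2 = rng.sample(sorted(edges), 2)
        try_swap(edges, e1, e2)
    return edges


cycle = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)}
print(sorted(rewire(cycle, steps=50, seed=1)))
```

The two rejection tests are exactly the constraints that keep the process inside the simple graphs: the number of nodes and edges never changes (the network is nongrowing), and every accepted move leaves the graph free of self-loops and multiple edges.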

Genevestigator transcriptome meta-analysis and biomarker search using rice and barley gene expression databases.

Zimmermann, Philip; Laule, Oliver; Schmitz, Josy; Hruz, Tomas; Bleuler, Stefan; Gruissem, Wilhelm.

Mol Plant ; 1(5): 851-7, 2008 Sep.

Article in English | MEDLINE | ID: mdl-19825587

ABSTRACT

The widespread use of microarray technologies to study plant transcriptomes has led to important discoveries and to an accumulation of profiling data covering a wide range of different tissues, developmental stages, perturbations, and genotypes. Querying a large number of microarray experiments can provide insights that cannot be gained by analyzing single experiments. However, such a meta-analysis poses significant challenges with respect to data comparability and normalization, systematic sample annotation, and analysis tools. Genevestigator addresses these issues using a large curated expression database and a set of specifically developed analysis tools that are accessible over the internet. This combination has already proven to be useful in the area of plant research based on a large set of Arabidopsis data (Grennan, 2006). Here, we present the release of the Genevestigator rice and barley gene expression databases that contain quality-controlled microarray experiments that are well annotated using ontologies. The databases currently comprise experiments from pathology, plant nutrition, abiotic stress, hormone treatment, genotype, and spatial or temporal analysis, but are expected to cover a broad variety of research areas as more experimental data become available. The transcriptome meta-analysis of the model species rice and barley is expected to deliver results that can be used for functional genomics and biotechnological applications in cereals.

Subject(s)

Biomarkers/analysis , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling , Hordeum/genetics , Oryza/genetics , Software , Arabidopsis/genetics , Gene Expression Regulation, Plant , Oligonucleotide Array Sequence Analysis

Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes.

Hruz, Tomas; Laule, Oliver; Szabo, Gabor; Wessendorp, Frans; Bleuler, Stefan; Oertle, Lukas; Widmayer, Peter; Gruissem, Wilhelm; Zimmermann, Philip.

Adv Bioinformatics ; 2008: 420747, 2008.

Article in English | MEDLINE | ID: mdl-19956698

ABSTRACT

The Web-based software tool Genevestigator provides powerful tools for biologists to explore gene expression across a wide variety of biological contexts. Its first releases, however, were limited by the scaling ability of the system architecture, multiorganism data storage and analysis capability, and availability of computationally intensive analysis methods. Genevestigator V3 is a novel meta-analysis system resulting from new algorithmic and software development using a client/server architecture, large-scale manual curation and quality control of microarray data for several organisms, and curation of pathway data for mouse and Arabidopsis. In addition to improved querying features, Genevestigator V3 provides new tools to analyze the expression of genes in many different contexts, to identify biomarker genes, to cluster genes into expression modules, and to model expression responses in the context of metabolic and regulatory networks. Being a reference expression database with user-friendly tools, Genevestigator V3 facilitates discovery research and hypothesis validation.

Web-based analysis of the mouse transcriptome using Genevestigator.

Laule, Oliver; Hirsch-Hoffmann, Matthias; Hruz, Tomas; Gruissem, Wilhelm; Zimmermann, Philip.

BMC Bioinformatics ; 7: 311, 2006 Jun 21.

Article in English | MEDLINE | ID: mdl-16790046

ABSTRACT

BACKGROUND: Gene function analysis often requires a complex and laborious sequence of laboratory and computer-based experiments. Choosing an effective experimental design generally results from hypotheses derived from prior knowledge or experimentation. Knowledge obtained from meta-analyzing compendia of expression data with annotation libraries can provide significant clues in understanding gene and network function, resulting in better hypotheses that can be tested in the laboratory. DESCRIPTION: Genevestigator is a microarray database and analysis system allowing context-driven queries. Simple but powerful tools allow biologists with little computational background to retrieve information about when, where and how genes are expressed. We manually curated and quality-controlled 3110 mouse Affymetrix arrays from public repositories. Data queries can be run against an annotation library comprising 160 anatomy categories, 12 developmental stage groups, 80 stimuli, and 182 genetic backgrounds or modifications. The quality of results obtained through Genevestigator is illustrated by a number of biological scenarios that are substantiated by other types of experimentation in the literature. CONCLUSION: The Genevestigator-Mouse database effectively provides biologically meaningful results and can be accessed at https://www.genevestigator.ethz.ch.

Subject(s)

Databases, Genetic , Information Storage and Retrieval , Internet , Transcription, Genetic , Animals , Database Management Systems , Gene Expression Profiling , Mice , Oligonucleotide Array Sequence Analysis/methods , Reproducibility of Results , User-Computer Interface

