Search | BVS Regional Portal

Improved de novo peptide sequencing using LC retention time information.

Frank, Yves; Hruz, Tomas; Tschager, Thomas; Venzin, Valentin.

Algorithms Mol Biol ; 13: 14, 2018.

Article in English | MEDLINE | ID: mdl-30181767

ABSTRACT

BACKGROUND: Liquid chromatography combined with tandem mass spectrometry is an important tool in proteomics for peptide identification. Liquid chromatography temporally separates the peptides in a sample. The peptides that elute one after another are analyzed via tandem mass spectrometry by measuring the mass-to-charge ratio of a peptide and its fragments. De novo peptide sequencing is the problem of reconstructing the amino acid sequence of a peptide from this measurement data. Past de novo sequencing algorithms solely consider the mass spectrum of the fragments for reconstructing a sequence. RESULTS: We propose to additionally exploit the information obtained from liquid chromatography. We study the problem of computing a sequence that is not only in accordance with the experimental mass spectrum, but also with the chromatographic retention time. We consider three models for predicting the retention time and develop algorithms for de novo sequencing for each model. CONCLUSIONS: Based on an evaluation for two prediction models on experimental data from synthesized peptides we conclude that the identification rates are improved by exploiting the chromatographic information. In our evaluation, we compare our algorithms using the retention time information with algorithms using the same scoring model, but not the retention time.
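
The idea of constraining de novo sequencing by retention time can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simple additive retention time model (one coefficient per residue, summed over the sequence), a toy four-letter alphabet with integer masses, and a score that merely counts prefix masses found among the peaks. The names `MASS`, `RT_COEF`, and `best_sequence` and all numeric values are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy alphabet: integer residue masses and integer
# retention-time coefficients (illustrative values only).
MASS = {"G": 57, "A": 71, "S": 87, "V": 99}
RT_COEF = {"G": -1, "A": 1, "S": 0, "V": 4}


def best_sequence(total_mass, peaks, rt_observed, rt_tol):
    """Return (score, sequence) maximizing the number of matched prefix
    masses among sequences of the given total mass whose predicted
    retention time lies within rt_tol of rt_observed; None if no
    such sequence exists."""
    peaks = frozenset(peaks)

    @lru_cache(maxsize=None)
    def solve(mass, rt):
        # Best completion of a prefix with this mass and summed RT coefficient.
        if mass == total_mass:
            return (0, "") if abs(rt - rt_observed) <= rt_tol else None
        best = None
        for aa, m in MASS.items():
            nxt = mass + m
            if nxt > total_mass:
                continue
            tail = solve(nxt, rt + RT_COEF[aa])
            if tail is None:
                continue
            # A prefix mass strictly inside the peptide earns one point
            # when it coincides with an observed peak.
            gain = 1 if (nxt in peaks and nxt != total_mass) else 0
            cand = (tail[0] + gain, aa + tail[1])
            if best is None or cand[0] > best[0]:
                best = cand
        return best

    return solve(0, 0)
```

In this toy setting the sequences AAA and GGV have the same total mass (213) and each matches one of the peaks {71, 114}, so the spectrum score alone cannot separate them. Calling `best_sequence(213, {71, 114}, 3, 0)` selects AAA and `best_sequence(213, {71, 114}, 2, 0)` selects GGV, because only one of the two agrees with the observed retention time in each case.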

ExpressionData - A public resource of high quality curated datasets representing gene expression across anatomy, development and experimental conditions.

Zimmermann, Philip; Bleuler, Stefan; Laule, Oliver; Martin, Florian; Ivanov, Nikolai V; Campanoni, Prisca; Oishi, Karen; Lugon-Moulin, Nicolas; Wyss, Markus; Hruz, Tomas; Gruissem, Wilhelm.

BioData Min ; 7: 18, 2014.

Article in English | MEDLINE | ID: mdl-25228922

ABSTRACT

Reference datasets are often used to compare, interpret or validate experimental data and analytical methods. In the field of gene expression, several reference datasets have been published. Typically, they consist of individual baseline or spike-in experiments carried out in a single laboratory and representing a particular set of conditions. Here, we describe a new type of standardized datasets representative for the spatial and temporal dimensions of gene expression. They result from integrating expression data from a large number of globally normalized and quality controlled public experiments. Expression data is aggregated by anatomical part or stage of development to yield a representative transcriptome for each category. For example, we created a genome-wide expression dataset representing the FDA tissue panel across 35 tissue types. The proposed datasets were created for human and several model organisms and are publicly available at http://www.expressiondata.org.
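
The aggregation step described above can be sketched as follows. The abstract does not state which summary statistic is used, so the arithmetic mean here is an assumption, and the function name `aggregate_by_category` and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_category(samples):
    """samples: iterable of (category, {gene: expression}) pairs, where
    category is e.g. an anatomical part or a developmental stage.
    Returns {category: {gene: mean expression}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for category, profile in samples:
        for gene, value in profile.items():
            grouped[category][gene].append(value)
    # Assumption: the representative profile is the per-gene mean.
    return {
        category: {gene: mean(values) for gene, values in genes.items()}
        for category, genes in grouped.items()
    }
```

Each category ends up with a single representative profile regardless of how many experiments contributed to it.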

Global regulatory architecture of human, mouse and rat tissue transcriptomes.

Prasad, Ajay; Kumar, Suchitra Suresh; Dessimoz, Christophe; Bleuler, Stefan; Laule, Oliver; Hruz, Tomas; Gruissem, Wilhelm; Zimmermann, Philip.

BMC Genomics ; 14: 716, 2013 Oct 20.

Article in English | MEDLINE | ID: mdl-24138449

ABSTRACT

BACKGROUND: Predicting molecular responses in human by extrapolating results from model organisms requires a precise understanding of the architecture and regulation of biological mechanisms across species. RESULTS: Here, we present a large-scale comparative analysis of organ and tissue transcriptomes involving the three mammalian species human, mouse and rat. To this end, we created a unique, highly standardized compendium of tissue expression. Representative tissue specific datasets were aggregated from more than 33,900 Affymetrix expression microarrays. For each organism, we created two expression datasets covering over 55 distinct tissue types with curated data from two independent microarray platforms. Principal component analysis (PCA) revealed that the tissue-specific architecture of transcriptomes is highly conserved between human, mouse and rat. Moreover, tissues with related biological function clustered tightly together, even if the underlying data originated from different labs and experimental settings. Overall, the expression variance caused by tissue type was approximately 10 times higher than the variance caused by perturbations or diseases, except for a subset of cancers and chemicals. Pairs of gene orthologs exhibited higher expression correlation between mouse and rat than with human. Finally, we show evidence that tissue expression profiles, if combined with sequence similarity, can improve the correct assignment of functionally related homologs across species. CONCLUSION: The results demonstrate that tissue-specific regulation is the main determinant of transcriptome composition and is highly conserved across mammalian species.

Subjects

Transcriptome; Animals; Cluster Analysis; Genome; Genome, Human; Humans; Mice; Multigene Family; Oligonucleotide Array Sequence Analysis; Principal Component Analysis; Rats; Species Specificity

A multilevel gamma-clustering layout algorithm for visualization of biological networks.

Hruz, Tomas; Wyss, Markus; Lucas, Christoph; Laule, Oliver; von Rohr, Peter; Zimmermann, Philip; Bleuler, Stefan.

Adv Bioinformatics ; 2013: 920325, 2013.

Article in English | MEDLINE | ID: mdl-23864855

ABSTRACT

Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has a potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation which allows keeping the overview over complex graphs with dense subgraphs.

RefGenes: identification of reliable and condition specific reference genes for RT-qPCR data normalization.

Hruz, Tomas; Wyss, Markus; Docquier, Mylene; Pfaffl, Michael W; Masanetz, Sabine; Borghi, Lorenzo; Verbrugghe, Phebe; Kalaydjieva, Luba; Bleuler, Stefan; Laule, Oliver; Descombes, Patrick; Gruissem, Wilhelm; Zimmermann, Philip.

BMC Genomics ; 12: 156, 2011 Mar 21.

Article in English | MEDLINE | ID: mdl-21418615

ABSTRACT

BACKGROUND: RT-qPCR is a sensitive and increasingly used method for gene expression quantification. To normalize RT-qPCR measurements between samples, most laboratories use endogenous reference genes as internal controls. There is increasing evidence, however, that the expression of commonly used reference genes can vary significantly in certain contexts. RESULTS: Using the Genevestigator database of normalized and well-annotated microarray experiments, we describe the expression stability characteristics of the transcriptomes of several organisms. The results show that a) no genes are universally stable, b) most commonly used reference genes yield very high transcript abundances as compared to the entire transcriptome, and c) for each biological context a subset of stable genes exists that has smaller variance than commonly used reference genes or genes that were selected for their stability across all conditions. CONCLUSION: We therefore propose the normalization of RT-qPCR data using reference genes that are specifically chosen for the conditions under study. RefGenes is a community tool developed for that purpose. Validation RT-qPCR experiments across several organisms showed that the candidates proposed by RefGenes generally outperformed commonly used reference genes. RefGenes is available within Genevestigator at http://www.genevestigator.com.
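
Condition-specific selection of reference genes can be sketched in a few lines. This is not the RefGenes implementation: it assumes that stability is measured by the population standard deviation of expression over the samples belonging to the condition under study, and the function name `rank_stable_genes` and the data layout are hypothetical.

```python
from statistics import pstdev


def rank_stable_genes(expression, sample_ids, top_n=3):
    """expression: {gene: {sample_id: value}}.
    Rank genes by the standard deviation of their expression over the
    given samples only, smallest first, and return the top_n names."""
    scores = []
    for gene, by_sample in expression.items():
        values = [by_sample[s] for s in sample_ids if s in by_sample]
        if len(values) >= 2:  # variability needs at least two samples
            scores.append((pstdev(values), gene))
    scores.sort()
    return [gene for _, gene in scores[:top_n]]
```

Because the ranking is computed over a chosen subset of samples, the same gene can be the most stable candidate in one biological context and a poor one in another, which is the point made in result (c) of the abstract.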

Subjects

Gene Expression Profiling/methods; Reverse Transcriptase Polymerase Chain Reaction/standards; Software; Algorithms; Animals; Arabidopsis/genetics; Cattle; Computational Biology/methods; Databases, Genetic; Female; Gene Expression Profiling/standards; Humans; Mice; Oligonucleotide Array Sequence Analysis; Reference Standards; Reverse Transcriptase Polymerase Chain Reaction/methods; Swine; User-Computer Interface

Higher-order distributions and nongrowing complex networks without multiple connections.

Hruz, Tomas; Natora, Michal; Agrawal, Madhuresh.

Phys Rev E Stat Nonlin Soft Matter Phys ; 77(4 Pt 2): 046101, 2008 Apr.

Article in English | MEDLINE | ID: mdl-18517684

ABSTRACT

We study stochastic processes that generate nongrowing complex networks without self-loops and multiple edges (simple graphs). The work concentrates on understanding and formulation of constraints which keep the rewiring stochastic processes within the class of simple graphs. To formulate these constraints a different concept of wedge distribution (paths of length 2) is introduced and its relation to degree-degree correlation is studied. The analysis shows that the constraints, together with edge selection rules, do not even allow the formulation of a closed master equation in the general case. We also introduce a particular stochastic process which does not contain edge selection rules, but which, we believe, can provide some insight into the complexities of simple graphs.
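
The constraint that a rewiring process must stay within the class of simple graphs can be illustrated with a generic rewiring step. This is not the particular stochastic process studied in the paper: it is a plain sketch that rejects any move that would create a self-loop or a multiple edge, with a uniform choice of edge and target node assumed for simplicity. The names `norm` and `rewire_step` are hypothetical.

```python
import random


def norm(a, b):
    """Store an undirected edge as an ordered pair."""
    return (a, b) if a < b else (b, a)


def rewire_step(edges, nodes, rng):
    """Try to move one endpoint of a random edge to a random node.
    edges: set of normalized pairs, modified in place; nodes: list.
    Returns True if the move was applied, False if it was rejected."""
    old = rng.choice(sorted(edges))
    # Keep one endpoint (chosen at random) and detach the other.
    keep = old[0] if rng.random() < 0.5 else old[1]
    w = rng.choice(nodes)
    new = norm(keep, w)
    if w == keep or new in edges:
        return False  # would create a self-loop or a multiple edge
    edges.remove(old)
    edges.add(new)
    return True
```

The numbers of nodes and edges never change, so the network is nongrowing, and every accepted move leaves a simple graph. The rejection rule is what couples the process to the current graph structure.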

Genevestigator transcriptome meta-analysis and biomarker search using rice and barley gene expression databases.

Zimmermann, Philip; Laule, Oliver; Schmitz, Josy; Hruz, Tomas; Bleuler, Stefan; Gruissem, Wilhelm.

Mol Plant ; 1(5): 851-7, 2008 Sep.

Article in English | MEDLINE | ID: mdl-19825587

ABSTRACT

The wide-spread use of microarray technologies to study plant transcriptomes has led to important discoveries and to an accumulation of profiling data covering a wide range of different tissues, developmental stages, perturbations, and genotypes. Querying a large number of microarray experiments can provide insights that cannot be gained by analyzing single experiments. However, such a meta-analysis poses significant challenges with respect to data comparability and normalization, systematic sample annotation, and analysis tools. Genevestigator addresses these issues using a large curated expression database and a set of specifically developed analysis tools that are accessible over the internet. This combination has already proven to be useful in the area of plant research based on a large set of Arabidopsis data (Grennan, 2006). Here, we present the release of the Genevestigator rice and barley gene expression databases that contain quality-controlled and well annotated microarray experiments using ontologies. The databases currently comprise experiments from pathology, plant nutrition, abiotic stress, hormone treatment, genotype, and spatial or temporal analysis, but are expected to cover a broad variety of research areas as more experimental data become available. The transcriptome meta-analysis of the model species rice and barley is expected to deliver results that can be used for functional genomics and biotechnological applications in cereals.

Subjects

Biomarkers/analysis; Computational Biology/methods; Databases, Genetic; Gene Expression Profiling; Hordeum/genetics; Oryza/genetics; Software; Arabidopsis/genetics; Gene Expression Regulation, Plant; Oligonucleotide Array Sequence Analysis

Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes.

Hruz, Tomas; Laule, Oliver; Szabo, Gabor; Wessendorp, Frans; Bleuler, Stefan; Oertle, Lukas; Widmayer, Peter; Gruissem, Wilhelm; Zimmermann, Philip.

Adv Bioinformatics ; 2008: 420747, 2008.

Article in English | MEDLINE | ID: mdl-19956698

ABSTRACT

The Web-based software tool Genevestigator provides powerful tools for biologists to explore gene expression across a wide variety of biological contexts. Its first releases, however, were limited by the scaling ability of the system architecture, multiorganism data storage and analysis capability, and availability of computationally intensive analysis methods. Genevestigator V3 is a novel meta-analysis system resulting from new algorithmic and software development using a client/server architecture, large-scale manual curation and quality control of microarray data for several organisms, and curation of pathway data for mouse and Arabidopsis. In addition to improved querying features, Genevestigator V3 provides new tools to analyze the expression of genes in many different contexts, to identify biomarker genes, to cluster genes into expression modules, and to model expression responses in the context of metabolic and regulatory networks. Being a reference expression database with user-friendly tools, Genevestigator V3 facilitates discovery research and hypothesis validation.

Web-based analysis of the mouse transcriptome using Genevestigator.

Laule, Oliver; Hirsch-Hoffmann, Matthias; Hruz, Tomas; Gruissem, Wilhelm; Zimmermann, Philip.

BMC Bioinformatics ; 7: 311, 2006 Jun 21.

Article in English | MEDLINE | ID: mdl-16790046

ABSTRACT

BACKGROUND: Gene function analysis often requires a complex and laborious sequence of laboratory and computer-based experiments. Choosing an effective experimental design generally results from hypotheses derived from prior knowledge or experimentation. Knowledge obtained from meta-analyzing compendia of expression data with annotation libraries can provide significant clues in understanding gene and network function, resulting in better hypotheses that can be tested in the laboratory. DESCRIPTION: Genevestigator is a microarray database and analysis system allowing context-driven queries. Simple but powerful tools allow biologists with little computational background to retrieve information about when, where and how genes are expressed. We manually curated and quality-controlled 3110 mouse Affymetrix arrays from public repositories. Data queries can be run against an annotation library comprising 160 anatomy categories, 12 developmental stage groups, 80 stimuli, and 182 genetic backgrounds or modifications. The quality of results obtained through Genevestigator is illustrated by a number of biological scenarios that are substantiated by other types of experimentation in the literature. CONCLUSION: The Genevestigator-Mouse database effectively provides biologically meaningful results and can be accessed at https://www.genevestigator.ethz.ch.

Subjects

Databases, Genetic; Information Storage and Retrieval; Internet; Transcription, Genetic; Animals; Database Management Systems; Gene Expression Profiling; Mice; Oligonucleotide Array Sequence Analysis/methods; Reproducibility of Results; User-Computer Interface
