Results 1 - 5 of 5
1.
J Biomed Inform ; 43(5): 709-15, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20435161

ABSTRACT

The Gene Expression Omnibus (GEO) is the largest resource of public gene expression data. While GEO enables data browsing, query, and retrieval, additional tools can help realize its potential for aggregating and comparing data across multiple studies and platforms. This paper describes DSGeo, a collection of tools developed for annotating, aggregating, integrating, and analyzing data deposited in GEO. The core set of tools includes a Relational Database, a Data Loader, a Data Browser, and an Expression Combiner and Analyzer. The application enables querying for specific sample characteristics and identifying studies containing samples that match the query. The Expression Combiner application normalizes and aggregates data from these samples and returns the data to the user after filtering according to the user's preferences. The Expression Analyzer allows simple statistical comparisons between groups of data. This seamless integration makes annotated cross-platform data directly available for analysis.
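
The normalize-then-aggregate step described for the Expression Combiner can be illustrated with a minimal sketch. The abstract does not say which normalization DSGeo applies, so the per-sample z-score used here is an assumption, and the function names `zscore_sample` and `combine_samples` are hypothetical.

```python
from statistics import mean, pstdev


def zscore_sample(sample):
    """Rescale one sample (gene -> expression value) to mean 0, std 1.

    Assumption: a per-sample z-score stands in for whatever
    normalization the real tool uses; the abstract does not specify it.
    """
    values = list(sample.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("sample has no variance; cannot z-score")
    return {gene: (value - mu) / sigma for gene, value in sample.items()}


def combine_samples(samples):
    """Normalize each sample, then keep only genes present in all of them.

    Returns gene -> list of normalized values, one per input sample,
    in input order.
    """
    if not samples:
        raise ValueError("need at least one sample")
    normalized = [zscore_sample(s) for s in samples]
    shared = set(normalized[0]).intersection(*normalized[1:])
    return {gene: [s[gene] for s in normalized] for gene in sorted(shared)}


# Two hypothetical samples measured on different platforms and scales.
samples = [
    {"TP53": 1.0, "BRCA1": 3.0},
    {"TP53": 10.0, "BRCA1": 30.0, "MYC": 20.0},
]
combined = combine_samples(samples)
```

Restricting the result to shared genes is one simple way to make cross-platform values comparable; a real pipeline would also need to map platform-specific probes to genes first.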


Subjects
Gene Expression Profiling; Information Storage and Retrieval; Medical Informatics Applications; Software; Animals; Computational Biology; Computer Communication Networks; Humans; User-Computer Interface
2.
Summit Transl Bioinform ; 2010: 25-9, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-21347141

ABSTRACT

Microarray probes and reads from massively parallel sequencing technologies are the two most widely used genomic tags for transcriptome studies. Names and underlying technologies might differ, but expression technologies share a common objective: to obtain mRNA abundance values at the gene level with high sensitivity and specificity. However, the initial tag annotation becomes obsolete as more insight is gained into biological references (genome, transcriptome, SNP, etc.). While novel alignment algorithms for short reads are being released every month, solutions for rapid annotation of tags are rare. We have developed a generic matching algorithm that uses genomic positions for rapid custom annotation of tags, with a time complexity of O(n log n). We demonstrate our algorithm on the custom annotation of Illumina massively parallel sequencing reads and Affymetrix microarray probes, and on the identification of alternatively spliced regions.
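
One common way to reach O(n log n) for position-based annotation is to sort the reference features once and binary-search each tag against them. The sketch below shows that general technique; it is not the authors' published algorithm, and it assumes non-overlapping features per chromosome with inclusive coordinates. The name `annotate_tags` and the example data are hypothetical.

```python
from bisect import bisect_right


def annotate_tags(tags, features):
    """Assign each tag the name of the feature containing its position.

    tags:     list of (chrom, pos)
    features: list of (chrom, start, end, name), assumed non-overlapping
              within a chromosome, with inclusive start/end.
    Returns a list of feature names (or None), one per tag.
    """
    # Group features by chromosome and sort each group by start: O(m log m).
    by_chrom = {}
    for chrom, start, end, name in features:
        by_chrom.setdefault(chrom, []).append((start, end, name))
    starts = {}
    for chrom, feats in by_chrom.items():
        feats.sort()
        starts[chrom] = [start for start, _, _ in feats]

    # Binary-search each tag against the sorted starts: O(n log m).
    annotations = []
    for chrom, pos in tags:
        feats = by_chrom.get(chrom, [])
        i = bisect_right(starts.get(chrom, []), pos) - 1
        if i >= 0 and feats[i][0] <= pos <= feats[i][1]:
            annotations.append(feats[i][2])
        else:
            annotations.append(None)
    return annotations


# Hypothetical reference features and tag positions.
features = [
    ("chr1", 100, 200, "GENE_A"),
    ("chr1", 300, 400, "GENE_B"),
    ("chr2", 50, 80, "GENE_C"),
]
tags = [("chr1", 150), ("chr1", 250), ("chr1", 300), ("chr2", 80), ("chr3", 10)]
result = annotate_tags(tags, features)
```

Because the reference is indexed only by position, re-annotating against an updated genome or transcriptome means rebuilding the sorted lists, not realigning the tags. Overlapping features would need an interval tree or a sweep over sorted endpoints instead.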

3.
BMC Bioinformatics ; 10 Suppl 9: S10, 2009 Sep 17.
Article in English | MEDLINE | ID: mdl-19761564

ABSTRACT

BACKGROUND: This study describes a large-scale manual re-annotation of data samples in the Gene Expression Omnibus (GEO), using variables and values derived from the National Cancer Institute thesaurus. A framework is described for creating an annotation scheme for various diseases that is flexible, comprehensive, and scalable. The annotation structure is evaluated by measuring coverage and agreement between annotators. RESULTS: There were 12,500 samples annotated with approximately 30 variables in each of six disease categories: breast cancer, colon cancer, inflammatory bowel disease (IBD), rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), and Type 1 diabetes mellitus (DM). The annotators provided excellent variable coverage, with known values for over 98% of three critical variables: disease state, tissue, and sample type. There was 89% strict inter-annotator agreement and 92% agreement when using semantic and partial similarity measures. CONCLUSION: We show that it is possible to perform manual re-annotation of a large repository in a reliable manner.
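
Strict inter-annotator agreement, as reported above, can be read as the fraction of items for which two annotators recorded exactly the same value. The sketch below computes that fraction and accepts an optional comparison function for relaxed matching. The abstract does not define its semantic and partial similarity measures, so the case-insensitive comparison shown is only a stand-in, and all names and values are hypothetical.

```python
def agreement(values_a, values_b, same=lambda x, y: x == y):
    """Fraction of paired annotations that `same` considers a match.

    With the default comparison this is strict (exact-match) agreement.
    """
    if len(values_a) != len(values_b):
        raise ValueError("annotators must label the same number of items")
    if not values_a:
        raise ValueError("no annotations to compare")
    matches = sum(1 for x, y in zip(values_a, values_b) if same(x, y))
    return matches / len(values_a)


def loosely_equal(x, y):
    """Hypothetical relaxed match: ignore case and surrounding whitespace."""
    return x.strip().lower() == y.strip().lower()


# Hypothetical tissue annotations from two annotators for four samples.
annotator_1 = ["breast", "colon", "Breast", "blood"]
annotator_2 = ["breast", "colon", "breast ", "synovium"]

strict = agreement(annotator_1, annotator_2)
relaxed = agreement(annotator_1, annotator_2, same=loosely_equal)
```

A relaxed measure can never score below the strict one as long as it accepts every exact match, which is consistent with the direction of the two figures in the abstract.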


Subjects
Computational Biology/methods; Databases, Genetic; Gene Expression Profiling; Algorithms; Information Storage and Retrieval
4.
BMC Bioinformatics ; 10 Suppl 9: S9, 2009 Sep 17.
Article in English | MEDLINE | ID: mdl-19761579

ABSTRACT

BACKGROUND: Large repositories of biomedical research data are most useful to translational researchers if their data can be aggregated for efficient queries and analyses. However, annotations describing important sample details, such as name of tissue or cell line, histopathological type, and subject characteristics like demographics, treatment, and survival, are often inconsistent or missing in data repositories, making it difficult to aggregate data. RESULTS: We created a flexible software tool that allows efficient annotation of samples using a controlled vocabulary, and report on its use for the annotation of over 12,500 samples. CONCLUSION: While the amount of data is very large and seemingly poorly annotated, a lot of information is still within reach. Consistent tool-based re-annotation enables many new possibilities for large-scale interpretation and analyses that would otherwise be impossible.
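
The idea of annotating against a controlled vocabulary can be sketched as a validation step that rejects variables or values outside an agreed term list. This is an illustration only, not the tool described in the abstract; the vocabulary contents, the variable names, and the function `validate_annotation` are all hypothetical.

```python
# Hypothetical controlled vocabulary: variable -> allowed values.
VOCABULARY = {
    "disease_state": {"breast cancer", "colon cancer", "normal"},
    "sample_type": {"tissue", "cell line"},
}


def validate_annotation(annotation, vocabulary):
    """Return a list of problems found in one sample annotation.

    An empty list means every variable is known and every value is
    drawn from that variable's allowed terms.
    """
    problems = []
    for variable, value in annotation.items():
        allowed = vocabulary.get(variable)
        if allowed is None:
            problems.append(f"unknown variable: {variable}")
        elif value not in allowed:
            problems.append(f"invalid value for {variable}: {value}")
    return problems


good = validate_annotation(
    {"disease_state": "breast cancer", "sample_type": "tissue"}, VOCABULARY
)
bad = validate_annotation(
    {"disease_state": "Breast CA", "organ": "breast"}, VOCABULARY
)
```

Rejecting free-text variants such as "Breast CA" at entry time is what makes later aggregation across studies a matter of exact matching.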


Subjects
Computational Biology/methods; Gene Expression Profiling/methods; Software; Databases, Genetic; Vocabulary, Controlled
5.
Proteomics ; 7(17): 3051-4, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17683051

ABSTRACT

De novo peptide sequencing algorithms are often tested on relatively small data sets made of excellent spectra. Because ever more tandem mass spectra are becoming available, we have assembled six large, reliable, and diverse data sets (covering three mass spectrometer types) intended for such tests, and we make them accessible via a web server. To exemplify their use, we investigate the performance of Lutefisk, PepNovo, and PepNovoTag, three well-established de novo peptide sequencing programs.


Subjects
Algorithms; Peptide Fragments/chemistry; Proteomics/methods; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Amino Acid Sequence; Animals; Computational Biology/methods; Humans; Tandem Mass Spectrometry/methods