1.
Article in English | MEDLINE | ID: mdl-20150671

ABSTRACT

The correct interpretation of many molecular biology experiments depends in an essential way on the accuracy and consistency of the existing annotation databases. Such databases are meant to act as repositories for our biological knowledge as we acquire and refine it. Hence, by definition, they are incomplete at any given time. In this paper, we describe a technique that improves our previous method for predicting novel Gene Ontology (GO) annotations by extracting implicit semantic relationships between genes and functions. In this work, we use a vector space model and a number of weighting schemes in addition to our previous latent semantic indexing approach. The technique described here takes into consideration the hierarchical structure of the GO and can assign different weights to GO terms situated at different depths. The prediction abilities of 15 different weighting schemes are compared and evaluated. Nine of these schemes were previously used in other problem domains, while six are introduced in this paper. The best weighting scheme was a novel scheme, n2tn. Of the top 50 functional annotations predicted using this weighting scheme, 84 percent were supported by the literature, while 6 percent were contradicted by it. For the remaining 10 percent, we did not find any relevant publications to confirm or contradict the predictions. The n2tn weighting scheme also outperformed the simple binary scheme used in our previous approach.


Subjects
Data Mining/methods , Database Management Systems , Databases, Genetic , Genes/genetics , Proteins/classification , Proteins/genetics , Semantics , Algorithms , Chromosome Mapping/methods , Genome, Human/genetics , Humans
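
The vector space idea in the abstract above can be illustrated with a small sketch. It is not the authors' method: the abstract does not define the n2tn scheme, so a plain inverse-document-frequency weight stands in for it, and the GO hierarchy and term depth are ignored. The gene and term identifiers, the `idf_weights` and `suggest_terms` helpers, and the voting rule are all hypothetical.

```python
import math

# Toy gene-to-GO-term annotations (hypothetical identifiers, not real data).
annotations = {
    "geneA": {"GO:1", "GO:2"},
    "geneB": {"GO:1", "GO:2", "GO:3"},
    "geneC": {"GO:4"},
}

def idf_weights(annotations):
    """Weight each term by log(N / number of genes annotated with it).

    Assumption: plain IDF is used as a stand-in weighting scheme.
    """
    n_genes = len(annotations)
    counts = {}
    for terms in annotations.values():
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    return {term: math.log(n_genes / c) for term, c in counts.items()}

def gene_vector(terms, weights):
    """Sparse weighted vector for one gene: {term: weight}."""
    return {t: weights[t] for t in terms}

def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is zero)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def suggest_terms(gene, annotations, weights):
    """Rank terms the gene lacks by similarity-weighted votes from other genes."""
    target = gene_vector(annotations[gene], weights)
    scores = {}
    for other, terms in annotations.items():
        if other == gene:
            continue
        sim = cosine(target, gene_vector(terms, weights))
        for t in terms - annotations[gene]:
            scores[t] = scores.get(t, 0.0) + sim
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

weights = idf_weights(annotations)
ranking = suggest_terms("geneA", annotations, weights)
```

In this toy data, geneB shares two terms with geneA, so the term only geneB carries ranks above the term carried by the unrelated geneC. A depth-aware scheme would replace `idf_weights` with a function that also looks at each term's position in the ontology.
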
2.
Genome Res ; 17(10): 1537-45, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17785539

ABSTRACT

A common challenge in the analysis of genomics data is trying to understand the underlying phenomenon in the context of all complex interactions taking place on various signaling pathways. A statistical approach using various models is universally used to identify the most relevant pathways in a given experiment. Here, we show that the existing pathway analysis methods fail to take into consideration important biological aspects and may provide incorrect results in certain situations. By using a systems biology approach, we developed an impact analysis that includes the classical statistics but also considers other crucial factors such as the magnitude of each gene's expression change, the genes' type and position in the given pathways, their interactions, etc. The impact analysis is an attempt at a deeper level of statistical analysis, informed by more pathway-specific biology than the existing techniques. On several illustrative data sets, the classical analysis produces both false positives and false negatives, while the impact analysis provides biologically meaningful results. This analysis method has been implemented as a Web-based tool, Pathway-Express, freely available as part of the Onto-Tools (http://vortex.cs.wayne.edu).


Subjects
Genomics/methods , Systems Biology/methods , Adenocarcinoma/genetics , Blood Coagulation/drug effects , Blood Coagulation/genetics , Breast Neoplasms/genetics , Cell Line , Complement Activation/drug effects , Complement Activation/genetics , Databases, Genetic , Female , Focal Adhesions/genetics , Gene Expression Profiling , Genomics/statistics & numerical data , Hepatocytes/drug effects , Hepatocytes/metabolism , Humans , Lung Neoplasms/genetics , Palmitic Acid/pharmacology , Software , Systems Biology/statistics & numerical data
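
The abstract above does not say which statistical model the "classical" analysis uses. One common choice is a hypergeometric over-representation test, and the sketch below assumes that choice. It covers only this classical component, not the impact analysis itself, and the function name and parameters are hypothetical.

```python
from math import comb

def hypergeom_pvalue(n_total, n_pathway, n_de, n_overlap):
    """P(X >= n_overlap) under a hypergeometric model.

    n_total   : genes measured in the experiment
    n_pathway : measured genes that belong to the pathway
    n_de      : differentially expressed genes
    n_overlap : differentially expressed genes found on the pathway
    """
    upper = min(n_pathway, n_de)
    total = comb(n_total, n_de)
    # Sum the upper tail of the distribution; comb returns 0 for
    # impossible draws, so no extra bounds check is needed.
    tail = sum(
        comb(n_pathway, k) * comb(n_total - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 4 measured genes, 2 on the pathway, 2 changed,
# both changed genes on the pathway.
p = hypergeom_pvalue(4, 2, 2, 2)
```

A test of this kind looks only at how many changed genes fall on a pathway. It ignores the size of each expression change and where the gene sits in the pathway, which are the factors the abstract says the impact analysis adds.
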
3.
Nucleic Acids Res ; 33(Web Server issue): W762-5, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980579

ABSTRACT

The Onto-Tools suite is composed of an annotation database and six seamlessly integrated, web-accessible data mining tools: Onto-Express, Onto-Compare, Onto-Design, Onto-Translate, Onto-Miner and Pathway-Express. The Onto-Tools database has been expanded to include various types of data from 12 new databases. Our database now integrates different types of genomic data from 19 sequence, gene, protein and annotation databases. The database has also been expanded to include complete Gene Ontology (GO) annotations. Using the enhanced database and GO annotations, Onto-Express now allows functional profiling for 24 organisms and supports 17 different types of input IDs. Onto-Translate has also been enhanced to fully utilize the capabilities of the new Onto-Tools database, with the ultimate goal of providing users with a non-redundant and complete mapping from any type of identification system to any other type. Currently, Onto-Translate allows arbitrary mappings between 29 types of IDs. Pathway-Express is a new tool that helps users find the most interesting pathways for their input list of genes. Onto-Tools are freely available at http://vortex.cs.wayne.edu/Projects.html.


Subjects
Databases, Genetic , Genes , Software , Animals , Gene Expression Profiling , Genomics , Humans , Internet , Mice , Oligonucleotide Array Sequence Analysis , Rats , Systems Integration , User-Computer Interface
4.
Bioinformatics ; 21(16): 3416-21, 2005 Aug 15.
Article in English | MEDLINE | ID: mdl-15955782

ABSTRACT

The correct interpretation of any biological experiment depends in an essential way on the accuracy and consistency of the existing annotation databases. Such databases are ubiquitous and used by all life scientists in most experiments. However, it is well known that such databases are incomplete and many annotations may also be incorrect. In this paper we describe a technique that can be used to analyze the semantic content of such annotation databases. Our approach is able to extract implicit semantic relationships between genes and functions. This ability allows us to discover novel functions for known genes. This approach is able to identify missing and inaccurate annotations in existing annotation databases, and thus help improve their accuracy. We used our technique to analyze the current annotations of the human genome. From this body of annotations, we were able to predict 212 additional gene-function assignments. A subsequent literature search found that 138 of these gene-function assignments are supported by existing peer-reviewed papers. An additional 23 assignments have been confirmed in the meantime by the addition of the respective annotations in later releases of the Gene Ontology database. Overall, the 161 confirmed assignments represent 75.95% of the proposed gene-function assignments. Only one of our predictions (0.4%) was contradicted by the existing literature. We could not find any relevant articles for 50 of our predictions (23.58%). The method is independent of the organism and can be used to analyze and improve the quality of the data of any public or private annotation database.


Subjects
Chromosome Mapping/methods , Databases, Genetic , Documentation/methods , Information Storage and Retrieval/methods , Natural Language Processing , Proteins/classification , Proteins/metabolism , Database Management Systems , Genome, Human , Humans , Proteins/chemistry , Proteins/genetics , Semantics , Structure-Activity Relationship , Vocabulary, Controlled
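
The abstract above does not name the algorithm used to extract implicit gene-function relationships. The sketch below assumes a latent-semantic-indexing style analysis, meaning a truncated singular value decomposition of a gene-by-term matrix. This is one plausible reading, not a confirmed description of the authors' implementation. The toy matrix, the rank `k`, the threshold, and the `low_rank` and `candidates` helpers are all hypothetical.

```python
import numpy as np

# Rows are genes, columns are annotation terms; 1 marks an existing annotation.
# Toy matrix with hypothetical content, not real annotation data.
genes = ["g1", "g2", "g3", "g4"]
terms = ["t1", "t2", "t3", "t4"]
A = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],  # g2 resembles g1 and g4 but lacks t3
    [0, 0, 0, 1],
    [1, 1, 1, 0],
], dtype=float)

def low_rank(A, k):
    """Best rank-k approximation of A from its singular value decomposition."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def candidates(A, k, threshold):
    """Unannotated (gene, term) pairs whose reconstructed score reaches the threshold."""
    A_k = low_rank(A, k)
    rows, cols = np.where((A == 0) & (A_k >= threshold))
    found = [(genes[i], terms[j], float(A_k[i, j])) for i, j in zip(rows, cols)]
    return sorted(found, key=lambda item: -item[2])

predicted = candidates(A, k=2, threshold=0.5)
```

Here the low-rank reconstruction assigns a sizeable score to the pair (g2, t3), because g2 shares its other annotations with genes that carry t3. In practice `k` and the threshold would need tuning, and candidates flagged this way would still need the kind of literature check the abstract describes.
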