Search | BVS Regional Portal (test)

The Comparative Toxicogenomics Database's 10th year anniversary: update 2015.

Davis, Allan Peter; Grondin, Cynthia J; Lennon-Hopkins, Kelley; Saraceni-Richards, Cynthia; Sciaky, Daniela; King, Benjamin L; Wiegers, Thomas C; Mattingly, Carolyn J.

Nucleic Acids Res ; 43(Database issue): D914-20, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25326323

ABSTRACT

Ten years ago, the Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) was developed out of a need to formalize, harmonize and centralize the information on numerous genes and proteins responding to environmental toxic agents across diverse species. CTD's initial approach was to facilitate comparisons of nucleotide and protein sequences of toxicologically significant genes by curating these sequences and electronically annotating them with chemical terms from their associated references. Since then, however, CTD has vastly expanded its scope to robustly represent a triad of chemical-gene, chemical-disease and gene-disease interactions that are manually curated from the scientific literature by professional biocurators using controlled vocabularies, ontologies and structured notation. Today, CTD includes 24 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, phenotypes, Gene Ontology annotations, pathways and interaction modules. In this 10th year anniversary update, we outline the evolution of CTD, including our increased data content, new 'Pathway View' visualization tool, enhanced curation practices, pilot chemical-phenotype results and impending exposure data set. The prototype database originally described in our first report has transformed into a sophisticated resource used actively today to help scientists develop and test hypotheses about the etiologies of environmentally influenced diseases.

Subjects

Databases, Chemical ; Toxicogenetics ; Databases, Chemical/history ; Disease/etiology ; Disease/genetics ; Genomics/history ; History, 21st Century ; Internet ; Phenotype ; Toxicogenetics/history

A CTD-Pfizer collaboration: manual curation of 88,000 scientific articles text mined for drug-disease and drug-phenotype interactions.

Davis, Allan Peter; Wiegers, Thomas C; Roberts, Phoebe M; King, Benjamin L; Lay, Jean M; Lennon-Hopkins, Kelley; Sciaky, Daniela; Johnson, Robin; Keating, Heather; Greene, Nigel; Hernandez, Robert; McConnell, Kevin J; Enayetallah, Ahmed E; Mattingly, Carolyn J.

Database (Oxford) ; 2013: bat080, 2013.

Article in English | MEDLINE | ID: mdl-24288140

ABSTRACT

Improving the prediction of chemical toxicity is a goal common to both environmental health research and pharmaceutical drug development. To improve safety detection assays, it is critical to have a reference set of molecules with well-defined toxicity annotations for training and validation purposes. Here, we describe a collaboration between safety researchers at Pfizer and the research team at the Comparative Toxicogenomics Database (CTD) to text mine and manually review a collection of 88,629 articles relating over 1,200 pharmaceutical drugs to their potential involvement in cardiovascular, neurological, renal and hepatic toxicity. In 1 year, CTD biocurators curated 254,173 toxicogenomic interactions (152,173 chemical-disease, 58,572 chemical-gene, 5,345 gene-disease and 38,083 phenotype interactions). All chemical-gene-disease interactions are fully integrated with public CTD, and phenotype interactions can be downloaded. We describe Pfizer's text-mining process to collate the articles, and CTD's curation strategy, performance metrics, enhanced data content and new module to curate phenotype information. As well, we show how data integration can connect phenotypes to diseases. This curation can be leveraged for information about toxic endpoints important to drug safety and help develop testable hypotheses for drug-disease events. The availability of these detailed, contextualized, high-quality annotations curated from seven decades' worth of the scientific literature should help facilitate new mechanistic screening assays for pharmaceutical compound survival. This unique partnership demonstrates the importance of resource sharing and collaboration between public and private entities and underscores the complementary needs of the environmental health science and pharmaceutical communities. Database URL: http://ctdbase.org/

Subjects

Cooperative Behavior ; Data Mining ; Databases, Factual ; Drug Industry ; Pharmaceutical Preparations/metabolism ; Publications ; Toxicogenetics ; Disease ; Humans ; Phenotype

Text mining effectively scores and ranks the literature for improving chemical-gene-disease curation at the comparative toxicogenomics database.

Davis, Allan Peter; Wiegers, Thomas C; Johnson, Robin J; Lay, Jean M; Lennon-Hopkins, Kelley; Saraceni-Richards, Cynthia; Sciaky, Daniela; Murphy, Cynthia Grondin; Mattingly, Carolyn J.

PLoS One ; 8(4): e58201, 2013.

Article in English | MEDLINE | ID: mdl-23613709

ABSTRACT

The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is a public resource that curates interactions between environmental chemicals and gene products, and their relationships to diseases, as a means of understanding the effects of environmental chemicals on human health. CTD provides a triad of core information in the form of chemical-gene, chemical-disease, and gene-disease interactions that are manually curated from scientific articles. To increase the efficiency, productivity, and data coverage of manual curation, we have leveraged text mining to help rank and prioritize the triaged literature. Here, we describe our text-mining process that computes and assigns each article a document relevancy score (DRS), wherein a high DRS suggests that an article is more likely to be relevant for curation at CTD. We evaluated our process by first text mining a corpus of 14,904 articles triaged for seven heavy metals (cadmium, cobalt, copper, lead, manganese, mercury, and nickel). Based upon initial analysis, a representative subset corpus of 3,583 articles was then selected from this corpus and sent to five CTD biocurators for review. The resulting curation of these 3,583 articles was analyzed for a variety of parameters, including article relevancy, novel data content, interaction yield rate, mean average precision, and biological and toxicological interpretability. We show that for all measured parameters, the DRS is an effective indicator for scoring and improving the ranking of literature for the curation of chemical-gene-disease information at CTD. Here, we demonstrate how fully incorporating text mining-based DRS scoring into our curation pipeline enhances manual curation by prioritizing more relevant articles, thereby increasing data content, productivity, and efficiency.
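
Mean average precision, one of the evaluation parameters named in this abstract, can be illustrated with a minimal sketch. The abstract does not describe how the DRS itself is computed, so the scores below are made-up placeholders, and the function names (`rank_by_score`, `average_precision`, `mean_average_precision`) are hypothetical helpers, not part of any CTD tool. The sketch only shows how a score-ranked list would be evaluated once curators have marked each article as relevant or not.

```python
from typing import List, Sequence, Tuple


def rank_by_score(articles: Sequence[Tuple[float, bool]]) -> List[bool]:
    """Sort (score, is_relevant) pairs by descending score; return the relevance flags."""
    ordered = sorted(articles, key=lambda item: item[0], reverse=True)
    return [is_relevant for _, is_relevant in ordered]


def average_precision(relevance: Sequence[bool]) -> float:
    """Average of precision@k taken at each rank k where a relevant article appears."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of average precision over several ranked lists (e.g. one per chemical)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Hypothetical toy data: (placeholder score, curator judged relevant?)
toy_corpus = [(0.2, False), (0.9, True), (0.5, True), (0.7, False)]
ranked = rank_by_score(toy_corpus)  # [True, False, True, False]
print(average_precision(ranked))
```

A ranking that places relevant articles near the top yields a value close to 1, which is the sense in which a score such as the DRS can be judged an effective prioritization signal.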

Subjects

Data Mining/methods ; Databases, Factual ; Disease/genetics ; Molecular Sequence Annotation ; Publications ; Toxicogenetics ; Algorithms ; Documentation ; Humans ; Metals, Heavy/toxicity ; Reproducibility of Results

The Comparative Toxicogenomics Database: update 2013.

Davis, Allan Peter; Murphy, Cynthia Grondin; Johnson, Robin; Lay, Jean M; Lennon-Hopkins, Kelley; Saraceni-Richards, Cynthia; Sciaky, Daniela; King, Benjamin L; Rosenstein, Michael C; Wiegers, Thomas C; Mattingly, Carolyn J.

Nucleic Acids Res ; 41(Database issue): D1104-14, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23093600

ABSTRACT

The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between environmental chemicals and gene products and their relationships to diseases. Chemical-gene, chemical-disease and gene-disease interactions manually curated from the literature are integrated to generate expanded networks and predict many novel associations between different data types. CTD now contains over 15 million toxicogenomic relationships. To navigate this sea of data, we added several new features, including DiseaseComps (which finds comparable diseases that share toxicogenomic profiles), statistical scoring for inferred gene-disease and pathway-chemical relationships, filtering options for several tools to refine user analysis and our new Gene Set Enricher (which provides biological annotations that are enriched for gene sets). To improve data visualization, we added a Cytoscape Web view to our ChemComps feature, included color-coded interactions and created a 'slim list' for our MEDIC disease vocabulary (allowing diseases to be grouped for meta-analysis, visualization and better data management). CTD continues to promote interoperability with external databases by providing content and cross-links to their sites. Together, this wealth of expanded chemical-gene-disease data, combined with novel ways to analyze and view content, continues to help users generate testable hypotheses about the molecular mechanisms of environmental diseases.

Subjects

Databases, Chemical ; Toxicogenetics ; Computer Graphics ; Disease/genetics ; Internet ; Software

Targeted journal curation as a method to improve data currency at the Comparative Toxicogenomics Database.

Davis, Allan Peter; Johnson, Robin J; Lennon-Hopkins, Kelley; Sciaky, Daniela; Rosenstein, Michael C; Wiegers, Thomas C; Mattingly, Carolyn J.

Database (Oxford) ; 2012: bas051, 2012.

Article in English | MEDLINE | ID: mdl-23221299

ABSTRACT

The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding about the effects of environmental chemicals on human health. CTD biocurators read the scientific literature and manually curate a triad of chemical-gene, chemical-disease and gene-disease interactions. Typically, articles for CTD are selected using a chemical-centric approach by querying PubMed to retrieve a corpus containing the chemical of interest. Although this technique ensures adequate coverage of knowledge about the chemical (i.e. data completeness), it does not necessarily reflect the most current state of all toxicological research in the community at large (i.e. data currency). Keeping databases current with the most recent scientific results, as well as providing a rich historical background from legacy articles, is a challenging process. To address this issue of data currency, CTD designed and tested a journal-centric approach of curation to complement our chemical-centric method. We first identified priority journals based on defined criteria. Next, over 7 weeks, three biocurators reviewed 2,425 articles from three consecutive years (2009-2011) of three targeted journals. From this corpus, 1,252 articles contained relevant data for CTD and 52,752 interactions were manually curated. Here, we describe our journal selection process, two methods of document delivery for the biocurators and the analysis of the resulting curation metrics, including data currency, and both intra-journal and inter-journal comparisons of research topics. Based on our results, we expect that curation by select journals can (i) be easily incorporated into the curation pipeline to complement our chemical-centric approach; (ii) build content more evenly for chemicals, genes and diseases in CTD (rather than biasing data by chemicals-of-interest); (iii) reflect developing areas in environmental health and (iv) improve overall data currency for chemicals, genes and diseases.
Database URL: http://ctdbase.org/
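
The curation figures reported in this abstract can be turned into simple rates. The following is a small worked calculation, not an analysis from the paper: the input counts come from the abstract, while the derived metrics (and the even split of workload across curators and weeks) are assumptions made purely for illustration.

```python
# Counts taken from the abstract above.
ARTICLES_REVIEWED = 2425
ARTICLES_RELEVANT = 1252
INTERACTIONS_CURATED = 52752
BIOCURATORS = 3
WEEKS = 7

# Share of reviewed articles that contained curatable data.
relevance_rate = ARTICLES_RELEVANT / ARTICLES_REVIEWED

# Mean number of interactions per relevant article.
interactions_per_relevant_article = INTERACTIONS_CURATED / ARTICLES_RELEVANT

# Assumption: workload spread evenly over curators and weeks.
articles_per_curator_week = ARTICLES_REVIEWED / (BIOCURATORS * WEEKS)

print(f"relevant articles: {relevance_rate:.1%}")
print(f"interactions per relevant article: {interactions_per_relevant_article:.1f}")
print(f"articles per curator per week: {articles_per_curator_week:.1f}")
```

Roughly half of the reviewed articles were relevant, and each relevant article contributed about 42 interactions on average.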

Subjects

Data Mining/methods ; Databases, Genetic ; Periodicals as Topic ; Toxicogenetics ; Environmental Health ; Genes ; Humans ; Molecular Sequence Annotation

Analysis of gene ontology features in microarray data using the Proteome BioKnowledge Library.

Johnson, Robin J; Williams, Jennifer M; Schreiber, Barbara M; Elfe, Charles D; Lennon-Hopkins, Kelley L; Skrzypek, Marek S; White, Renee D.

In Silico Biol ; 5(4): 389-99, 2005.

Article in English | MEDLINE | ID: mdl-16268783

ABSTRACT

Microarray technology has resulted in an explosion of complex, valuable data. Integrating data analysis tools with a comprehensive underlying database would allow efficient identification of common properties among differentially regulated genes. In this study we sought to compare the utility of various databases in microarray analysis. The Proteome BioKnowledge Library (BKL), a manually curated, proteome-wide compilation of the scientific literature, was used to generate a list of Gene Ontology (GO) Biological Process (BP) terms enriched among proteins involved in cardiovascular disease. Analysis of DNA microarray data generated in a study of rat vascular smooth muscle cell responses revealed significant enrichment in a number of GO BPs that were also enriched among cardiovascular disease-related proteins. Using annotation from LocusLink and chip annotation from the Gene Expression Omnibus yielded fewer enriched cardiovascular disease-associated GO BP terms. Data sets of orthologous genes from mouse and human were generated using the BKL Retriever. Analysis of these sets, focusing on BKL Disease annotation, revealed a significant association of these genes with cardiovascular disease. These results, and the extensive presence of experimental evidence for BKL GO and Disease features, underscore the benefits of using this database for microarray analysis.
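
The abstract does not state which statistic was used to call a GO term "enriched". As an illustration only, the sketch below uses a one-sided hypergeometric test, a common choice for this kind of analysis; the function name `hypergeom_enrichment_p` and all numbers in the usage example are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           sample: int, overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    population: genes on the array (background)
    annotated:  background genes carrying the GO term
    sample:     differentially regulated genes
    overlap:    regulated genes carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 5000 genes on the array, 100 annotated with a term,
# 200 regulated genes, 12 of which carry the term.
p_value = hypergeom_enrichment_p(5000, 100, 200, 12)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice such p-values are computed for many terms at once, so a multiple-testing correction would normally follow; that step is omitted here.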

Subjects

Databases, Factual ; Oligonucleotide Array Sequence Analysis ; Animals ; Cardiovascular Diseases/genetics ; Databases, Genetic ; Gene Expression Profiling ; Gene Expression Regulation ; Humans ; Information Storage and Retrieval ; Mice ; Proteome ; Rats
