1.
BMC Med Inform Decis Mak ; 20(Suppl 10): 273, 2020 12 15.
Article in English | MEDLINE | ID: mdl-33319703

ABSTRACT

BACKGROUND: The National Cancer Institute (NCI) Thesaurus provides reference terminology for NCI and other systems. Previously, we proposed a hybrid prototype utilizing lexical features and role definitions of concepts in non-lattice subgraphs to identify missing IS-A relations in the NCI Thesaurus. However, no domain expert evaluation was provided in our previous work. In this paper, we further enhance the hybrid approach by leveraging a novel lexical feature: the roots of noun chunks within concept names. Formal evaluation of our enhanced approach is also performed. METHOD: We first compute all the non-lattice subgraphs in the NCI Thesaurus. We model each concept using its role definitions and the words and roots of noun chunks within its concept name and its ancestors' names. Then we perform subsumption testing for candidate concept pairs in the non-lattice subgraphs to automatically detect potentially missing IS-A relations. Domain experts evaluated the validity of these relations. RESULTS: We applied our approach to the 19.08d version of the NCI Thesaurus. A total of 55 potentially missing IS-A relations were identified by our approach and reviewed by domain experts. Of these, 29 were confirmed as valid by domain experts and have been incorporated in the newer versions of the NCI Thesaurus. Seven of the 55 further revealed incorrect existing IS-A relations in the NCI Thesaurus. CONCLUSIONS: The results showed that our hybrid approach, which leverages lexical features and role definitions, is effective in identifying potentially missing IS-A relations in the NCI Thesaurus.


Subject(s)
Vocabulary, Controlled , Humans , National Cancer Institute (U.S.) , United States
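
The subsumption testing described in the abstract above can be illustrated with a very small sketch. This is not the authors' implementation: it assumes a simplified model in which a concept is represented only by the lowercase words of its name and its ancestors' names, and it omits role definitions and noun-chunk roots entirely. The function names `word_features` and `suggests_is_a` are hypothetical, as are the example concept names.

```python
def word_features(name, ancestor_names=()):
    """Collect the lowercase words of a concept name and its ancestors' names.

    Simplifying assumption: whitespace tokenization stands in for the
    richer lexical features (e.g. noun-chunk roots) named in the abstract.
    """
    words = set(name.lower().split())
    for ancestor in ancestor_names:
        words |= set(ancestor.lower().split())
    return words


def suggests_is_a(child_features, parent_features):
    """Return True if the candidate child carries every feature of the
    candidate parent plus at least one more (proper superset test)."""
    return parent_features < child_features


# Hypothetical concept names, for illustration only.
specific = word_features("Malignant Skin Neoplasm", ["Skin Neoplasm"])
general = word_features("Skin Neoplasm")
candidate_found = suggests_is_a(specific, general)
reverse_found = suggests_is_a(general, specific)
```

A pair flagged this way is only a candidate; as the abstract notes, domain experts still have to review each suggested relation.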
2.
JCO Clin Cancer Inform ; 4: 392-398, 2020 05.
Article in English | MEDLINE | ID: mdl-32374632

ABSTRACT

PURPOSE: To audit and improve the completeness of the hierarchic (or is-a) relations of the National Cancer Institute (NCI) Thesaurus to support its role as a faceted system for querying cancer registry data. METHODS: We performed quality auditing of the 19.01d version of the NCI Thesaurus. Our hybrid auditing method consisted of three main steps: computing nonlattice subgraphs, constructing lexical features for concepts in each subgraph, and performing subsumption reasoning with each subgraph to automatically suggest potentially missing is-a relations. RESULTS: A total of 9,512 nonlattice subgraphs were obtained. Our method identified 925 potentially missing is-a relations in 441 nonlattice subgraphs; 72 of 176 reviewed samples were confirmed as valid missing is-a relations and have been incorporated in the newer versions of the NCI Thesaurus. CONCLUSION: Autosuggested changes resulting from our auditing method can improve the structural organization of the NCI Thesaurus in supporting its new role for faceted query.


Subject(s)
Neoplasms , Vocabulary, Controlled , Humans , National Cancer Institute (U.S.) , Neoplasms/epidemiology , Registries , United States
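
The first step named in the abstract above, computing nonlattice subgraphs, starts from concept pairs that violate the lattice property. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes the hierarchy is a directed acyclic graph stored as a parent-to-children mapping, and it treats a pair as nonlattice when the two concepts share more than one maximal common descendant. All function names and the toy hierarchy are hypothetical, and the brute-force pairwise scan would not scale to a terminology the size of the NCI Thesaurus.

```python
from itertools import combinations


def descendants(children, node):
    """Return all strict descendants of node (children maps parent -> children)."""
    seen = set()
    stack = list(children.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def maximal_common_descendants(children, a, b):
    """Common descendants of a and b that lie below no other common descendant."""
    common = descendants(children, a) & descendants(children, b)
    return {
        c for c in common
        if not any(c in descendants(children, d) for d in common if d != c)
    }


def non_lattice_pairs(children):
    """List concept pairs with more than one maximal common descendant."""
    nodes = set(children) | {c for kids in children.values() for c in kids}
    return [
        (a, b)
        for a, b in combinations(sorted(nodes), 2)
        if len(maximal_common_descendants(children, a, b)) > 1
    ]


# Toy hierarchy: A and B both subsume C and D, and C and D are unrelated.
toy = {"A": ["C", "D"], "B": ["C", "D"]}
pairs = non_lattice_pairs(toy)

# Lattice-conforming variant: D sits under C, so C is the single maximal one.
chain = {"A": ["C"], "B": ["C"], "C": ["D"]}
chain_pairs = non_lattice_pairs(chain)
```

In the toy hierarchy, the pair (A, B) is reported because C and D are both maximal common descendants; in the second hierarchy no pair is reported.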
3.
Database (Oxford) ; 2012: bar066, 2012.
Article in English | MEDLINE | ID: mdl-22434834

ABSTRACT

The overall objective of the Mouse-Human Anatomy Project (MHAP) was to facilitate the mapping and harmonization of anatomical terms used for mouse and human models by Mouse Genome Informatics (MGI) and the National Cancer Institute (NCI). The anatomy resources designated for this study were the Adult Mouse Anatomy (MA) ontology and the set of anatomy concepts contained in the NCI Thesaurus (NCIt). Several methods and software tools were identified and evaluated, then used to conduct an in-depth comparative analysis of the anatomy ontologies. Matches between mouse and human anatomy terms were determined and validated, resulting in a highly curated set of mappings between the two ontologies that has been used by other resources. These mappings will enable linking of data from mouse and human. As the anatomy ontologies have been expanded and refined, the mappings have been updated accordingly. Insights are presented into the overall process of comparing and mapping between ontologies, which may prove useful for further comparative analyses and ontology mapping efforts, especially those involving anatomy ontologies. Finally, issues concerning further development of the ontologies, updates to the mapping files, and possible additional applications and significance were considered. DATABASE URL: http://obofoundry.org/cgi-bin/detail.cgi?id=ma2ncit.


Subject(s)
Anatomy/methods , Databases, Factual , Vocabulary, Controlled , Animals , Genomics , Humans , Mice , Reproducibility of Results
4.
J Biomed Inform ; 40(1): 30-43, 2007 Feb.
Article in English | MEDLINE | ID: mdl-16697710

ABSTRACT

Over the last 8 years, the National Cancer Institute (NCI) has launched a major effort to integrate molecular and clinical cancer-related information within a unified biomedical informatics framework, with controlled terminology as its foundational layer. The NCI Thesaurus is the reference terminology underpinning these efforts. It is designed to meet the growing need for accurate, comprehensive, and shared terminology, covering topics including: cancers, findings, drugs, therapies, anatomy, genes, pathways, cellular and subcellular processes, proteins, and experimental organisms. The NCI Thesaurus provides a partial model of how these things relate to each other, responding to actual user needs and implemented in a deductive logic framework that can help maintain the integrity and extend the informational power of what is provided. This paper presents the semantic model for cancer diseases and its uses in integrating clinical and molecular knowledge, more briefly examines the models and uses for drug, biochemical pathway, and mouse terminology, and discusses limits of the current approach and directions for future work.


Subject(s)
Biomedical Research/methods , Database Management Systems , Databases, Factual , Information Storage and Retrieval/methods , Neoplasms/classification , Neoplasms/physiopathology , Vocabulary, Controlled , Computational Biology/methods , Humans , National Institutes of Health (U.S.) , Neoplasm Proteins/metabolism , Semantics , Systems Integration , United States , User-Computer Interface
5.
BMC Med Inform Decis Mak ; 6: 25, 2006 Jun 20.
Article in English | MEDLINE | ID: mdl-16787533

ABSTRACT

BACKGROUND: The Cancer Biomedical Informatics Grid (caBIG) is a network of individuals and institutions, creating a world wide web of cancer research. An important aspect of this informatics effort is the development of consistent practices for data standards development, using a multi-tier approach that facilitates semantic interoperability of systems. The semantic tiers include (1) information models, (2) common data elements, and (3) controlled terminologies and ontologies. The College of American Pathologists (CAP) cancer protocols and checklists are an important reporting standard in pathology, for which no complete electronic data standard is currently available. METHODS: In this manuscript, we provide a case study of Cancer Common Ontologic Representation Environment (caCORE) data standard implementation of the CAP cancer protocols and checklists model--an existing and complex paper-based standard. We illustrate the basic principles, goals and methodology for developing caBIG models. RESULTS: Using this example, we describe the process required to develop the model, the technologies and data standards on which the process and models are based, and the results of the modeling effort. We address difficulties we encountered and modifications to caCORE that will address these problems. In addition, we describe four ongoing development projects that will use the emerging CAP data standards to achieve integration of tissue banking and laboratory information systems. CONCLUSION: The CAP cancer checklists can be used as the basis for an electronic data standard in pathology using the caBIG semantic modeling methodology.


Subject(s)
Database Management Systems , Internet , Medical Informatics , Medical Oncology/standards , Neoplasms/pathology , Pathology, Clinical/standards , Clinical Protocols , Humans , National Institutes of Health (U.S.) , Natural Language Processing , Neoplasms/classification , Semantics , Systems Integration , United States , User-Computer Interface , Vocabulary, Controlled
6.
Stud Health Technol Inform ; 107(Pt 1): 33-7, 2004.
Article in English | MEDLINE | ID: mdl-15360769

ABSTRACT

Cancer researchers need to be able to organize and report their results in a way that others can find, build upon, and relate to the specific clinical conditions of individual patients. NCI Thesaurus is a description logic terminology based on current science that helps individuals and software applications connect and organize the results of cancer research, e.g., by disease and underlying biology. Currently containing some 34,000 concepts--covering chemicals, drugs and other therapies, diseases, genes and gene products, anatomy, organisms, animal models, techniques, biologic processes, and administrative categories--NCI Thesaurus serves applications and the Web from a terminology server. As a scalable, formal terminology, the deployed Thesaurus, and associated applications and interfaces, are a model for some of the standards required for the NHII (National Health Information Infrastructure) and the Semantic Web.


Subject(s)
Neoplasms , Terminology as Topic , Computer Systems , Humans , Medical Oncology , National Institutes of Health (U.S.) , United States , Vocabulary, Controlled