ABSTRACT
OBJECTIVES: The purpose of this study was to evaluate the content coverage and data quality of the Clinical Data Dictionary (CiDD) developed by the Center for Interoperable EHR (CiEHR). METHODS: A total of 12,994 terms were collected from 98 clinical forms of a 500-bed tertiary cancer center hospital. After data cleaning, 9,418 terms were mapped to the data items of the CiDD by the research team, and the mappings were validated by 30 doctors and nurses at the research hospital. RESULTS: Mapping results were classified into five categories: lexically mapped; semantically mapped; mapped to either a broader term or a narrower term; mapped to more than one term; and not mapped. In terms of coverage, out of 9,418 terms, 6,750 (71.7%) were mapped lexically or semantically: 4,319 (45.9%) were lexically mapped and 2,431 (25.8%) were semantically mapped. In addition, 281 (3.0%) terms were mapped to a broader term, 43 (0.5%) to a narrower term, and 550 (5.8%) to more than one term. In terms of data quality, the CiDD has problems such as errors in concept naming and representation, redundancy in synonyms, inadequate synonyms, and ambiguity in meaning. CONCLUSIONS: Although the CiDD covers 72% of local clinical terms, it can be improved by cleaning up errors and redundancies, adding textual definitions or use cases for the concepts, and arranging the concepts in a hierarchy.