Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 2 de 2
Filter
Add more filters










Database
Language
Publication year range
1.
BMC Med Inform Decis Mak ; 23(1): 85, 2023 05 05.
Article in English | MEDLINE | ID: mdl-37147600

ABSTRACT

BACKGROUND: Epidemiological research may require linkage of information from multiple organizations. This can bring two problems: (1) the information governance desirability of linkage without sharing direct identifiers, and (2) a requirement to link databases without a common person-unique identifier. METHODS: We develop a Bayesian matching technique to solve both. We provide an open-source software implementation capable of de-identified probabilistic matching despite discrepancies, via fuzzy representations and complete mismatches, plus de-identified deterministic matching if required. We validate the technique by testing linkage between multiple medical records systems in a UK National Health Service Trust, examining the effects of decision thresholds on linkage accuracy. We report demographic factors associated with correct linkage. RESULTS: The system supports dates of birth (DOBs), forenames, surnames, three-state gender, and UK postcodes. Fuzzy representations are supported for all except gender, and there is support for additional transformations, such as accent misrepresentation, variation for multi-part surnames, and name re-ordering. Calculated log odds predicted a proband's presence in the sample database with an area under the receiver operating curve of 0.997-0.999 for non-self database comparisons. Log odds were converted to a decision via a consideration threshold θ and a leader advantage threshold δ. Defaults were chosen to penalize misidentification 20-fold versus linkage failure. By default, complete DOB mismatches were disallowed for computational efficiency. At these settings, for non-self database comparisons, the mean probability of a proband being correctly declared to be in the sample was 0.965 (range 0.931-0.994), and the misidentification rate was 0.00249 (range 0.00123-0.00429). Correct linkage was positively associated with male gender, Black or mixed ethnicity, and the presence of diagnostic codes for severe mental illnesses or other mental disorders, and negatively associated with birth year, unknown ethnicity, residential area deprivation, and presence of a pseudopostcode (e.g. indicating homelessness). Accuracy rates would be improved further if person-unique identifiers were also used, as supported by the software. Our two largest databases were linked in 44 min via an interpreted programming language. CONCLUSIONS: Fully de-identified matching with high accuracy is feasible without a person-unique identifier and appropriate software is freely available.


Subject(s)
Medical Record Linkage , Privacy , Humans , Male , Bayes Theorem , State Medicine , Software
2.
Front Psychiatry ; 12: 578298, 2021.
Article in English | MEDLINE | ID: mdl-34867492

ABSTRACT

CamCOPS is a free, open-source client-server system for secure data capture in the domain of psychiatry, psychology, and the clinical neurosciences. The client is a cross-platform C++ application, suitable for mobile and offline (disconnected) use. It allows touchscreen data entry by subjects/patients, researchers/clinicians, or both together. It implements a large and extensible range of tasks, from simple questionnaires to complex animated tasks. The client uses encrypted data storage and sends data via an encrypted network connection to a CamCOPS server. Individual institutional users set up and run their own CamCOPS server, so no data is transferred outside the hosting institution's control. The server, written in Python, provides clinically oriented and research-oriented views of tasks, including the tracking of changes over time. It provides an audit trail, export facilities (such as to an institution's primary electronic health record system), and full structured data access subject to authorization. A single CamCOPS server can support multiple research/clinical groups, each having its own identity policy (e.g., fully identifiable for clinical use; de-identified/pseudonymised for research use). Intellectual property rules regarding third-party tasks vary and CamCOPS has several mechanisms to support compliance, including for tasks that may be permitted to some institutions but not others. CamCOPS supports task scheduling and home testing via a simplified user interface. We describe the software, report local information governance approvals within part of the UK National Health Service, and describe illustrative clinical and research uses.

SELECTION OF CITATIONS
SEARCH DETAIL
...