ABSTRACT
Large datasets in paediatric oncology are inherently rare. It is therefore paramount to fully exploit all available data, which are distributed across several resources, including biomaterials, images, clinical trials, and registries. With privacy-preserving record linkage (PPRL), personalised or pseudonymised datasets can be merged without disclosing the patients' identities. Although PPRL is implemented in various settings, use case descriptions are currently fragmented and incomplete. The present paper provides a comprehensive overview of current and future use cases for PPRL in paediatric oncology. We analysed the literature, projects, and trial protocols, identified use cases along a hypothetical patient journey, and discussed use cases with paediatric oncology experts. To structure PPRL use cases, we defined six key dimensions: distributed personalised records, pseudonymisation, distributed pseudonymised records, record linkage, linked data, and data analysis. Selected use cases were described (a) per dimension and (b) on a multi-dimensional level. Although the focus is on paediatric oncology, most aspects also apply to other (particularly rare) diseases. We conclude that PPRL is a key concept in paediatric oncology. PPRL strategies should therefore be considered from the start of a research project, to avoid distributed data silos, to maximise the knowledge derived from collected data, and, ultimately, to improve outcomes for children with cancer.
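To make the six dimensions above more concrete, the following minimal Python sketch walks through them on invented data: two sources hold personalised records, each record is pseudonymised with a keyed hash, and the pseudonymised sources are then linked on matching pseudonyms. All names here (`SECRET_KEY`, `pseudonymise`, `to_pseudonymised`, `link`, the field names, and the example records) are hypothetical and not taken from the paper. The sketch assumes exact matching on normalised attributes; PPRL deployments commonly use error-tolerant encodings (for example, Bloom-filter-based ones) and a more careful key-management setup, which this illustration does not attempt to reproduce.

```python
import hashlib
import hmac

# Hypothetical secret shared by the data holders (or held by a trusted third party).
SECRET_KEY = b"example-key-not-for-production"


def pseudonymise(first_name, last_name, birth_date, key=SECRET_KEY):
    """Derive a pseudonym from normalised identifying attributes via HMAC-SHA256."""
    normalised = "|".join(s.strip().lower() for s in (first_name, last_name, birth_date))
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def to_pseudonymised(records):
    """Replace identifying fields with a pseudonym, keeping only the payload."""
    return {pseudonymise(r["first"], r["last"], r["dob"]): r["payload"] for r in records}


def link(source_a, source_b):
    """Join two pseudonymised sources on exactly matching pseudonyms."""
    return {p: (source_a[p], source_b[p]) for p in source_a.keys() & source_b.keys()}


# Distributed personalised records (invented example data).
trial = [
    {"first": "Ada", "last": "Example", "dob": "2015-03-01", "payload": {"arm": "A"}},
    {"first": "Ben", "last": "Sample", "dob": "2014-07-12", "payload": {"arm": "B"}},
]
biobank = [
    {"first": " ada", "last": "EXAMPLE", "dob": "2015-03-01", "payload": {"sample": "S-17"}},
    {"first": "Cy", "last": "Other", "dob": "2016-01-30", "payload": {"sample": "S-42"}},
]

# Pseudonymisation, record linkage, and the resulting linked data.
linked = link(to_pseudonymised(trial), to_pseudonymised(biobank))
print(len(linked))  # number of patients found in both sources
```

In this sketch, the linked data contain only pseudonyms and payloads, so a downstream analysis can combine trial and biobank information without access to names or dates of birth.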
ABSTRACT
Secondary use of data for research purposes is especially important in rare diseases (RD), since, by definition, data are sparse. The European Joint Programme on Rare Diseases (EJP RD) aims to develop an RD infrastructure that supports the secondary use of data. Significant amounts of RD data are (a) distributed and (b) available only in pseudonymised format. Privacy-Preserving Record Linkage (PPRL) concerns the linking of such distributed datasets without disclosing the participants' identities. We present a concept for linking a PPRL Service to the EJP RD Virtual Platform (VP). The Level 1 (resource discovery) connection is provided by running a FAIR Data Point (FDP) within the PPRL Service. On Level 2 (data discoverability), the PPRL Service can represent both an individual and a catalog endpoint. Our solution can count patients in PPRL-supporting resources, count duplicates only once, and count only patients registered in multiple resources. We are currently preparing the deployment within the EJP RD VP.
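The three counting modes mentioned above can be illustrated with a small Python sketch. It assumes that record linkage has already assigned each patient one linked pseudonym that is identical across resources; the resource names, pseudonyms, and function names are hypothetical, and the sketch shows one possible reading of the three counts rather than the PPRL Service's actual interface.

```python
from collections import Counter

# Hypothetical resources, each holding the linked pseudonyms of its patients.
resources = {
    "registry_a": {"p1", "p2", "p3"},
    "registry_b": {"p2", "p3", "p4"},
    "biobank_c": {"p3", "p5"},
}


def count_records(resources):
    """Count patients per resource and sum them (duplicates counted repeatedly)."""
    return sum(len(ids) for ids in resources.values())


def count_unique_patients(resources):
    """Count every patient once, even if registered in several resources."""
    return len(set().union(*resources.values()))


def count_multi_resource_patients(resources):
    """Count only patients registered in more than one resource."""
    occurrences = Counter(pid for ids in resources.values() for pid in ids)
    return sum(1 for n in occurrences.values() if n > 1)


print(count_records(resources))                  # 8
print(count_unique_patients(resources))          # 5
print(count_multi_resource_patients(resources))  # 2
```

The difference between the first two results (8 versus 5) is the overcount that arises when distributed resources are queried without linkage.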