Results 1 - 6 of 6
1.
J Priv Confid ; 12(2)2022 Nov 02.
Article in English | MEDLINE | ID: mdl-37860129

ABSTRACT

The stochastic block model (SBM) and degree-corrected block model (DCBM) are network models often selected as the fundamental setting in which to analyze the theoretical properties of community detection methods. We consider the problem of spectral clustering of SBM and DCBM networks under a local form of edge differential privacy. Using a randomized response privacy mechanism called the edge-flip mechanism, we develop theoretical guarantees for differentially private community detection, demonstrating conditions under which this strong privacy guarantee can be upheld while achieving spectral clustering convergence rates that match the known rates without privacy. We prove the strongest theoretical results are achievable for dense networks (those with node degree linear in the number of nodes), while weak consistency is achievable under mild sparsity (node degree greater than √n). We empirically demonstrate our results on a number of network examples.
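
The randomized-response idea behind the edge-flip mechanism can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common form of randomized response in which each potential edge of an undirected graph is flipped independently with probability 1/(1 + e^ε), and the function names (`flip_probability`, `edge_flip`, `debias`) are hypothetical. The de-biasing step is the standard correction for randomized response and is included only to show how a noisy adjacency matrix might be prepared for spectral clustering.

```python
import math
import random


def flip_probability(epsilon: float) -> float:
    """Probability of flipping one edge indicator (assumed form: 1 / (1 + e^eps))."""
    return 1.0 / (1.0 + math.exp(epsilon))


def edge_flip(adj, epsilon, rng):
    """Return a noisy copy of a symmetric 0/1 adjacency matrix.

    Each pair i < j is flipped independently with probability
    flip_probability(epsilon); the diagonal stays zero.
    """
    n = len(adj)
    p = flip_probability(epsilon)
    noisy = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            bit = adj[i][j]
            if rng.random() < p:
                bit = 1 - bit  # randomized response: report the opposite bit
            noisy[i][j] = noisy[j][i] = bit
    return noisy


def debias(noisy, epsilon):
    """Unbiased estimate of the original matrix (requires epsilon > 0).

    Since E[noisy_ij] = p + (1 - 2p) * a_ij, the estimate is
    (noisy_ij - p) / (1 - 2p) off the diagonal.
    """
    p = flip_probability(epsilon)
    n = len(noisy)
    return [
        [0.0 if i == j else (noisy[i][j] - p) / (1.0 - 2.0 * p) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    path_graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(edge_flip(path_graph, epsilon=1.0, rng=rng))
```

Smaller ε means a flip probability closer to 1/2 and therefore a noisier graph, which is the privacy/accuracy trade-off the abstract analyzes.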

2.
JMIR Med Inform ; 9(4): e21459, 2021 Apr 23.
Article in English | MEDLINE | ID: mdl-33890866

ABSTRACT

BACKGROUND: In clinical research, important variables may be collected from multiple data sources. Physical pooling of patient-level data from multiple sources often raises several challenges, including proper protection of patient privacy and proprietary interests. We previously developed an SAS-based package to perform distributed regression (a suite of privacy-protecting methods that perform multivariable-adjusted regression analysis using only summary-level information) with horizontally partitioned data, a setting where distinct cohorts of patients are available from different data sources. We integrated the package with PopMedNet, an open-source file transfer software, to facilitate secure file transfer between the analysis center and the data-contributing sites. The feasibility of using PopMedNet to facilitate distributed regression analysis (DRA) with vertically partitioned data, a setting where the data attributes from a cohort of patients are available from different data sources, was unknown. OBJECTIVE: The objective of the study was to describe the feasibility of using PopMedNet and enhancements to PopMedNet to facilitate automatable vertical DRA (vDRA) in real-world settings. METHODS: We gathered the statistical and informatic requirements of using PopMedNet to facilitate automatable vDRA. We enhanced PopMedNet based on these requirements to improve its technical capability to support vDRA. RESULTS: PopMedNet can enable automatable vDRA. We identified and implemented two enhancements to PopMedNet that improved its technical capability to perform automatable vDRA in real-world settings. The first was the ability to simultaneously upload and download multiple files, and the second was the ability to directly transfer summary-level information between the data-contributing sites without a third-party analysis center. CONCLUSIONS: PopMedNet can be used to facilitate automatable vDRA to protect patient privacy and support clinical research in real-world settings.
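
To make "regression analysis using only summary-level information" concrete for the horizontally partitioned case, here is a minimal sketch. It assumes ordinary least-squares linear regression, where each site shares only the sums XᵀX and Xᵀy and the analysis center adds them up and solves the normal equations. It is not the SAS package or PopMedNet described in the abstract, it does not cover the vertical (vDRA) setting, and all function names are hypothetical.

```python
def site_summaries(rows, ys):
    """Computed locally at one site: returns (X^T X, X^T y).

    rows: list of covariate vectors (include a leading 1.0 for the intercept).
    Only these aggregates leave the site, never patient-level rows.
    """
    k = len(rows[0])
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for x, y in zip(rows, ys):
        for a in range(k):
            xty[a] += x[a] * y
            for b in range(k):
                xtx[a][b] += x[a] * x[b]
    return xtx, xty


def combine(summaries):
    """At the analysis center: element-wise sum of the per-site aggregates."""
    k = len(summaries[0][1])
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for site_xtx, site_xty in summaries:
        for a in range(k):
            xty[a] += site_xty[a]
            for b in range(k):
                xtx[a][b] += site_xtx[a][b]
    return xtx, xty


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


if __name__ == "__main__":
    # Two sites hold different patients; the data follow y = 1 + 2x exactly.
    site_a = site_summaries([[1.0, 0.0], [1.0, 1.0]], [1.0, 3.0])
    site_b = site_summaries([[1.0, 2.0], [1.0, 3.0]], [5.0, 7.0])
    xtx, xty = combine([site_a, site_b])
    print(solve(xtx, xty))
```

Because the pooled XᵀX and Xᵀy are exactly the sums of the per-site values, the coefficients match what a pooled analysis would give for this model, without moving patient-level records.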

3.
AIDS ; 31 Suppl 1: S87-S94, 2017 04.
Article in English | MEDLINE | ID: mdl-28296804

ABSTRACT

OBJECTIVE: HIV prevalence data collected from routine HIV testing of pregnant women at antenatal clinics (ANC-RT) are potentially available from all facilities that offer testing services to pregnant women and can be used to improve estimates of national and subnational HIV prevalence trends. We develop methods to incorporate this new data source into the Joint United Nations Programme on AIDS Estimation and Projection Package in Spectrum 2017. METHODS: We develop a new statistical model for incorporating ANC-RT HIV prevalence data, aggregated either to the health facility level (site-level) or regionally (census-level), to estimate HIV prevalence alongside existing sources of HIV prevalence data from ANC unlinked anonymous testing (ANC-UAT) and household-based national population surveys. Synthetic data are generated to understand how the availability of ANC-RT data affects the accuracy of various parameter estimates. RESULTS: We estimate HIV prevalence and additional parameters using both ANC-RT and other existing data. Fitting HIV prevalence using synthetic data generally gives precise estimates of the underlying trend and other parameters. More years of ANC-RT data should improve prevalence estimates. More ANC-RT sites and continuation with existing ANC-UAT sites may improve the estimate of calibration between ANC-UAT and ANC-RT sites. CONCLUSION: We have proposed methods to incorporate ANC-RT data into Spectrum to obtain more precise estimates of prevalence and other measures of the epidemic. Many assumptions about the accuracy, consistency, and representativeness of ANC-RT prevalence underlie the use of these data for monitoring HIV epidemic trends and should be tested as more data become available from national ANC-RT programs.


Subject(s)
HIV Infections/epidemiology , Models, Statistical , Pregnancy Complications, Infectious/epidemiology , Adolescent , Adult , Epidemics , Female , HIV Infections/diagnosis , Humans , Middle Aged , Pregnancy , Pregnancy Complications, Infectious/diagnosis , Prenatal Care , Prevalence , Young Adult
5.
J Biomed Inform ; 50: 133-41, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24509073

ABSTRACT

The protection of privacy of individual-level information in genome-wide association study (GWAS) databases has been a major concern of researchers following the publication of "an attack" on GWAS data by Homer et al. (2008). Traditional statistical methods for confidentiality and privacy protection of statistical databases do not scale well to deal with GWAS data, especially in terms of guarantees regarding protection from linkage to external information. The more recent concept of differential privacy, introduced by the cryptographic community, is an approach that provides a rigorous definition of privacy with meaningful privacy guarantees in the presence of arbitrary external information, although the guarantees may come at a serious price in terms of data utility. Building on such notions, Uhler et al. (2013) proposed new methods to release aggregate GWAS data without compromising an individual's privacy. We extend the methods developed in Uhler et al. (2013) for releasing differentially-private χ²-statistics by allowing for an arbitrary number of cases and controls, and for releasing differentially-private allelic test statistics. We also provide a new interpretation by assuming the controls' data are known, which is a realistic assumption because some GWAS use publicly available data as controls. We assess the performance of the proposed methods through a risk-utility analysis on a real data set consisting of DNA samples collected by the Wellcome Trust Case Control Consortium and compare the methods with the differentially-private release mechanism proposed by Johnson and Shmatikov (2013).
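
As a rough illustration of what releasing a differentially-private test statistic can look like, the sketch below applies the generic Laplace mechanism: noise with scale sensitivity/ε is added to the statistic before release. The abstract does not state which mechanism or sensitivity bound the authors use, so the sensitivity is left as a caller-supplied assumption and the function name `release_statistic` is hypothetical.

```python
import random


def release_statistic(value, sensitivity, epsilon, rng):
    """Return value + Laplace(0, sensitivity / epsilon) noise.

    sensitivity: assumed upper bound on how much the statistic can change
    when one individual's record changes (must be derived for the specific
    statistic and study design; it is NOT computed here).
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical numbers, for illustration only.
    print(release_statistic(value=12.5, sensitivity=4.0, epsilon=1.0, rng=rng))
```

A larger sensitivity or a smaller ε widens the noise, which is the utility cost the abstract refers to.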


Subject(s)
Genome-Wide Association Study , Privacy , Crohn Disease/genetics , Humans