Search | VHL Regional Portal

Logistic regression over encrypted data from fully homomorphic encryption.

Chen, Hao; Gilad-Bachrach, Ran; Han, Kyoohyung; Huang, Zhicong; Jalali, Amir; Laine, Kim; Lauter, Kristin.

BMC Med Genomics ; 11(Suppl 4): 81, 2018 Oct 11.

Article in English | MEDLINE | ID: mdl-30309350

ABSTRACT

BACKGROUND: One of the tasks in the 2017 iDASH secure genome analysis competition was to enable training of logistic regression models over encrypted genomic data. More precisely, given a list of approximately 1500 patient records, each with 18 binary features containing information on specific mutations, the idea was for the data holder to encrypt the records using homomorphic encryption, and send them to an untrusted cloud for storage. The cloud could then homomorphically apply a training algorithm on the encrypted data to obtain an encrypted logistic regression model, which can be sent to the data holder for decryption. In this way, the data holder could successfully outsource the training process without revealing either her sensitive data, or the trained model, to the cloud.

METHODS: Our solution to this problem has several novelties: we use a multi-bit plaintext space in fully homomorphic encryption together with fixed point number encoding; we combine bootstrapping in fully homomorphic encryption with a scaling operation in fixed point arithmetic; we use a minimax polynomial approximation to the sigmoid function and the 1-bit gradient descent method to reduce the plaintext growth in the training process.

RESULTS: Our algorithm for training over encrypted data takes 0.4-3.2 hours per iteration of gradient descent.

CONCLUSIONS: We demonstrate the feasibility but high computational cost of training over encrypted data. On the other hand, our method can guarantee the highest level of data privacy in critical applications.
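As a rough plaintext illustration of the ingredients named under METHODS, the sketch below runs logistic regression with a polynomial stand-in for the sigmoid and a sign-only update. Nothing here is encrypted, and several parts are assumptions rather than the authors' implementation: the degree-3 Taylor polynomial replaces the paper's minimax approximation (whose coefficients the abstract does not give), "1-bit gradient descent" is interpreted as keeping only the sign of each gradient coordinate, and `SCALE_BITS`, `step`, `iterations`, and all function names are hypothetical.

```python
# Plaintext sketch only: no homomorphic encryption is performed here.

SCALE_BITS = 8            # hypothetical number of fractional bits
SCALE = 1 << SCALE_BITS


def encode(x):
    """Fixed-point encoding: represent a real number as a scaled integer."""
    return round(x * SCALE)


def decode(n):
    """Inverse of encode."""
    return n / SCALE


def fx_mul(a, b):
    """Multiply two fixed-point integers, then rescale back to SCALE.

    The product carries scale SCALE**2; the shift is the 'scaling
    operation' that keeps the plaintext from growing.
    """
    return (a * b) >> SCALE_BITS


def sigmoid_poly(z):
    """Degree-3 Taylor polynomial of the sigmoid around 0.

    Assumption: a stand-in for the minimax polynomial in the paper.
    Only accurate for small |z|.
    """
    return 0.5 + z / 4 - z ** 3 / 48


def sign(v):
    """Return -1, 0, or 1: the single 'bit' kept from each gradient entry."""
    return (v > 0) - (v < 0)


def train_sign_gd(X, y, step=0.05, iterations=20):
    """Gradient descent that moves each weight by a fixed step in the
    direction opposite to the sign of its gradient coordinate."""
    w = [0.0] * len(X[0])
    for _ in range(iterations):
        grad = [0.0] * len(w)
        for row, label in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, row))
            err = sigmoid_poly(z) - label
            for j, xj in enumerate(row):
                grad[j] += err * xj
        w = [wj - step * sign(gj) for wj, gj in zip(w, grad)]
    return w


# Toy data: first column is a bias term, second is one binary feature.
X = [[1, 0], [1, 1], [1, 0], [1, 1]]
y = [0, 1, 0, 1]
weights = train_sign_gd(X, y)
```

Because only signs and low-degree polynomials are involved, every intermediate value stays small, which is the property that matters when each operation consumes plaintext precision.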

Subject(s)

Computer Security, Algorithms, Area Under Curve, Databases as Topic, Genotype, Humans, Logistic Models

Private queries on encrypted genomic data.

Çetin, Gizem S; Chen, Hao; Laine, Kim; Lauter, Kristin; Rindal, Peter; Xia, Yuhou.

BMC Med Genomics ; 10(Suppl 2): 45, 2017 Jul 26.

Article in English | MEDLINE | ID: mdl-28786359

ABSTRACT

BACKGROUND: One of the tasks in the iDASH Secure Genome Analysis Competition in 2016 was to demonstrate the feasibility of privacy-preserving queries on homomorphically encrypted genomic data. More precisely, given a list of up to 100,000 mutations, the task was to encrypt the data using homomorphic encryption in a way that allows it to be stored securely in the cloud, and enables the data owner to query the dataset for the presence of specific mutations, without revealing any information about the dataset or the queries to the cloud.

METHODS: We devise a novel string matching protocol to enable privacy-preserving queries on homomorphically encrypted data. Our protocol combines state-of-the-art techniques from homomorphic encryption and private set intersection protocols to minimize the computational and communication cost.

RESULTS: We implemented our protocol using the homomorphic encryption library SEAL v2.1, and applied it to obtain an efficient solution to the iDASH competition task. For example, using 8 threads, our protocol achieves a running time of only 4 s, and a communication cost of 2 MB, when querying for the presence of 5 mutations from an encrypted dataset of 100,000 mutations.

CONCLUSIONS: We demonstrate that homomorphic encryption can be used to enable an efficient privacy-preserving mechanism for querying the presence of particular mutations in realistic size datasets. Beyond its applications to genomics, our protocol can just as well be applied to any kind of data, and is therefore of independent interest to the homomorphic encryption community.
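To give a feel for how a presence query can be phrased using only additions and multiplications (the operations homomorphic encryption supports), here is a plaintext sketch of one building block commonly seen in private set intersection: the product of differences between the query and every stored item is zero exactly when the query is present. The abstract does not say whether the authors' string matching protocol uses this construction, so treat it as an assumption; the modulus `P`, the integer encoding of mutations, and the function name are all hypothetical, and no encryption or SEAL call is shown.

```python
# Plaintext sketch only: in a real protocol the arithmetic below would be
# carried out on ciphertexts, so the server never sees query or dataset.

P = 65537  # hypothetical prime plaintext modulus


def membership_value(query, dataset, mask=1):
    """Return mask * prod(query - item) mod P.

    With P prime, a nonzero mask, and all values in [0, P), the result
    is 0 if and only if `query` equals some item in `dataset`. A random
    nonzero mask hides the product when there is no match.
    """
    acc = mask % P
    for item in dataset:
        acc = (acc * (query - item)) % P
    return acc


# Mutations are assumed to be pre-encoded as small integers.
dataset = [3, 17, 42]
present = membership_value(17, dataset)   # evaluates to 0
absent = membership_value(5, dataset)     # evaluates to a nonzero value
```

The multiplicative depth of this product grows with the dataset size, which is one reason practical protocols batch and partition the data rather than evaluate a single long product.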

Subject(s)

Computer Security, Data Mining/methods, Genomics, Algorithms, Feasibility Studies
