Secure and Private Machine Learning

Subhankar Mishra

School of Computer Sciences
NISER


Big Data Era

  • A lot of data is collected
    • By companies like Amazon, Facebook, Apple, Google, etc.
    • By governments
  • The data collected is highly personal and sensitive
    • Browsing history, purchase history, speech, geolocation, health, etc.

Risks of exposure?

  • Credentials (credit cards, passwords)
    • Theft of money and property
  • Identification (name, biometric data)
    • Identity theft
  • Information about you (e.g., medical status)
    • Discrimination

Privacy and Utility

  • Increasing privacy typically reduces utility
  • Take measures that protect privacy without destroying utility
  • The goal is to find better trade-offs between privacy and utility

Privacy vs Security

  • Privacy - how your data is handled; your rights with respect to your personal information
  • Security - protection of your data against unauthorized access

Examples

  • The bank shares your data with an ad agency. Privacy or security breach?
  • The bank's entire database is leaked on the web. Privacy or security breach?

Example

Name    Had Chocolate?
A       Yes
B       No
C       Yes
D       No

Assume that if this data is leaked, it will cause havoc :)
We want to know how many people had chocolate.


What do we do?

Strategy 1

$$y_i = \begin{cases} x_i & \text{with probability } 1\\ 1 - x_i & \text{with probability } 0 \end{cases}$$

Perfectly accurate, no privacy.


What do we do?

Strategy 2

$$y_i = \begin{cases} x_i & \text{with probability } 1/2\\ 1 - x_i & \text{with probability } 1/2 \end{cases}$$

Perfectly private, no accuracy.


What do we do?

Strategy 3

$$y_i = \begin{cases} x_i & \text{with probability } 1/2 + \gamma\\ 1 - x_i & \text{with probability } 1/2 - \gamma \end{cases}$$

$\gamma \in [0, 1/2]$

Also called Randomized Response


What do we do?

Strategy 3 - Randomized Response

Secure each row (see the sketch after this list):

  • Flip a coin
  • If heads
    • Flip a second coin
      • Answer Yes if heads
      • Answer No if tails
  • If tails, don't change the value
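
A minimal sketch of this mechanism in Python (the function name and the default gamma are illustrative, not from the slides; gamma = 1/4 reproduces the two-coin scheme above):

```python
import random

def randomized_response(true_answer: bool, gamma: float = 0.25) -> bool:
    # Report the true answer with probability 1/2 + gamma, the flipped answer otherwise.
    # gamma = 1/4 matches the two coins: with probability 1/2 keep the truth,
    # otherwise a second fair coin picks Yes or No at random.
    if random.random() < 0.5 + gamma:
        return true_answer
    return not true_answer
```

With gamma = 1/2 this reduces to Strategy 1 (perfect accuracy, no privacy), and with gamma = 0 to Strategy 2 (perfect privacy, no accuracy).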

Example

Name    Had Chocolate?
A       Yes
B       No
C       Yes
D       No

Now each row is real with probability 1/2 and random with probability 1/2, and we cannot tell which rows are which.

This gives the individuals "plausible deniability".


Analysis?

  • We want to know how many people had chocolate.
  • Let p be the true fraction of people who had chocolate.
  • The expected fraction of "yes" answers is

$$\frac{1}{4}(1-p) + \frac{3}{4}p = \frac{1}{4} + \frac{p}{2}$$

  • So we can recover p from the observed fraction of "yes" answers: $\hat{p} = 2 \cdot (\text{observed fraction}) - \frac{1}{2}$ (see the sketch below).
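
A small sketch of this correction, assuming the reported answers come from the coin-flip mechanism above (the helper name is hypothetical):

```python
def estimate_true_fraction(reported_answers) -> float:
    # E[fraction of "yes"] = 1/4 + p/2, so solving for p gives p_hat = 2 * observed - 1/2.
    observed = sum(reported_answers) / len(reported_answers)
    return 2 * observed - 0.5

# Toy reports (Yes, No, Yes, No); the correction only becomes reliable for large surveys.
print(estimate_true_fraction([True, False, True, False]))  # 0.5
```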


Differential Privacy

Formal definition

A randomized computation $M$ satisfies $\epsilon$-differential privacy if, for any adjacent datasets $x$ and $x'$, and any subset $C$ of possible outcomes $\mathrm{Range}(M)$,

$$\Pr[M(x) \in C] \leq \exp(\epsilon) \times \Pr[M(x') \in C]$$

  • Adding or removing a single row changes the output distribution by at most a factor of $e^\epsilon$
  • A smaller $\epsilon$ gives a stronger privacy guarantee but typically lower utility; a larger $\epsilon$ allows better utility but leaks more information (see the sketch below)

Dwork, Cynthia, and Aaron Roth. "The algorithmic foundations of differential privacy." Foundations and Trends in Theoretical Computer Science 9.3-4 (2014): 211-407.
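
As a concrete check (a sketch, not part of the original slides): randomized response reports the true bit with probability $1/2 + \gamma$, so the probability of any output differs by at most a factor of $(1/2+\gamma)/(1/2-\gamma)$ between the two possible values of an individual's bit, i.e. it satisfies $\epsilon$-differential privacy with $\epsilon = \ln\frac{1/2+\gamma}{1/2-\gamma}$.

```python
import math

def randomized_response_epsilon(gamma: float) -> float:
    # Privacy parameter of randomized response: output probabilities for x_i = 1
    # versus x_i = 0 differ by at most a factor of (1/2 + gamma) / (1/2 - gamma).
    assert 0 <= gamma < 0.5
    return math.log((0.5 + gamma) / (0.5 - gamma))

# The two-coin scheme keeps the true answer with probability 3/4 (gamma = 1/4),
# so each individual's bit is protected with epsilon = ln(3) ≈ 1.10.
print(randomized_response_epsilon(0.25))
```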


Idea behind Differential Privacy

  • Two databases D1 and D2 are neighbouring if they agree on everything except a single entry.

  • If the mechanism behaves nearly identically on D1 and D2, then an attacker cannot tell whether D1 or D2 was used (and hence cannot learn much about any individual).



Alternative to differential privacy?

  • Withhold sensitive information?
    • The fact that a value is withheld can itself reveal that sensitive information is present!
  • Allow only group (aggregate) queries?
    • Differencing attacks still work: compare "How many people are happy?" with "How many people other than Ram are happy?"
  • k-Anonymity
    • Does not work for high-dimensional data
    • Prone to linkage attacks
    • Destroys utility

Narayanan, Arvind, and Vitaly Shmatikov. "How to break anonymity of the netflix prize dataset." arXiv preprint cs/0610105 (2006).


Differential Privacy in the world

  • US Census Bureau
    Abowd, John M. "The US Census Bureau adopts differential privacy." Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018.
  • Apple
    Cormode, Graham, et al. "Privacy at scale: Local differential privacy in practice." Proceedings of the 2018 International Conference on Management of Data. 2018.
  • Microsoft
    Ding, Bolin, Janardhan Kulkarni, and Sergey Yekhanin. "Collecting telemetry data privately." Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017

Let's design an ML system with sensitive user data


Sensitive Data

Anything that must be protected against unauthorized access.


Privacy Nightmare

  • Build an app that collects user data
  • Use the data however we want
  • But the data is personal

However, we need the data!

  • The performance of the model depends on the data

Best source of data?

Devices that we use everyday


So what if the data never leaves your device?

How would we train our ML model then?


Federated Learning (FL)!

  • Train a centralised model
  • On decentralised data

Federated Learning. https://federated.withgoogle.com. Accessed on 1st November, 2020


FL - Data stays on device

  • Data stays on the device
  • Design a smart way to train the central model

FL - Bring the training to the device

  • The idea is to bring the training to the device
  • Rather than bringing the user's data to the central server

But Training is resource heavy

  • Battery
  • Computing Power

Select the devices that are

  • charging
  • on Wi-Fi
  • idle

Steps

Step 1 - Select a subset of eligible devices

Step 2 - Each device receives a training model


Step 3 - Train on the device

  • Because the amount of data on each device is small
  • Training time is low (see the sketch below)
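
For illustration only, a hypothetical on-device update step for a linear model with squared loss (function and variable names are not from the slides):

```python
import numpy as np

def client_update(global_w, local_x, local_y, lr=0.1, epochs=5):
    # Start from the model received from the server, run a few epochs of gradient
    # descent on the local data only, and return the new weights - never the data.
    w = global_w.copy()
    for _ in range(epochs):
        pred = local_x @ w
        grad = local_x.T @ (pred - local_y) / len(local_y)
        w -= lr * grad
    return w

# Example: 10 local examples with 3 features (synthetic).
x, y = np.random.randn(10, 3), np.random.randn(10)
print(client_update(np.zeros(3), x, y))
```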

Step 4 - Send the trained model back

  • Once local training is finished
  • The updated model is sent back to the server
  • Not the data

At Center - FedAvg

$$\min_{w \in \mathbb{R}^d} f(w), \qquad f(w) = \sum_{k=1}^K \frac{n_k}{n} F_k(w)$$
where

  • $F_k(w) = \frac{1}{n_k} \sum_{i \in P_k} f_i(w)$
  • $f_i(w) = \ell(x_i, y_i, w)$ is the loss of the prediction on example $(x_i, y_i)$
  • $K$ clients; $P_k$ is the set of data points on client $k$, $n_k = |P_k|$, and $n = \sum_k n_k$

H. B. McMahan, E. Moore, D. Ramage, S. Hampson and B. A. y Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data", arXiv preprint arXiv:1602.05629, 2017.
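
On the server side, FedAvg forms the weighted average of the returned client models with weights $n_k/n$, mirroring the objective above. A minimal sketch with hypothetical numbers:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    # Weighted average of client parameters: w = sum_k (n_k / n) * w_k.
    n = sum(client_sizes)
    return sum((n_k / n) * w_k for w_k, n_k in zip(client_weights, client_sizes))

updates = [np.array([0.2, 1.0]), np.array([0.4, 0.8]), np.array([0.1, 1.2])]
sizes = [100, 300, 600]  # n_k: number of local examples on each client
print(fed_avg(updates, sizes))
```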


So we good then?

Not exactly


Problem 1: Model inversion attacks!

  • There are attacks that reconstruct the training data from the model itself
  • The goal would then be to protect the model updates themselves, e.g. by encrypting them

Chen, Si, Ruoxi Jia, and Guo-Jun Qi. "Improved Techniques for Model Inversion Attack." arXiv preprint arXiv:2010.04092 (2020).


Secure Aggregation

  • The model update is encrypted
  • With a key that the server does not have

Zhang, Yuheng, et al. "The secret revealer: generative model-inversion attacks against deep neural networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.


Secure Aggregation

  • Every device applies the secure aggregation protocol
  • It adds zero-sum masks that scramble its trained model update
  • When the server adds up all the masked updates
    • all the masks cancel out (see the sketch below)

Bonawitz, Keith, et al. "Practical secure aggregation for privacy-preserving machine learning." Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2017.
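
A toy illustration of the zero-sum masking idea (not the actual protocol of Bonawitz et al., which additionally uses key agreement and handles device dropouts):

```python
import numpy as np

def add_pairwise_masks(updates, seed=0):
    # For every pair of clients (i, j), client i adds a random mask and client j
    # subtracts the same mask, so all masks cancel when the server sums the updates.
    rng = np.random.default_rng(seed)
    masked = [u.astype(float) for u in updates]
    for i in range(len(updates)):
        for j in range(i + 1, len(updates)):
            mask = rng.normal(size=updates[0].shape)
            masked[i] = masked[i] + mask
            masked[j] = masked[j] - mask
    return masked

updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
masked = add_pairwise_masks(updates)
print(sum(masked))   # equals the true sum [9. 12.], even though each
print(sum(updates))  # individual masked update looks like random noise
```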


Problem 2: What if a model memorizes data?

  • What if the training data contains rare or unique records?
  • Then the model can memorize them verbatim

Differential Privacy

  • Add noise to the data (differential privacy)
  • Even if the model memorizes, it only memorizes noisy data

Carlini, Nicholas, et al. "The secret sharer: Evaluating and testing unintended memorization in neural networks." 28th USENIX Security Symposium (USENIX Security 19). 2019.

https://secml.github.io/
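
A minimal sketch of the noise-adding idea, shown here with the Laplace mechanism on a simple count (in federated training one would instead clip and noise the model updates, which is not shown):

```python
import numpy as np

def private_count(values, epsilon):
    # Adding or removing one person changes the count by at most 1 (sensitivity 1),
    # so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    return sum(values) + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

had_chocolate = [1, 0, 1, 0]  # toy data from the earlier example
print(private_count(had_chocolate, epsilon=0.5))
```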


Problem 3: How to test?

  • Now that the model is trained, how do we test?
  • There is no data at the central server
  • All the data resides on the users' devices

Test at the devices!

  • We can just test at the users' devices
  • Some devices can be training
  • Some devices can be testing

Summary

  • Learn from all individuals
  • Without learning about any individual

Work from our lab

  1. Sudipta Paul, Poushali Sengupta, and Subhankar Mishra. Flaps: Federated learning and privately scaling. arXiv preprint arXiv:2009.06005, 2020.
  2. Poushali Sengupta, Sudipta Paul, and Subhankar Mishra. Buds: Balancing utility and differential privacy by shuffling. 11th ICCCNT, 2020 (accepted).
  3. Poushali Sengupta, Sudipta Paul, and Subhankar Mishra. Learning with differential privacy. Handbook of Research on Cyber Crime and Information Privacy, 2021.
  4. Sudipta Paul and Subhankar Mishra. Ara: Aggregated rappor and analysis for centralized differential privacy. SN Computer Science, 1(1):22, 2020.

Thank you




Is that all?

Fairness

To understand fairness, we need to understand discrimination:

  • Disparate treatment: intentionally treating an individual differently based on their membership in a particular class.
  • Disparate impact: a policy that looks neutral but negatively affects members of one class more than others.

Pessach, Dana, and Erez Shmueli. "Algorithmic fairness." arXiv preprint arXiv:2001.09784 (2020).

Sources of bias/discrimination

Some sources:

  • Data (sensitive attributes: race, gender, age, etc.)
  • Biased data
  • Missing data
  • Incorrect data