
Théo Ryffel


Hey, welcome!


At the crossroads of Machine Learning and Cryptography

I completed my PhD in Privacy-Preserving Machine Learning at ENS and INRIA under the supervision of David Pointcheval and Francis Bach. I'm also a co-founder of Arkhn, a company that helps hospitals regain sovereignty over their data. I'm an open-source contributor to the OpenMined community, which builds practical tools for private ML such as PySyft.

Deep Learning, Federated Learning, Secure Multi-Party Computation, Function Secret Sharing, Differential Privacy, Functional Encryption

Go to my blog | Twitter | Github | Linkedin | Google Scholar | Contact

PhD Defense
Information

The PhD Defense was held at ENS Paris on June 23rd, 2022.
👨‍🏫  Get the slides
📽  Watch the replay
📄  Manuscript

Title

Cryptography for Privacy-Preserving Machine Learning

Jury

  • Aurélien Bellet - INRIA Lille (Reviewer)
  • Yuval Ishai - Technion (Reviewer)
  • Renaud Sirdey - CEA (Examiner)
  • Mariya Georgieva - Inpher (Examiner)
  • Laurent Massoulié - INRIA, ENS, PSL (Examiner)
  • Jonathan Passerat-Palmbach - Imperial College London (Examiner)
  • David Pointcheval - ENS, CNRS, PSL (Thesis supervisor)
  • Francis Bach - INRIA, ENS, CNRS, PSL (Thesis supervisor)

Abstract

The ever-growing use of machine learning (ML), motivated by the possibilities it opens up in a large number of sectors, is increasingly raising questions because of the sensitive nature of the data it relies on and the lack of transparency about how these data are collected, combined or shared. A number of methods are therefore being developed to reduce its impact on our privacy and make its use more acceptable, especially in areas such as healthcare where its potential is still largely under-exploited.

This thesis explores different methods from the fields of cryptography and security, and applies them to machine learning in order to establish new confidentiality guarantees for the data used and the ML models.

Our first contribution is the development of a technical foundation to facilitate experimentation with new approaches, through an open-source library named PySyft. We propose a modular architecture that allows one to pick the confidentiality blocks necessary for one's study, or to develop and easily integrate new blocks. This library is reused in all the implementations proposed in this thesis.

Our second contribution consists in highlighting the vulnerability of ML models by proposing an attack that exploits a trained model to reveal confidential attributes of an individual. This attack could, for example, subvert a model that recognizes a person's sport from an image, to detect the person's racial origins. We propose solutions to limit the impact of this attack.

In a third step, we focus on cryptographic protocols that allow us to perform computations on encrypted data. A first study proposes a functional encryption protocol that makes it possible to run a small ML model over encrypted data and to make only the predictions public. A second study focuses on optimizing a function secret sharing protocol, which allows an ML model to be trained or evaluated on data privately, i.e. without revealing either the model or the data to anyone. This protocol provides sufficient performance to use models that have practical utility in non-trivial tasks such as pathology detection in lung X-rays.

Our final contribution is in differential privacy, a technique that limits the vulnerability of ML models, and thus the exposure of the data used in training, by introducing a controlled perturbation. We propose a new protocol and show that it makes it possible to train a smooth and strongly convex model with a bounded privacy loss regardless of the number of calls to sensitive data during training.
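
To give a rough idea of what "introducing a controlled perturbation" can look like in practice, here is a minimal sketch of gradient descent with clipped, Gaussian-perturbed gradients on a ridge regression objective (which is smooth and strongly convex). This is a generic textbook-style illustration, not the protocol proposed in the thesis: the function names, the parameters (`clip_norm`, `noise_std`, `l2`) and the toy data are all hypothetical, and no privacy accounting is performed.

```python
import random


def clip(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = sum(v * v for v in vec) ** 0.5
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def noisy_gradient_descent(xs, ys, steps=200, lr=0.1, l2=0.1,
                           clip_norm=1.0, noise_std=0.5, seed=0):
    """Ridge regression trained with clipped, noise-perturbed gradients.

    Hypothetical helper for illustration only: it shows the perturbation
    mechanism, but it does not compute any privacy guarantee.
    """
    rng = random.Random(seed)
    dim = len(xs[0])
    n = len(xs)
    w = [0.0] * dim
    for _ in range(steps):
        total = [0.0] * dim
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            # Clipping bounds the influence of each individual record.
            g = clip([err * xi for xi in x], clip_norm)
            total = [t + gi for t, gi in zip(total, g)]
        # Gaussian noise, scaled to the clipping bound, masks the
        # contribution of any single record to the summed gradient.
        noisy = [t + rng.gauss(0.0, noise_std * clip_norm) for t in total]
        # Averaged noisy gradient plus the L2 regularization term.
        w = [wi - lr * (ni / n + l2 * wi) for wi, ni in zip(w, noisy)]
    return w


if __name__ == "__main__":
    # Toy data following y = 2x (an assumption made for this example).
    xs = [[0.1 * i] for i in range(1, 11)]
    ys = [2.0 * x[0] for x in xs]
    print(noisy_gradient_descent(xs, ys))
```

Setting `noise_std=0.0` recovers plain (clipped) gradient descent, which makes it easy to compare the perturbed and unperturbed models on the same data.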


Last talks & events
Research

Twitter

The latest news is always on Twitter

See Twitter account

Github

Most of my current projects are on Github

See Github account

Linkedin

More info on my academic background and past experiences is available on Linkedin

See Linkedin account