Théo Ryffel

Hey, welcome!

At the crossroads of Machine Learning and Cryptography

I completed my PhD in Privacy-Preserving Machine Learning at ENS and INRIA under the supervision of David Pointcheval and Francis Bach. I'm also co-founder of Arkhn, a company that helps hospitals regain sovereignty over their data. I'm an open-source contributor to the OpenMined community, which builds practical tools for private ML like PySyft.

Deep Learning, Federated Learning, Secure Multi-Party Computation, Function Secret Sharing, Differential Privacy, Functional Encryption

Go to my blog | Twitter | GitHub | LinkedIn | Google Scholar | Contact

PhD Defense

Information

The PhD Defense was held at ENS Paris on June 23rd, 2022.
👨‍🏫 Get the slides
📽 Watch the replay
📄 Manuscript

Title

Cryptography for Privacy-Preserving Machine Learning

Jury

Aurélien Bellet - INRIA Lille (Reviewer)
Yuval Ishai - Technion (Reviewer)
Renaud Sirdey - CEA (Examiner)
Mariya Georgieva - Inpher (Examiner)
Laurent Massoulié - INRIA, ENS, PSL (Examiner)
Jonathan Passerat-Palmbach - Imperial College London (Examiner)
David Pointcheval - ENS, CNRS, PSL (Thesis advisor)
Francis Bach - INRIA, ENS, CNRS, PSL (Thesis advisor)

Abstract

The ever-growing use of machine learning (ML), motivated by the possibilities it brings to a large number of sectors, is increasingly raising questions because of the sensitive nature of the data that must be used and the lack of transparency about the way these data are collected, combined or shared. Therefore, a number of methods are being developed to reduce its impact on our privacy and make its use more acceptable, especially in areas such as healthcare where its potential is still largely under-exploited.

This thesis explores different methods from the fields of cryptography and security, and applies them to machine learning in order to establish new confidentiality guarantees for the data used and the ML models.

Our first contribution is the development of a technical foundation to facilitate experimentation with new approaches, through an open-source library named PySyft. We propose a modular architecture that allows one to pick the confidentiality blocks necessary for one's study, or to develop and easily integrate new blocks. This library is reused in all the implementations proposed in this thesis.

Our second contribution highlights the vulnerability of ML models by proposing an attack that exploits a trained model to reveal confidential attributes of an individual. This attack could, for example, subvert a model that recognizes a person's sport from an image to detect the person's racial origins. We propose solutions to limit the impact of this attack.

In a third step, we focus on some cryptographic protocols that allow us to perform computations on encrypted data. A first study proposes a functional encryption protocol that makes it possible to compute predictions with a small ML model over encrypted data while making only the predictions public. A second study focuses on optimizing a function secret sharing protocol, which allows an ML model to be trained or evaluated on data privately, i.e. without revealing either the model or the data to anyone. This protocol provides sufficient performance to use models that have practical utility in non-trivial tasks such as pathology detection in lung X-rays.
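
To give an intuition of what computing on hidden data means, here is a minimal Python sketch. It uses plain two-party additive secret sharing, which is much simpler than the function secret sharing protocol studied in the thesis, and it only hides the input (the integer weights are assumed public). All names (`share`, `private_dot`, the ring size and the fixed-point scale) are hypothetical and chosen for illustration only.

```python
import secrets
from typing import List, Tuple

MODULUS = 2 ** 32   # shares live in the ring of integers modulo 2^32 (assumed size)
SCALE = 2 ** 10     # fixed-point scale used to encode real numbers (assumed)


def encode(x: float) -> int:
    """Encode a real number as a fixed-point ring element."""
    return round(x * SCALE) % MODULUS


def decode(v: int) -> float:
    """Decode a ring element back to a real number (upper half is negative)."""
    v %= MODULUS
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(value: int) -> Tuple[int, int]:
    """Split a ring element into two shares that sum to it modulo MODULUS."""
    r = secrets.randbelow(MODULUS)
    return r, (value - r) % MODULUS


def reconstruct(s0: int, s1: int) -> int:
    """Recombine two shares into the shared ring element."""
    return (s0 + s1) % MODULUS


def local_linear(weights: List[int], x_shares: List[int]) -> int:
    """Each party applies the public integer weights to its own shares only."""
    return sum(w * s for w, s in zip(weights, x_shares)) % MODULUS


def private_dot(weights: List[int], xs: List[float]) -> float:
    """Simulate both parties: share the input, compute locally, reconstruct."""
    pairs = [share(encode(x)) for x in xs]
    out0 = local_linear(weights, [p[0] for p in pairs])  # computed by party 0
    out1 = local_linear(weights, [p[1] for p in pairs])  # computed by party 1
    return decode(reconstruct(out0, out1))


if __name__ == "__main__":
    print(private_dot([2, -3, 1], [1.5, 0.25, -2.0]))
```

Each share on its own is a uniformly random ring element, so a single party learns nothing about the input. Linear operations can be done locally on shares; non-linear operations such as comparisons are where more advanced tools like function secret sharing come in.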

Our final contribution is in differential privacy, a technique that introduces a controlled perturbation to limit the vulnerability of ML models and thus the exposure of the data used in training. We propose a new protocol and show that it makes it possible to train a smooth and strongly convex model with a bounded privacy loss regardless of the number of calls to sensitive data during training.
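
As an illustration of what a "controlled perturbation" can look like, here is a minimal Python sketch of noisy gradient descent on an L2-regularised least-squares objective (a smooth and strongly convex model). This is not the protocol of the thesis and it does not compute any privacy guarantee: it only shows the usual two ingredients, per-example gradient clipping and Gaussian noise. The function names and all default values (`clip_norm`, `noise_std`, `l2`, `lr`) are assumptions made for the example.

```python
import random
from typing import List


def clip(vec: List[float], max_norm: float) -> List[float]:
    """Rescale vec so that its Euclidean norm is at most max_norm."""
    norm = sum(v * v for v in vec) ** 0.5
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * factor for v in vec]


def noisy_gradient_descent(
    xs: List[List[float]],
    ys: List[float],
    steps: int = 200,
    lr: float = 0.1,
    l2: float = 0.1,
    clip_norm: float = 1.0,
    noise_std: float = 1.0,
    seed: int = 0,
) -> List[float]:
    """Gradient descent on ridge regression with clipped, noised gradients."""
    rng = random.Random(seed)
    n, dim = len(xs), len(xs[0])
    w = [0.0] * dim
    for _ in range(steps):
        grad_sum = [0.0] * dim
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            # Clipping bounds the influence of any single example.
            g = clip([err * xi for xi in x], clip_norm)
            grad_sum = [a + b for a, b in zip(grad_sum, g)]
        # Gaussian noise, scaled to the clipping norm, is added to the sum.
        noisy_grad = [
            (gs + rng.gauss(0.0, noise_std * clip_norm)) / n + l2 * wi
            for gs, wi in zip(grad_sum, w)
        ]
        w = [wi - lr * gi for wi, gi in zip(w, noisy_grad)]
    return w


if __name__ == "__main__":
    data_x = [[1.0], [2.0]]
    data_y = [2.0, 4.0]
    print(noisy_gradient_descent(data_x, data_y))
```

Turning such a procedure into an actual differential privacy guarantee requires a privacy analysis that relates the noise scale, the clipping norm and the number of steps to the privacy loss, which is the subject of the contribution described above.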

Last talks & events

October 11 & 11, 2022
Lecture on Federated Learning and Privacy-Enhancing Technologies at ISIS Engineering school.
August 30, 2022
Hearing before the AI & Health Working Group to define the French National AI Strategy.
July 12, 2022
Presentation of AriaNN: Low-Interaction Privacy-Preserving Deep Learning via Function Secret Sharing at PoPETs 2022 (Sydney).
June 11, 2022
Talk on privacy-enhancing data sharing across healthcare institutions with European healthcare consortia (internal).
April 20, 2022
Speaker at Podcast La Galère for Start The F Up: challenges of data management in hospitals.
November 30, 2021
Presentation at the Quatre Vents conference on the topic "AI in health: myths and realities".
September 15, 2021
Presentation of the current challenges in privacy-preserving machine learning at ENS Paris (internal).
September 13, 2021
Talk on interactions between Function Secret Sharing and Differential Privacy at Singapore University.
June 16, 2021
Introduction to the Hackathon Cybersecurity: Protecting applications.
June 4, 2021
Talk on Privacy-Preserving Machine Learning at CNIL.
March 18, 2021
Talk on interactions between Differential Privacy and Multi-Party Computation at ENS Paris (internal).
March 16, 2021
Presentation of new tools for automated text analysis and structuring at AP-HP.
February 17, 2021
Presentation about data architectures in healthcare facilities and privacy-enhancing techniques, at Roche.
February 17, 2021
Podcast: "Recruiting a technical team with expertise in data and healthcare". Listen to the podcast
February 11, 2021
Presentation of new Function Secret Sharing techniques at the ANBLIC Consortium meeting.
November 20, 2020
Talk about building a multi-purpose stack using FHIR as a persistence layer, FHIR Dev Days 2020.
September 26, 2020
Talk at the OpenMined Privacy Conference about concrete applications of privacy in healthcare.
July 15, 2020
Final presentation of CrypTen integration in PySyft with Facebook Research.
July 8, 2020
Presentation of privacy-preserving demos at Paris OpenMined Meetup
June 19, 2020
Talk on Federated Analytics on Real-life Healthcare Data at the Federated Learning Conference
December 10, 2019
Poster presentation at NeurIPS 2019, Partially Encrypted Machine Learning using Functional Encryption (Canada)
November 29, 2019
Semantic Web applied to healthcare, presentation at Datathon Archives Nationales
November 12, 2019
Talk on privacy-enhancing techniques in healthcare at XMP-Biotech
October 15, 2019
Keynote on Data Anonymization at the BNP Paribas - Plug And Play Deep Dive
July 9, 2019
Talk at APVP 2019 on PySyft
June 26, 2019
Presentation of OpenMined at CHUV (Switzerland)
June 19, 2019
Talk at School of Cybersecurity - Université Côte d'Azur on OpenMined
June 17, 2019
Presentation of Privacy-Preserving ML techniques at LIMICS
June 6, 2019
Presentation "Tools for Safe AI" at Laboratoire de Sciences Cognitives et Psycholinguistiques (BabyCloud team)
May 15, 2019
Presentation of Federated Learning Techniques at ENS Paris (internal)
May 14, 2019
Talk at Paris Meetup OpenMined on Secure & Federated Learning at Arkhn
April 11, 2019
Co-organisation of Paris Meetup OpenMined to federate contributions on PySyft
March 26, 2019
Presentation of Adversarial Training techniques at the ANBLIC meeting
March 20, 2019
Talk at Paris Meetup OpenMined to present PySyft
December 8, 2018
Spotlight talk at the NeurIPS 2018 Workshop on Privacy-Preserving Machine Learning (Canada)
November 27, 2018
Presentation of OpenMined at Brave (London)
September 14, 2018
Master's Degree Defense at Imperial College London on PySyft (London)

Research

Differential Privacy Guarantees for Stochastic Gradient Langevin Dynamics
arXiv 2201.11980
Théo Ryffel, Francis Bach, David Pointcheval
github.com/LaRiffle/langevin
AriaNN: Low-Interaction Privacy-Preserving Deep Learning via Function Secret Sharing
PETS 2022
Théo Ryffel, Pierre Tholoniat, David Pointcheval, Francis Bach
github.com/OpenMined/Pysyft github.com/OpenMined/sycret
End-to-end privacy preserving deep learning on multi-institutional medical imaging
Nature Machine Intelligence, May 2021
Georgios Kaissis, Alexander Ziller, Jonathan Passerat-Palmbach, Théo Ryffel, Dmitrii Usynin, Andrew Trask, Ionésio Lima Jr, Jason Mancuso, Friederike Jungmann, Marc-Matthias Steinborn, Andreas Saleh, Marcus Makowski, Daniel Rueckert & Rickmer Braren
Syft 0.5: A Platform for Universally Deployable Structured Transparency
ICLR DPML 2021
Adam James Hall, Madhava Jay, Tudor Cebere, Théo Ryffel, Andrew Trask et al.
github.com/OpenMined/PySyft
La nouvelle technologie de protection des données
La Recherche magazine, March 2021
Théo Ryffel
Privacy-preserving medical image analysis
Med-NeurIPS 2020
Alexander Ziller, Jonathan Passerat-Palmbach, Théo Ryffel, Dmitrii Usynin, Andrew Trask, Ionésio Da Lima Costa Junior, Jason Mancuso, Marcus Makowski, Daniel Rueckert, Rickmer Braren, Georgios Kaissis
github.com/gkaissis/PriMIA
Toward trustworthy AI development: mechanisms for supporting verifiable claims
Preprint 2020
Miles Brundage, Shahar Avin, Jasmine Wang et al.
Partially Encrypted Machine Learning using Functional Encryption
NeurIPS 2019
Théo Ryffel, Edouard Dufour-Sans, Romain Gay, Francis Bach, David Pointcheval
github.com/LaRiffle/collateral-learning
A Generic Framework for Privacy Preserving Deep Learning
NeurIPS 2018 Workshop on Privacy-Preserving Machine Learning
Théo Ryffel, Andrew Trask, Morten Dahl, Bobby Wagner, Jason Mancuso, Daniel Rueckert, Jonathan Passerat-Palmbach
github.com/OpenMined/PySyft

GitHub

Most of my current projects are on GitHub

See GitHub account

More info on my academic background and past experiences is available on LinkedIn

See LinkedIn account

You can also: Discover my Instagram | Buy vinyls