Summary
Statistical machine learning is a growing discipline at the intersection
of computer science and applied mathematics (probability / statistics,
optimization, etc.) that plays an increasingly important role in
technological innovation.
Unlike traditional statistics, statistical machine learning focuses
particularly on the analysis of high-dimensional data, as well as on the
efficiency of the algorithms that process the large amounts of data
encountered in application areas such as image or sound analysis,
natural language processing, bioinformatics, or finance.
The objective of this class is to present the main theories and algorithms
in statistical machine learning. The methods covered will rely, among
others, on convex analysis arguments. The practical sessions (more than
half of which will be carried out on computers) will lead to simple
implementations of the algorithms seen in class, with applications to
various domains such as computer vision or natural language processing.
Prerequisites: probability theory (random variables, convergence of
random variables, conditional expectation) and coding skills in Python.
General information
This class is part of the Computer science courses taught at ENS in L3 in
Spring 2020-2021.
Teachers: Alessandro Rudi and Francis Bach.
Practical sessions: Raphaël Berthier.
The class will last 52 hours (30 hours of class + 22 hours of practical
sessions) and can be validated for 12 ECTS.
Final grade: approximately 50% final exam, 50% homework.
Previous years: Spring 2020, Spring 2019, Fall 2018, 2017, 2016, 2015,
2014, 2013, 2012
Schedule and lecture notes
Thursday mornings from 8h30 to 12h15, online on Zoom. A typical
session will be a lecture from 8h30 to 10h20, followed by a 20-minute
break and the practical work (PW) from 10h40 to 12h15.
Lecture notes and solutions to practical work and exercises will be
updated here on the fly.
# | Date | Teacher | Title
1 | 04/02/2021 | F. Bach | Introduction
2 | 11/02/2021 | F. Bach, R. Berthier | Supervised learning and linear regression; TD1 (Data: classificationA_train, classificationA_test, classificationB_train, classificationB_test, classificationC_train, classificationC_test, mnist_digits.mat, solution, NEW: all in one zip: ALL)
3 | 18/02/2021 | F. Bach, R. Berthier | Unsupervised Learning
  | 25/02/2021 | | No Class
4 | 04/03/2021 | A. Rudi, R. Berthier | Logistic regression and convex analysis; TD3, solution
5 | 11/03/2021 | A. Rudi, R. Berthier | Convex optimization; TD4, TD4englishversion, solution to theoretical questions, solution to practical questions
6 | 18/03/2021 | F. Bach, R. Berthier | High dimensional statistics (Lasso); Practical session on SGD: TD5, data, solution
7 | 25/03/2021 | F. Bach, R. Berthier | Model based machine learning: maximum likelihood; Practical session on kNN: lecture notes (see Section 5), data, TP, solution (in French)
8 | 01/04/2021 | A. Rudi, R. Berthier | Kernels; Exercise sheet, solution
9 | 08/04/2021 | A. Rudi, R. Berthier | Elements of Statistical Machine Learning; Numerical tour of Ridge and Lasso by Gabriel Peyre
10 | 15/04/2021 | A. Rudi, R. Berthier | Local methods; Numerical tour of logistic classification by Gabriel Peyre
  | 22/04/2021 | | No Class
  | 29/04/2021 | | No Class
11 | 06/05/2021 | A. Rudi, R. Berthier | Neural networks; TP Neural Nets, solution
  | 13/05/2021 | | No Class
12 | 20/05/2021 | F. Bach | Summary
13 | 27/05/2021 | A. Rudi | Exam