Summary
Statistical machine learning is a growing discipline at the intersection
of computer science and applied mathematics (probability and statistics,
optimization, etc.) that plays an increasingly important role in
technological innovation.
Unlike traditional statistics, statistical machine learning focuses
particularly on the analysis of high-dimensional data and on the efficiency
of the algorithms that process the large amounts of data encountered in
many application areas, such as image or sound analysis, natural
language processing, bioinformatics, or finance.
The objective of this class is to present the main theories and algorithms
in statistical machine learning. The methods covered will rely, among other
tools, on convex analysis arguments. The practical sessions (more than
half of which will be done on computers) will lead to simple implementations
of the algorithms seen in class, with applications to various domains
such as computer vision or natural language processing.
Prerequisites: probability theory (random variables, convergence
of random variables, conditional expectation) and coding skills in Python.
General information
This class is part of the Computer science courses taught at ENS in L3
in Spring 2018-2019.
Teachers: Pierre Gaillard and Alessandro Rudi.
Practical sessions: Raphaël Berthier.
The class lasts 52 hours (30 hours of lectures + 22 hours of practical
sessions) and is worth 9 ECTS credits.
Final grade: 50% final exam, 50% homework.
Previous years: Fall 2018, 2017, 2016, 2015, 2014, 2013, 2012
Schedule and lecture notes
Classes take place on Tuesday mornings from 8h30 to 12h30 in room UV.
A typical session consists of a lecture from 8h30 to 10h20, followed by
a 20-minute break and practical work (PW) from 10h40 to 12h30.
Bring your personal laptop to the practical sessions! Lecture notes and
solutions to practical work and exercises will be updated here on the fly.
| # | Date | Teacher | Title |
| --- | --- | --- | --- |
| 1 | 05/02/2019 | A. Rudi, P. Gaillard | Introduction; TD0 (Python test file) |
| 2 | 12/02/2019 | P. Gaillard, R. Berthier | Linear regression; TD1 (Data: classificationA_train, classificationA_test, classificationB_train, classificationB_test, classificationC_train, classificationC_test, mnist_digits.mat), solution |
| 3 | 19/02/2019 | A. Rudi, R. Berthier | Statistical properties in ML; solution |
| 4 | 26/02/2019 | A. Rudi, R. Berthier | KNN; TD2, solution |
| | 05/03/2019 | | Vacation |
| 5 | 12/03/2019 | P. Gaillard, R. Berthier | Logistic regression and convex analysis; TD3, solution |
| 6 | 19/03/2019 | A. Rudi, R. Berthier | Convex optimization (good slides from Aurélien Garivier, GD smooth and strongly convex, SGD); TD4, solution |
| | 26/03/2019 | | No class |
| 7 | 02/04/2019 | P. Gaillard, R. Berthier | High dimensional statistics; TD4, solution -- first assignment available |
| 8 | 09/04/2019 | P. Gaillard, R. Berthier | Model based machine learning: maximum likelihood; TD5, data, solution |
| 9 | 16/04/2019 | A. Rudi, R. Berthier | Kernels (good notes from Arthur Gretton, sections 1, 2, 6) |
| | 23/04/2019 | | Vacation |
| | 30/04/2019 | | Vacation |
| 10 | 07/05/2019 | P. Gaillard, R. Berthier | Unsupervised learning; TD7 -- first assignment due date and second assignment available |
| 11 | 14/05/2019 | A. Rudi, R. Berthier | Neural networks; TD8 - solution |
| 12 | 21/05/2019 | A. Rudi, R. Berthier | Summary; Last semester exam, solution -- second assignment due date |
| 13 | 28/05/2019 | P. Gaillard | Exam |