Introduction to Machine Learning (2018 - 2019)


MANDATORY REGISTRATION


Summary

Statistical machine learning is a growing discipline at the intersection of computer science and applied mathematics (probability, statistics, optimization, etc.) that plays an increasingly important role in technological innovation.

Unlike traditional statistics, statistical machine learning focuses particularly on the analysis of high-dimensional data and on the efficiency of the algorithms needed to process the large amounts of data encountered in application areas such as image or sound analysis, natural language processing, bioinformatics, and finance.

The objective of this class is to present the main theories and algorithms in statistical machine learning. The methods covered rely, among other things, on arguments from convex analysis. In the practical sessions (more than half of which will be carried out on computers), students will write simple implementations of the algorithms seen in class and apply them to domains such as computer vision and natural language processing.

Prerequisites: probability theory (random variables, convergence of random variables, conditional expectation) and coding skills in Python.



General information

This class is part of the computer science courses taught at ENS at the L3 level in Spring 2018-2019.

Teachers: Pierre Gaillard and Alessandro Rudi.
Practical sessions: Raphaël Berthier.

The class lasts 52 hours (30 hours of lectures + 22 hours of practical sessions) and counts for 9 ECTS credits.
Final grade: 50% final exam, 50% homework.

Previous years: Fall 2018, 2017, 2016, 2015, 2014, 2013, 2012


Schedule and lecture notes

Tuesday mornings from 8h30 to 12h30 in room UV. A typical session consists of a lecture from 8h30 to 10h20, followed by a 20-minute break and the practical work (PW) from 10h40 to 12h30. Bring your personal laptop to the practical sessions! Lecture notes and solutions to the practical work and exercises will be posted here as the course progresses.

# | Date | Teacher | Title
1 | 05/02/2019 | A. Rudi, P. Gaillard | Introduction; TD0 (Python test file)
2 | 12/02/2019 | P. Gaillard, R. Berthier | Linear regression; TD1 (Data: classificationA_train, classificationA_test, classificationB_train, classificationB_test, classificationC_train, classificationC_test, mnist_digits.mat), solution
3 | 19/02/2019 | A. Rudi, R. Berthier | Statistical properties in ML; solution
4 | 26/02/2019 | A. Rudi, R. Berthier | KNN; TD2, solution
  | 05/03/2019 | | Vacation
5 | 12/03/2019 | P. Gaillard, R. Berthier | Logistic regression and convex analysis; TD3, solution
6 | 19/03/2019 | A. Rudi, R. Berthier | Convex optimization (good slides from Aurélien Garivier, GD smooth and strongly convex, SGD); TD4, solution
  | 26/03/2019 | | No class
7 | 02/04/2019 | P. Gaillard, R. Berthier | High dimensional statistics; TD4, solution (first assignment available)
8 | 09/04/2019 | P. Gaillard, R. Berthier | Model based machine learning: maximum likelihood; TD5, data, solution
9 | 16/04/2019 | A. Rudi, R. Berthier | Kernels (good notes from Arthur Gretton, sections 1, 2, 6)
  | 23/04/2019 | | Vacation
  | 30/04/2019 | | Vacation
10 | 07/05/2019 | P. Gaillard, R. Berthier | Unsupervised learning; TD7 (first assignment due date and second assignment available)
11 | 14/05/2019 | A. Rudi, R. Berthier | Neural networks; TD8, solution
12 | 21/05/2019 | A. Rudi, R. Berthier | Summary; last semester exam, solution (second assignment due date)
13 | 28/05/2019 | P. Gaillard | Exam