The class will be taught in French or English, depending on attendance (all slides and class notes are in English).
Summary
Given the health situation, all classes will be held online, following the tentative plan below. Detailed class notes will be made available two days before each class, and connection details will be sent to registered participants the night before (I will use GoToMeeting). Each student is expected to read the class notes before the class; during class, I will go over them, provide additional details, and answer questions. Classes will be held on Fridays between 8.30am and 11.30am.
Date | Topics | Class notes
September 18 | Learning with infinite data (population setting): decision theory (loss, risk, optimal predictors); decomposition of the excess risk into approximation and estimation errors; no-free-lunch theorems; basic concentration inequalities (McDiarmid, Hoeffding, Bernstein) | lecture1.pdf
September 25 | Linear least-squares regression: guarantees in the fixed design setting (simple closed-form analysis); ridge regression: dimension-independent bounds; guarantees in the random design setting; lower bounds on performance | lecture2.pdf
October 2 | Empirical risk minimization: convexification of the risk; risk decomposition; estimation error with a finite number of hypotheses and covering numbers; Rademacher complexity; penalized problems | lecture3.pdf
October 16 | Optimization for machine learning: gradient descent; stochastic gradient descent; generalization bounds through stochastic gradient descent | lecture4.pdf
October 23 | Local averaging techniques: partition estimators; Nadaraya-Watson estimators; k-nearest neighbors; universal consistency | lecture5.pdf
October 30 | Kernel methods: kernels and representer theorems; algorithms; analysis of well-specified models; sharp analysis of ridge regression; universal consistency | lecture6.pdf
November 6 | Model selection: L0 penalty; L1 penalty; high-dimensional estimation | lecture7.pdf
November 13 | Neural networks: single hidden layer neural networks; estimation error; approximation properties and universality | lecture8.pdf
November 20 | Special topics: generalization and optimization properties of infinitely wide neural networks; double descent | lecture9.pdf
December 4 | EXAM |
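As a small taste of the material, here is a minimal sketch (my own illustration, not taken from the class notes) of the closed-form ridge regression estimator discussed in the September 25 session; the dimensions, noise level, and regularization value are assumptions chosen for illustration:

```python
import numpy as np

# Illustrative sketch: ridge regression in closed form on synthetic data.
# All problem sizes and parameter values here are assumptions.
rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
theta_star = rng.standard_normal(d)
y = X @ theta_star + 0.1 * rng.standard_normal(n)

lam = 0.1  # ridge regularization parameter
# Closed-form ridge estimator: theta_hat = (X^T X + n * lam * I)^{-1} X^T y
theta_hat = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)

error = float(np.linalg.norm(theta_hat - theta_star))
print(error)  # small: the estimator recovers theta_star up to shrinkage and noise
```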
Evaluation
One written in-class exam, and (very) simple coding assignments to illustrate convergence results, to be sent to learning.theory.first.principles@gmail.com. For all classes, the coding assignment is to reproduce the experiments shown in the lecture notes and send only the figures to the address above.
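To give an idea of the kind of experiment involved, here is a hedged sketch of a convergence illustration: stochastic gradient descent on a synthetic least-squares problem, tracking the distance to the optimum. The problem sizes, step size, and noise level are my own assumptions, not the course's setup; the actual assignments follow the experiments in the lecture notes.

```python
import numpy as np

# Sketch of a convergence experiment: SGD on synthetic least-squares,
# recording the squared distance to theta_star at each iteration.
rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.standard_normal((n, d))
theta_star = rng.standard_normal(d)
y = X @ theta_star + 0.1 * rng.standard_normal(n)

theta = np.zeros(d)
step = 0.05  # small constant step size (the notes may instead use decaying steps)
errors = [float(np.dot(theta - theta_star, theta - theta_star))]
for t in range(n):
    i = rng.integers(n)                  # sample one observation uniformly
    grad = (X[i] @ theta - y[i]) * X[i]  # stochastic gradient of the squared loss
    theta = theta - step * grad
    errors.append(float(np.dot(theta - theta_star, theta - theta_star)))

initial_err, final_err = errors[0], errors[-1]
print(initial_err, final_err)  # the distance to theta_star shrinks over iterations
```

For the actual assignments, one would plot `errors` against the iteration count (typically on a log scale) and send the resulting figure.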