This web page hosts supporting material for the book in preparation.
See draft available here.
Generic helper functions
Chapter 1: Mathematical preliminaries
Figure 1.1 (expectation of maximum of Gaussian random variables)
Chapter 2: Introduction to supervised learning
Figure 2.1 (polynomial regression with increasing orders - predictions)
Figure 2.2 (polynomial regression with increasing orders - errors)
Chapter 3: Linear least-squares regression
Figure 3.1 (polynomial regression with varying number of observations)
Figure 3.2 (convergence rate for polynomial regression)
Figure 3.3 (polynomial ridge regression)
Chapter 4: Empirical risk minimization
Figure 4.1 (convex surrogates)
Figure 4.2 (optimal score functions for Gaussian class-conditional densities)
Chapter 5: Optimization
Figure 5.1 (gradient descent on two least-squares problems)
Figure 5.2 (comparison of step-sizes for SGD for the support vector machine)
Figure 5.3 (comparison of step-sizes for SGD for logistic regression)
Chapter 6: Local averaging methods
Figure 6.2 (regressogram in one dimension)
Figure 6.3 (k-nearest neighbor in one dimension)
Figure 6.4 (Nadaraya-Watson in one dimension)
Figure 6.5 (learning curves for local averaging)
Figure 6.6 (locally linear partitioning estimate)
Chapter 7: Kernel methods
Figure 7.2 (minimum norm interpolator)
Figure 7.3 (comparison of kernels)
Chapter 8: Sparse methods
Figure 8.1 (regularization path)
Figure 8.2 (comparison of estimators) + script_model_selection.m + script_model_selectionROT.m
Chapter 9: Neural networks
Figure 9.1 (global convergence for different numbers of neurons) + launch_training_relu_nn.m
Figure 9.2 (random features - kernels)
Figure 9.3 (neural networks fitting)
Chapter 10: Ensemble learning
Figure 10.1 (bagged 1-nn estimation)
Figure 10.2 (Gaussian random projections)
Figure 10.3 (Boosting)
Chapter 11: Overparameterized models
Figure 11.1 (logistic regression on separable data)
Figures 11.2 and 11.3 (double descent curves, random non-linear features)
Figure 11.4 (double descent, random linear projections)
Chapter 12: Lower bounds
Chapter 13: Online learning and bandits
Figure 13.1 (zeroth-order optimization)
Figure 13.2 (UCB algorithm)
Chapter 14: Probabilistic methods
Figure 14.3 (MMSE vs. MAP)
Figure 14.4 (Discriminative vs. generative learning)
Chapter 15: Structured prediction
Figure 15.1 (robust regression)
To be completed
Copyright in this Work has been licensed exclusively to The MIT Press,
http://mitpress.mit.edu, which will be releasing the final version to the
public in 2023. All inquiries regarding rights should be addressed to The MIT
Press, Rights and Permissions Department.