Loucas Pillaud-Vivien
Research interests
My main research interests are convex optimization, statistics and PDEs. More precisely, here is a selection of research topics I am interested in:
Publications
T. Lelievre, L. Pillaud-Vivien, J. Reygner. Central Limit Theorem for stationary Fleming-Viot particle systems in finite spaces. [arXiv:1806.04490, pdf], accepted in ALEA Latin American Journal of Probability and Mathematical Statistics, 2018.
Abstract: We consider the Fleming-Viot particle system associated with a continuous-time Markov chain in a finite space. Assuming irreducibility, it is known that the particle system possesses a unique stationary distribution, under which its empirical measure converges to the quasi-stationary distribution of the Markov chain. We complement this Law of Large Numbers with a Central Limit Theorem. Our proof essentially relies on elementary computations on the infinitesimal generator of the Fleming-Viot particle system, and involves the so-called π-return process in the expression of the asymptotic variance. Our work can be seen as an infinite-time version, in the setting of finite space Markov chains, of recent results by Cérou, Delyon, Guyader and Rousset [arXiv:1611.00515, arXiv:1709.06771].
L. Pillaud-Vivien, A. Rudi, F. Bach. Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes. [arXiv:1805.10074, pdf, poster], Advances in Neural Information Processing Systems (NIPS), 2018.
Abstract: We consider stochastic gradient descent (SGD) for least-squares regression with potentially several passes over the data. While several passes have been widely reported to perform practically better in terms of predictive performance on unseen data, the existing theoretical analysis of SGD suggests that a single pass is statistically optimal. While this is true for low-dimensional easy problems, we show that for hard problems, multiple passes lead to statistically optimal predictions while a single pass does not; we also show that in these hard models, the optimal number of passes over the data increases with sample size. In order to define the notion of hardness and show that our predictive performances are optimal, we consider potentially infinite-dimensional models and notions typically associated to kernel methods, namely, the decay of eigenvalues of the covariance matrix of the features and the complexity of the optimal predictor as measured through the covariance matrix. We illustrate our results on synthetic experiments with nonlinear kernel methods and on a classical benchmark with a linear model.
L. Pillaud-Vivien, A. Rudi, F. Bach. Exponential convergence of testing error for stochastic gradient methods. [arXiv:1712.04755, pdf, video, poster], Proceedings of the International Conference on Learning Theory (COLT), 2017.
Abstract: We consider binary classification problems with positive definite kernels and square loss, and study the convergence rates of stochastic gradient methods. We show that while the excess testing loss (squared loss) converges slowly to zero as the number of observations (and thus iterations) goes to infinity, the testing error (classification error) converges exponentially fast if low-noise conditions are assumed.
