Data Processing and Classification

The Data Processing and Classification team is part of the Département d'Informatique of École Normale Supérieure. It is a joint research team between École Normale Supérieure de Paris and the Centre National de la Recherche Scientifique (CNRS/ENS UMR 8548).

Our research is devoted to generic mathematical and algorithmic approaches for representing high-dimensional data for classification. A wide range of applications is studied: speech and music, images and videos, geophysical seismic data and medical data, but also physics, with quantum chemistry, fluid turbulence and cosmology. Revealing the information carried by these very different high-dimensional data relies on similar principles. They all suffer from the curse of dimensionality, which hides complex structures.

Our approaches are grounded in mathematics and algorithms emerging from harmonic analysis and geometry. They involve wavelet transforms, groups and manifolds, invariant and sparse representations, dictionary and kernel learning, deep neural networks and statistical learning. We also maintain close collaborations with teams working on the neurophysiology of perception, as well as with physicists working in statistical physics, quantum chemistry and cosmology.

Research Topics

Invariant Representations with Scattering

Data classes suffer from considerable variability, which needs to be reduced without removing discriminative information. We study the construction of stable, informative invariants over the groups that are mainly responsible for the data variability. This topic relies on scattering transforms, which iterate wavelet transforms along deep neural network architectures. Applications are developed for speech and music, images, medical signal analysis, and physics.
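The cascade described above can be illustrated with a minimal sketch. It is not the team's implementation: it assumes a one-dimensional signal, a Haar-like wavelet, circular convolution, a modulus non-linearity after each wavelet transform, and a global average as the final invariant. All function names (haar_filter, circular_conv, scattering) are hypothetical and chosen for this example only.

```python
def haar_filter(scale):
    """Haar-like wavelet at dyadic scale 2**scale (assumed filter, zero mean)."""
    half = 2 ** scale
    norm = 2.0 * half
    return [1.0 / norm] * half + [-1.0 / norm] * half


def circular_conv(x, h):
    """Circular convolution of the signal x with the filter h."""
    n = len(x)
    return [sum(h[k] * x[(i - k) % n] for k in range(len(h))) for i in range(n)]


def modulus(x):
    """Pointwise modulus, the non-linearity applied after each wavelet transform."""
    return [abs(v) for v in x]


def average(x):
    """Global average, which removes the dependence on circular shifts."""
    return sum(x) / len(x)


def scattering(x, num_scales):
    """Scattering coefficients up to order 2, indexed by the path of scales."""
    coeffs = {(): average(x)}  # order 0: plain average of the signal
    for j1 in range(num_scales):
        u1 = modulus(circular_conv(x, haar_filter(j1)))
        coeffs[(j1,)] = average(u1)  # order 1
        for j2 in range(j1 + 1, num_scales):
            # order 2: a second wavelet transform applied to the first modulus
            u2 = modulus(circular_conv(u1, haar_filter(j2)))
            coeffs[(j1, j2)] = average(u2)
    return coeffs


signal = [0.0, 1.0, 3.0, 2.0, -1.0, 0.5, 4.0, 1.5,
          2.5, -2.0, 0.0, 1.0, 3.5, 0.5, -1.5, 2.0]
shifted = signal[3:] + signal[:3]  # circular translation of the same signal

s_signal = scattering(signal, num_scales=3)
s_shifted = scattering(shifted, num_scales=3)
```

Under these assumptions, the coefficients of the signal and of its circularly shifted copy agree up to floating-point rounding, which is the translation invariance the paragraph refers to. Restricting second-order paths to j2 > j1 is a design choice of this sketch that keeps the number of coefficients small.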

Sparse Dictionary and Unsupervised Group Learning

Learning representations from unlabeled data amounts to learning important structures in an unknown data world, from examples. Sparsity and dictionary learning are important tools to search for structures that can be modeled as groups or manifolds. Understanding deep neural network structures
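As an illustration of the role of sparsity, the sketch below computes a sparse code of a signal in a fixed dictionary with iterative soft-thresholding (ISTA). It covers only the sparse coding step, not the learning of the dictionary itself, and it is not taken from the team's work: the objective 0.5 * ||x - sum_j z_j d_j||^2 + lam * ||z||_1, the step size, and the names soft_threshold and ista are assumptions made for this example.

```python
def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal operator of t * L1 norm)."""
    return [max(abs(a) - t, 0.0) * (1.0 if a > 0 else -1.0) for a in v]


def ista(atoms, x, lam, step, n_iter=200):
    """Sparse code z of x in the dictionary `atoms` (a list of atom vectors).

    Assumes step <= 1 / L, where L is the largest eigenvalue of the Gram
    matrix of the atoms; otherwise the iteration may not converge.
    """
    k, n = len(atoms), len(x)
    z = [0.0] * k
    for _ in range(n_iter):
        # current reconstruction and residual
        recon = [sum(z[j] * atoms[j][i] for j in range(k)) for i in range(n)]
        resid = [recon[i] - x[i] for i in range(n)]
        # gradient of the quadratic data-fidelity term with respect to z
        grad = [sum(atoms[j][i] * resid[i] for i in range(n)) for j in range(k)]
        # gradient step followed by soft-thresholding
        z = soft_threshold([z[j] - step * grad[j] for j in range(k)], step * lam)
    return z


# Toy orthonormal dictionary (an assumption that makes the result easy to check).
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x = [2.0, 0.0, 0.05]
code = ista(atoms, x, lam=0.1, step=1.0)
```

With an orthonormal dictionary, the result reduces to soft-thresholding the coordinates of x, so the small third entry is set to zero while the large first entry is only slightly shrunk. A full dictionary learning scheme would alternate a step of this kind with an update of the atoms.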

Data Geometry

Geometric data representations still play a minor role in classification because of instabilities. Simple approaches, including edge and boundary detection, appear not to be so effective in high dimensions. Finding informative and stable geometric signal representations is important not only for images or videos but also for audio signals and music, where rhythm and long-range structures are particular forms of geometry. Invariants and symmetry groups are important tools to tackle these issues.

Inverse Problems

Data classification often requires revealing or separating hidden information. Separating multiple information sources is necessary to analyze complex audio signals and images. We also study other non-linear inverse problems such as phase recovery, which appear in signal processing but also in physics problems. Harmonic analysis, sparsity and convex optimization are major pillars to attack these questions.