ERC VideoWorld

Modeling, Interpreting, and Manipulating Digital Video





VideoWorld is a research project funded by the European Research Council (ERC) and coordinated by Jean Ponce. It is located within the Department of Computer Science of École Normale Supérieure (UMR 8548, a joint ENS/CNRS/INRIA laboratory) in downtown Paris.

Summary: Digital video is everywhere, at home, at work, and on the Internet. Yet, effective technology for organizing, retrieving, improving, and editing its content is nowhere to be found. Models for video content, interpretation and manipulation inherited from still imagery are obsolete, and new ones must be invented. With a new convergence between computer vision, machine learning, and signal processing, the time is right for such an endeavor. Concretely, we will develop novel spatio-temporal models of video content learned from training data and capturing both the local appearance and nonrigid motion of the elements (persons and their surroundings) that make up a dynamic scene. We will also develop formal models of the video interpretation process that leave behind the architectures inherited from the world of still images to capture the complex interactions between these elements, yet can be learned effectively despite the sparse annotations typical of video understanding scenarios. Finally, we will propose a unified model for video restoration and editing that builds on recent advances in sparse coding and dictionary learning, and will allow for unprecedented control of the video stream. This project addresses fundamental research issues, but its results are expected to serve as a basis for groundbreaking technological advances for applications as varied as film post-production, video archival, and smart camera phones.



Staff:
 


PI: Jean Ponce
Permanent staff:
Ivan Laptev and Josef Sivic
PhD students:
Louise Benoît, Florent Couzinie-Devy, Olivier Duchenne, Piotr Bojanowski (Fall 2012)
Post-doctoral researchers:
Minsu Cho, Jian Sun (Fall 2012)

Invited professors: Rene Vidal




Publications:

 

·         Journal articles:

o   J. Mairal, F. Bach, and J. Ponce. Task-Driven Dictionary Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4):791-804, 2012.  

o   O. Whyte, J. Sivic, A. Zisserman, and J. Ponce. Non-uniform Deblurring for Shaken Images. International Journal of Computer Vision, 98(2):168-186, 2012.

o   O. Duchenne, F. Bach, I. Kweon, and J. Ponce. A Tensor-Based Algorithm for High-Order Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33:2383-2395, 2011.

·         Articles published in the proceedings of peer-reviewed conferences:

o   H. Azizpour and I. Laptev. Object Detection using Strongly-Supervised Deformable Part Models. Accepted for publication, European Conference on Computer Vision, 2012.

o   A. Joulin, F. Bach, and J. Ponce. Multi-Class Cosegmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2012.

o   L. Benoît, J. Mairal, F. Bach, and J. Ponce. Sparse Image Representation with Epitomes. In IEEE Conference on Computer Vision and Pattern Recognition, 2011.  

o   Y.-L. Boureau, N. Le Roux, F. Bach, J. Ponce, and Y. LeCun. Ask the locals: Multi-way local pooling for image recognition. In International Conference on Computer Vision, 2011.

o   O. Duchenne, A. Joulin, and J. Ponce. A Graph-Matching Kernel for Object Categorization. In International Conference on Computer Vision, 2011.

·         Submitted articles:

o   F. Couzinie-Devy, J. Mairal, F. Bach, and J. Ponce. Dictionary Learning for Deblurring and Digital Zoom. Submitted to Journal of Mathematical Imaging and Vision, 2012.

·         Defended PhD theses:

o   Y-Lan Boureau, Learning Hierarchical Feature Extractors For Image Recognition, New York University, 2012.

o   Oliver Whyte, Removing Camera Shake Blur and Unwanted Occluders from Photographs, École normale supérieure de Cachan, 2012.