VideoWorld
Modeling, Interpreting, and Manipulating
Digital Video
VideoWorld
is a research project funded by the European
Research Council (ERC) and coordinated by Jean Ponce. It is located
within the joint Department of Computer
Science of Ecole Normale Superieure (UMR 8548, a joint
ENS/CNRS/INRIA laboratory)
in downtown Paris.
Summary: Digital video is
everywhere, at home, at work, and on the Internet. Yet, effective
technology for organizing, retrieving, improving, and editing its
content is nowhere to be found. Models for video content,
interpretation and manipulation inherited from still imagery are
obsolete, and new ones must be invented. With a new convergence between
computer vision, machine learning, and signal processing, the time is
right for such an endeavor. Concretely, we will develop novel
spatio-temporal models of video content learned from training data and
capturing both the local appearance and nonrigid motion of the elements
(persons and their surroundings) that make up a dynamic scene. We will
also develop formal models of the video interpretation process that
leave behind the architectures inherited from the world of still images
to capture the complex interactions between these elements, yet can be
learned effectively despite the sparse annotations typical of video
understanding scenarios. Finally, we will propose a unified model for
video restoration and editing that builds on recent advances in sparse
coding and dictionary learning, and will allow for unprecedented
control of the video stream. This project addresses fundamental
research issues, but its results are expected to serve as a basis for
groundbreaking technological advances for applications as varied as
film post-production, video archival, and smart camera phones.
Staff:
PI: Jean Ponce
Permanent staff: Ivan Laptev and Josef Sivic
PhD students: Louise Benoit, Florent Couzinie-Devy, Piotr Bojanowski, Rafael Sampaio de Rezende (Fall 2013)
Post-doctoral researchers: Minsu Cho, Jian Sun
Invited professors: Alyosha Efros, Rene Vidal
Alumni: Y-Lan Boureau (post-doc, NYU), Olivier Duchenne (Intel, Korea), Armand Joulin (post-doc, Stanford), Oliver Whyte (Microsoft)
Publications:
· Journal articles:
o J. Mairal, F. Bach, and J. Ponce. Task-Driven Dictionary Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4):791-804, 2012.
o O. Whyte, J. Sivic, A. Zisserman, and J. Ponce. Non-uniform Deblurring for Shaken Images. International Journal of Computer Vision, 98(2):168-186, 2012.
o O. Duchenne, F. Bach, I. Kweon, and J. Ponce. A tensor-based algorithm for high-order graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33:2383-2395, 2011.
· Articles published in the proceedings of peer-reviewed conferences:
o P. Bojanowski, F. Bach, I. Laptev, J. Ponce, C. Schmid, and J. Sivic. Finding Actors and Actions in Movies. Accepted for publication in International Conference on Computer Vision, 2013.
o M. Cho, K. Alahari, and J. Ponce. Learning Graphs to Match. Accepted for publication in
International Conference on Computer Vision, 2013.
o J. Sun and J. Ponce. Learning Discriminative Part Detectors for Image Classification and Cosegmentation. Accepted for publication in International Conference on Computer Vision, 2013.
o F. Couzinié-Devy, J. Sun, K. Alahari, and J. Ponce. Learning to Estimate and Remove Non-uniform Image Blur. In IEEE Conference on Computer Vision and Pattern Recognition, 2013.
o A. Joulin and S.B. Kang. Recovering Stereo Pairs from Anaglyphs. In IEEE Conference on Computer Vision and Pattern Recognition, 2013.
o M. Rubinstein, A. Joulin, J. Kopf. Unsupervised Joint Object Discovery and Segmentation in Internet Images. In IEEE Conference on Computer Vision and Pattern Recognition, 2013.
o V. Delaitre, D. Fouhey, I. Laptev, J. Sivic, A. Gupta, and A. Efros. Scene semantics from long-term observation of people. In European Conference on Computer Vision, 2012.
o D. Fouhey, V. Delaitre, A. Gupta, A. Efros, I. Laptev, and J. Sivic. People Watching: Human Actions as a Cue for Single-View Geometry. In European Conference on Computer Vision, 2012.
o A. Joulin and F. Bach. A convex relaxation for weakly supervised classifiers. In International Conference on Machine Learning, 2012.
o H. Azizpour and I. Laptev. Object Detection using Strongly-Supervised Deformable Part Models. In European Conference on Computer Vision, 2012.
o A. Joulin, F. Bach, and J. Ponce. Multi-Class Cosegmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2012.
o L. Benoît, J. Mairal, F. Bach, and J. Ponce. Sparse Image Representation with Epitomes. In IEEE Conference on Computer Vision and Pattern Recognition, 2011.
o Y.-L. Boureau, N. Le Roux, F. Bach, J. Ponce, and Y. LeCun. Ask the locals: Multi-way local pooling for image recognition. In International Conference on Computer Vision, 2011.
o O. Duchenne, A. Joulin, and J. Ponce. A Graph-Matching Kernel for Object Categorization. In International Conference on Computer Vision, 2011.

· Submission:
o F. Couzinie-Devy, J. Mairal, F. Bach, and J. Ponce. Dictionary Learning for Deblurring and Digital Zoom. Submitted to Journal of Mathematical Imaging and Vision, 2013.
· Defended PhD theses:
o O. Duchenne. Non-Rigid Image Alignment for Object Recognition. Ecole normale supérieure de Cachan, 2013.
o A. Joulin. Convex Optimization for Cosegmentation. Ecole normale supérieure de Cachan, 2013.
o Y-Lan Boureau. Learning Hierarchical Feature Extractors For Image Recognition. New York University, 2012.
o Oliver Whyte. Removing Camera Shake Blur and Unwanted Occluders from Photographs. Ecole normale supérieure de Cachan, 2012.