Reconnaissance d’objets et vision artificielle 2013/2014
Object recognition and computer vision 2013/2014


Jean Ponce, Ivan Laptev, Cordelia Schmid and Josef Sivic


Course Information

Room: ENS Ulm, Salle UV, aile Rataud, 45 rue d'Ulm

Class time: Tuesday 16:15-19:15

News:

List of received reports:

https://docs.google.com/spreadsheet/ccc?key=0Aso5oi2c4UB5dDJuRlJZYUZ1X19VdERzX0hpcEpNc2c#gid=0

Course description

Automated object recognition -- and more generally scene analysis -- from photographs and videos is the grand challenge of computer vision. This course presents the image, object, and scene models, as well as the methods and algorithms, used today to address this challenge.

Assignments

There will be three programming assignments representing 50% (10% + 20% + 20%) of the grade. The supporting materials for the programming assignments and final projects will be in Matlab.

Final project

The final project will represent 50% of the grade. Suggested topics for final projects will be added here.
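As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes that all components are marked on the same scale and that the 10% / 20% / 20% split applies to assignments 1, 2 and 3 in that order; the page does not state either point. The function name, the dictionary keys and the example scores are hypothetical.

```python
# Weights taken from the course page. Mapping 10%/20%/20% to
# assignments 1/2/3 in that order is an assumption.
WEIGHTS = {
    "assignment1": 0.10,
    "assignment2": 0.20,
    "assignment3": 0.20,
    "project": 0.50,
}


def final_grade(scores):
    """Return the weighted average of component scores on a common scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, all on the same scale.
example = {"assignment1": 16, "assignment2": 14, "assignment3": 18, "project": 15}
print(final_grade(example))
```

With these weights the final project counts as much as the three assignments together.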

Computer vision and machine learning talks

You are welcome to attend seminars in the Willow group. Please see the current seminar schedule. Typically, these are one-hour research talks given by visiting speakers. The talks are held at 23 avenue d'Italie. Ring the bell to get into the building, then take the elevator to the 5th floor.

Course schedule (subject to change):

Lecture

Date

Topic and reading materials.

Slides

1

Oct 1

Introduction (J. Ponce);

Instance-level recognition I. - Camera geometry (J. Ponce)

Class logistics, assignments, final projects (I. Laptev and J. Sivic)

Background materials: History: J. Mundy, Object recognition in the geometric era: A retrospective; Camera geometry: Forsyth & Ponce, Ch. 1-2; Hartley & Zisserman, Ch. 6.

PDF1
PDF2

2

Oct 8

Instance-level recognition II. - Local invariant features (C. Schmid)

Materials: Mikolajczyk & Schmid, Scale and affine invariant interest point detectors, IJCV 2004; D. Lowe, Distinctive image features from scale-invariant keypoints, IJCV 2004; R. Szeliski (pdf), Sections 4.1, 4.1.1 and 4.1.2 from Chapter 4: Feature detection and matching.

Assignments:
Assignment 1 out

PDF1
PDF2

3

Oct 15

Instance-level recognition III. - Correspondence, efficient visual search (J. Sivic)

Materials: R. Szeliski (pdf), Sections 4.1.3 (feature matching) and 6.1 (feature-based alignment); Muja & Lowe, Fast approx. nearest neighbors with automatic algorithm configuration, VISAPP'09; Sivic & Zisserman, Video Google: Efficient visual search of videos (chapter from this book); Philbin et al., Object retrieval with large vocabularies and fast spatial matching, CVPR'07.

PDF1

PDF2

4

Oct 22

Instance-level recognition IV. - Very large scale image indexing (C. Schmid)

Materials: Jegou et al., Improving bag-of-features for large scale image search, IJCV 2010; Jegou et al., Aggregating local image descriptors into compact codes, PAMI 2011.

Bag-of-features models for category-level recognition (C. Schmid)

Materials: Csurka et al., Visual categorization with bags of keypoints, 2004

Assignments:
Assignment 1 due

PDF1

PDF2

5

Oct 29

Sparse coding and dictionary learning for image analysis (J. Ponce)

Materials: Bach, Mairal, Ponce, Sapiro, Tutorial on sparse coding and dictionary learning for image analysis, at CVPR'10.

Category-level localization I. (J. Sivic)

Materials: Fergus et al., A Sparse Object Category Model for Efficient Learning and Complete Recognition (constellation model) (chapter from this book); Leibe et al., An Implicit Shape Model for Combined Object Categorization and Segmentation (chapter from this book); Dalal & Triggs, Histograms of oriented gradients (HOG) for human detection, CVPR'05.

Assignments:

Assignment 2 out 

Topic suggestions for the final project are out

PDF1

PDF2

6

Nov 5

Category-level localization II. - Efficient fitting of pictorial structures; Human pose estimation (I. Laptev)

Materials: Felzenszwalb et al., A Discriminatively Trained, Multiscale, Deformable Part Model, CVPR’08; Pascal VOC Challenge; Cinbis et al., Segmentation Driven Object Detection with Fisher Vectors, ICCV’13; Yang and Ramanan, Articulated Human Detection with Flexible Mixtures of Parts, PAMI’13.

Assignments:

Assignment 2 due

Assignment 3 out

PDF1

PDF2

7

Nov 12

Motion and human actions I. (I. Laptev)

Materials: Laptev et al., Learning realistic human actions from movies, CVPR’08; Wang et al., Dense trajectories and motion boundary descriptors for action recognition, CVPR’11.

Assignments:

Final project proposal due (Nov 12)

PDF1

PDF2

8

Nov 19

Motion and human actions II.

Face detection and recognition. (C. Schmid)

Materials:

P. Viola, M. Jones: Robust Real-Time Face Detection. International Journal of Computer Vision 57(2): 137-154 (2004)

M. Guillaumin, T. Mensink, J. Verbeek, C. Schmid: Face recognition from caption-based supervision. International Journal of Computer Vision 96(1): 64-82 (2012)

Assignments:
Assignment 3 due (Nov 19)

PDF1

PDF2

9

Nov 26

Scenes and Objects (I. Laptev)

Materials: A. Oliva and A. Torralba: Modeling the shape of the scene: A holistic representation of the spatial envelope, IJCV 2001; J. Xiao et al.: SUN database: Large-scale scene recognition from abbey to zoo, CVPR 2010; D. Hoiem et al.: Putting Objects in Perspective, CVPR 2006; C. Desai et al.: Discriminative models for multi-class object layout, CVPR 2009; N. Kumar et al.: Attribute and simile classifiers for face verification, ICCV 2009.

PDF1

PDF2

10

Dec 3

Neural networks; Optimization methods (N. Le Roux)

Materials:

PDF

11

Dec 10

No lecture

 

12

Dec 17

Dec 18

Final project presentations and evaluation (I. Laptev, J. Sivic)

Presentation schedule.

Tue Dec 17 presentations are at the standard class location and time (16:15-19:15 ENS Ulm).

Wed Dec 18 presentations are at INRIA, 23 Av. d’Italie, 75013. See the link above for directions and exact time schedule.

Relevant literature:

[1]

D.A. Forsyth and J. Ponce, "Computer Vision: A Modern Approach", Prentice-Hall, 2nd edition, 2011.

[2]

J. Ponce, M. Hebert, C. Schmid and A. Zisserman, "Toward Category-Level Object Recognition", Lecture Notes in Computer Science 4170, Springer-Verlag, 2007.

[3]

O. Faugeras, Q.T. Luong, and T. Papadopoulo, "Geometry of Multiple Images", MIT Press, 2001.

[4]

R. Hartley and A. Zisserman, "Multiple View Geometry in Computer Vision", Cambridge University Press, 2004.

[5]

J. Koenderink, "Solid Shape", MIT Press, 1990.

[6]

R. Szeliski, "Computer Vision: Algorithms and Applications", 2009. This is a draft of a new book, which can be downloaded online.