Weakly Supervised Action Labeling in Videos Under Ordering Constraints

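People

Piotr Bojanowski, Rémi Lajugie, Francis Bach, Ivan Laptev, Jean Ponce, Cordelia Schmid, Josef Sivic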

Abstract

We are given a set of video clips, each one annotated with an ordered list of actions, such as “walk” then “sit” then “answer phone” extracted from, for example, the associated text script. We seek to temporally localize the individual actions in each clip as well as to learn a discriminative classifier for each action. We formulate the problem as a weakly supervised temporal assignment with ordering constraints. Each video clip is divided into small time intervals and each time interval of each video clip is assigned one action label, while respecting the order in which the action labels appear in the given annotations. We show that the action label assignment can be determined together with learning a classifier for each action in a discriminative manner. We evaluate the proposed model on a new and challenging dataset of 937 video clips with a total of 787720 frames containing sequences of 16 different actions from 69 Hollywood movies.
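To make the ordering constraint concrete, here is a minimal sketch of the assignment step alone. It assumes that a score for every (time interval, annotated action) pair is already available, and it finds the highest-scoring labeling that visits the annotated actions in order, using each at least once. It is not the method from the paper, which learns the classifiers jointly with the assignment. The function name `ordered_assignment` and the score values are hypothetical, chosen only for illustration.

```python
def ordered_assignment(scores):
    """Best order-preserving labeling of T intervals with K annotated actions.

    scores[t][k] is an assumed, precomputed score for giving interval t the
    k-th action of the clip's annotation sequence. Returns (labels, total),
    where labels[t] is an index into the annotation sequence, labels is
    non-decreasing, starts at 0, ends at K - 1, and skips no action.
    """
    T, K = len(scores), len(scores[0])
    if T < K:
        raise ValueError("need at least one interval per annotated action")

    neg_inf = float("-inf")
    best = [[neg_inf] * K for _ in range(T)]  # best[t][k]: best total ending at (t, k)
    back = [[0] * K for _ in range(T)]        # back[t][k]: label used at interval t - 1
    best[0][0] = scores[0][0]                 # the first interval must take the first action

    for t in range(1, T):
        for k in range(min(t, K - 1) + 1):
            stay = best[t - 1][k]                           # keep the same action
            advance = best[t - 1][k - 1] if k > 0 else neg_inf  # move to the next action
            if advance > stay:
                best[t][k] = advance + scores[t][k]
                back[t][k] = k - 1
            else:
                best[t][k] = stay + scores[t][k]
                back[t][k] = k

    # Backtrack from the last interval, which must take the last action.
    labels = [K - 1]
    k = K - 1
    for t in range(T - 1, 0, -1):
        k = back[t][k]
        labels.append(k)
    labels.reverse()
    return labels, best[T - 1][K - 1]


# Hypothetical clip annotated "walk", "sit", "answer phone", split into 5 intervals.
annotation = ["walk", "sit", "answer phone"]
example_scores = [
    [2, 0, 0],
    [1, 0, 0],
    [0, 3, 0],
    [0, 1, 2],
    [0, 0, 1],
]
labels, total = ordered_assignment(example_scores)
print([annotation[k] for k in labels], total)
```

The dynamic program runs in O(TK) time per clip. In a weakly supervised setting the scores would come from the current classifiers rather than being fixed, so a step like this would be repeated as the classifiers are updated.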

Paper

[ECCV 2014 Paper] [Technical Report on arXiv] [Poster] [Supmat Video]

BibTeX

@InProceedings{Bojanowski14weakly,
    author      = "Bojanowski, Piotr and Lajugie, R\'emi and Bach, Francis and Laptev, Ivan and Ponce, Jean and Schmid, Cordelia and Sivic, Josef",
    title       = "Weakly Supervised Action Labeling in Videos Under Ordering Constraints",
    booktitle   = "Proc. ECCV",
    year        = "2014"
}

Weakly Supervised Learning Code

GitHub project page: action-ordering
Packaged Code
dataset.mat in tar.gz format (252MB)

Full Dataset

The full dataset with the 937 clips is available here. The full_dataset.mat file contains the precomputed features, ground truth labels and annotation sequences. A MATLAB structure hw3 contains the clip identifiers corresponding to the provided avi files.

full_dataset.mat in tar.gz format (189MB)
Video clips in avi format (7670MB) (Warning: large file!)

Acknowledgements

This work was supported by the European integrated project AXES, the MSR-INRIA laboratory, EIT-ICT labs, a Google Research Award, a PhD fellowship from the EADS Foundation, the Institut Universitaire de France and ERC grants ALLEGRO, VideoWorld, Activia and Sierra.

Copyright Notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright.