We address the learning, recognition and localization of human actions in realistic videos such as movies, TV news and home recordings. We focus on atomic actions such as "drinking", "smoking" and "hand shaking", and demonstrate action detection in challenging realistic scenarios where actions vary substantially in subject appearance, motion, surrounding scene, viewing angle and spatio-temporal extent.