Publications


Publications of Ivan Laptev

Articles in journals or book chapters

  • A. Yang, A. Miech, J. Sivic, I. Laptev, and C. Schmid. Learning to Answer Visual Questions from Web Videos. TPAMI, 2022.


  • Q. Le Lidec, I. Kalevatykh, I. Laptev, C. Schmid, and J. Carpentier. Differentiable simulation for physical system identification. IEEE Robotics and Automation Letters, 2021.


  • G. Varol, I. Laptev, C. Schmid, and A. Zisserman. Synthetic Humans for Action Recognition from Unseen Viewpoints. International Journal of Computer Vision, 2021.


  • Y. Labbé, S. Zagoruyko, I. Kalevatykh, I. Laptev, J. Carpentier, M. Aubry, and J. Sivic. Monte-Carlo Tree Search for Efficient Visually Guided Rearrangement Planning. IEEE Robotics and Automation Letters, 2020.


  • Robin Strudel, Ricardo Garcia, Justin Carpentier, Jean-Paul Laumond, Ivan Laptev, and Cordelia Schmid. Learning Obstacle Representations for Neural Motion Planning. arXiv:2008.11174, 2020.


  • J.-B. Alayrac, P. Bojanowski, N. Agrawal, I. Laptev, J. Sivic, and S. Lacoste-Julien. Learning from Narrated Instruction Videos. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(9):2194-2208, 2018.


  • G. Varol, I. Laptev, and C. Schmid. Long-term Temporal Convolutions for Action Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6):1510-1517, 2018.


  • H. Idrees, A. R. Zamir, Y.-G. Jiang, A. Gorban, I. Laptev, R. Sukthankar, and M. Shah. The THUMOS challenge on action recognition for videos "in the wild". Computer Vision and Image Understanding, 155:1-23, 2017.


  • G. Seguin, K. Alahari, J. Sivic, and I. Laptev. Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(8):1643-1655, 2015.


  • M. Rodriguez, J. Sivic, and I. Laptev. Analysis of Crowded Scenes in Video. In Intelligent Video Surveillance Systems, pages 251-272. Wiley, 2012.


  • I. Junejo, E. Dexter, I. Laptev, and P. Pérez. View-Independent Action Recognition from Temporal Self-Similarities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(1):172-185, 2011.


Conference articles

  • Elliot Chane-Sane, Cordelia Schmid, and Ivan Laptev. Learning Video-Conditioned Policies for Unseen Manipulation Tasks. In ICRA, 2023.


  • Antoine Yang, Arsha Nagrani, Ivan Laptev, Josef Sivic, and Cordelia Schmid. VidChapters-7M: Video Chapters at Scale. In NeurIPS, 2023.


  • Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, Antoine Miech, Jordi Pont-Tuset, Ivan Laptev, Josef Sivic, and Cordelia Schmid. Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning. In CVPR, 2023.


  • S. Chen, P.-L. Guhur, M. Tapaswi, C. Schmid, and I. Laptev. Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation. In IEEE Conference on Computer Vision and Pattern Recognition, 2022.


  • T. Soucek, J.-B. Alayrac, A. Miech, I. Laptev, and J. Sivic. Look for the Change: Learning Object States and State-Modifying Actions From Untrimmed Web Videos. In IEEE Conference on Computer Vision and Pattern Recognition, 2022.


  • A. Yang, A. Miech, J. Sivic, I. Laptev, and C. Schmid. TubeDETR: Spatio-Temporal Video Grounding with Transformers. In IEEE Conference on Computer Vision and Pattern Recognition, 2022.


  • A. Yang, A. Miech, J. Sivic, I. Laptev, and C. Schmid. Zero-Shot Video Question Answering via Frozen Bidirectional Language Models. In NeurIPS, 2022.


  • E. Chane-Sane, C. Schmid, and I. Laptev. Goal-Conditioned Reinforcement Learning with Imagined Subgoals. In ICML, 2021.


  • S. Chen, P.-L. Guhur, C. Schmid, and I. Laptev. History Aware Multimodal Transformer for Vision-and-Language Navigation. In NeurIPS, 2021.


  • Q. Le Lidec, I. Laptev, C. Schmid, and J. Carpentier. Differentiable rendering with perturbed optimizers. In NeurIPS, 2021.


  • A. Yang, A. Miech, J. Sivic, I. Laptev, and C. Schmid. Just Ask: Learning to Answer Questions from Millions of Narrated Videos. In International Conference on Computer Vision, 2021.


  • Hazel Doughty, Ivan Laptev, Walterio Mayol-Cuevas, and Dima Damen. Action Modifiers: Learning from Adverbs in Instructional Videos. In IEEE Conference on Computer Vision and Pattern Recognition, 2020.


  • Yana Hasson, Bugra Tekin, Federica Bogo, Ivan Laptev, Marc Pollefeys, and Cordelia Schmid. Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction. In IEEE Conference on Computer Vision and Pattern Recognition, 2020.


  • Anna Kukleva, Makarand Tapaswi, and Ivan Laptev. Learning Interactions and Relationships between Movie Characters. In IEEE Conference on Computer Vision and Pattern Recognition, 2020.


  • Antoine Miech, Jean-Baptiste Alayrac, Lucas Smaira, Ivan Laptev, Josef Sivic, and Andrew Zisserman. End-to-end learning of visual representations from uncurated instructional videos. In IEEE Conference on Computer Vision and Pattern Recognition, 2020.


  • Alexander Pashevich, Igor Kalevatykh, Ivan Laptev, and Cordelia Schmid. Learning visual policies for building 3D shape categories. In International Conference on Intelligent Robots and Systems, 2020.


  • Robin Strudel, Alexander Pashevich, Igor Kalevatykh, Ivan Laptev, Josef Sivic, and Cordelia Schmid. Learning to combine primitive skills: A step towards versatile robotic manipulation. In International Conference on Robotics and Automation, 2020.


  • D. Zhukov, J.-B. Alayrac, I. Laptev, and J. Sivic. Learning Actionness via Long-range Temporal Order Verification. In European Conference on Computer Vision, 2020.


  • Y. Hasson, G. Varol, D. Tzionas, I. Kalevatykh, M. J. Black, I. Laptev, and C. Schmid. Learning joint reconstruction of hands and manipulated objects. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.


  • Z. Li, J. Sedlar, J. Carpentier, I. Laptev, N. Mansard, and J. Sivic. Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.


  • Antoine Miech, Dimitri Zhukov, Jean-Baptiste Alayrac, Makarand Tapaswi, Ivan Laptev, and Josef Sivic. HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips. In International Conference on Computer Vision, 2019.


  • Alexander Pashevich, Robin Strudel, Igor Kalevatykh, Ivan Laptev, and Cordelia Schmid. Learning to Augment Synthetic Images for Sim2Real Policy Transfer. In International Conference on Intelligent Robots and Systems, 2019.


  • D. Zhukov, J.-B. Alayrac, R.G. Cinbis, D. Fouhey, I. Laptev, and J. Sivic. Cross-task weakly supervised learning from instructional videos. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.


  • G. Chéron, J.-B. Alayrac, I. Laptev, and C. Schmid. A Flexible Model for Training Action Localization with Varying Levels of Supervision. In Advances in Neural Information Processing Systems, 2018.


  • G. Varol, D. Ceylan, B. Russell, J. Yang, E. Yumer, I. Laptev, and C. Schmid. BodyNet: Volumetric Inference of 3D Human Body Shapes. In European Conference on Computer Vision, 2018.


  • J.-B. Alayrac, J. Sivic, I. Laptev, and S. Lacoste-Julien. Joint Discovery of Object States and Manipulation Actions. In International Conference on Computer Vision, 2017.


  • A. Miech, J.-B. Alayrac, P. Bojanowski, I. Laptev, and J. Sivic. Learning from Video and Text via Large-Scale Discriminative Clustering. In International Conference on Computer Vision, 2017.


  • J. Peyre, I. Laptev, C. Schmid, and J. Sivic. Weakly-supervised learning of visual relations. In International Conference on Computer Vision, 2017.


  • G. Varol, J. Romero, X. Martin, N. Mahmood, M. J. Black, I. Laptev, and C. Schmid. Learning from Synthetic Humans. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.


  • J.-B. Alayrac, P. Bojanowski, N. Agrawal, I. Laptev, J. Sivic, and S. Lacoste-Julien. Unsupervised Learning from Narrated Instruction Videos. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.


  • V. Kantorov, M. Oquab, M. Cho, and I. Laptev. ContextLocNet: Context-aware Deep Network Models for Weakly Supervised Localization. In European Conference on Computer Vision, 2016.


  • S. Kwak, M. Cho, and I. Laptev. Thin-Slicing for Pose: Learning to Understand Pose without Explicit Pose Estimation. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.


  • G. Seguin, P. Bojanowski, R. Lajugie, and I. Laptev. Instance-level video segmentation from object tracks. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.


  • G. A. Sigurdsson, G. Varol, X. Wang, I. Laptev, A. Farhadi, and A. Gupta. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding. In European Conference on Computer Vision, 2016.


  • P. Bojanowski, R. Lajugie, E. Grave, F. Bach, I. Laptev, J. Ponce, and C. Schmid. Weakly-Supervised Alignment of Video With Text. In International Conference on Computer Vision, 2015.


  • V. Chari, S. Lacoste-Julien, I. Laptev, and J. Sivic. On Pairwise Costs for Network Flow Multi-Object Tracking. In IEEE Conference on Computer Vision and Pattern Recognition, 2015.


  • G. Chéron, I. Laptev, and C. Schmid. P-CNN: Pose-based CNN Features for Action Recognition. In International Conference on Computer Vision, 2015.


  • S. Kwak, M. Cho, I. Laptev, J. Ponce, and C. Schmid. Unsupervised Object Discovery and Tracking in Video Collections. In International Conference on Computer Vision, 2015.


  • M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Is object localization for free? -- Weakly-supervised learning with convolutional neural networks. In IEEE Conference on Computer Vision and Pattern Recognition, 2015.


  • T.-H. Vu, A. Osokin, and I. Laptev. Context-Aware CNNs for Person Head Detection. In International Conference on Computer Vision, 2015.


  • P. Bojanowski, R. Lajugie, F. Bach, I. Laptev, J. Ponce, C. Schmid, and J. Sivic. Weakly Supervised Action Labeling in Videos Under Ordering Constraints. In European Conference on Computer Vision, 2014.


  • V. Kantorov and I. Laptev. Efficient feature extraction, encoding and classification for action recognition. In IEEE Conference on Computer Vision and Pattern Recognition, 2014.


  • M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks. In IEEE Conference on Computer Vision and Pattern Recognition, 2014.


  • T.-H. Vu, C. Olsson, I. Laptev, A. Oliva, and J. Sivic. Predicting Actions from Static Scenes. In European Conference on Computer Vision, 2014.


  • K. Alahari, G. Seguin, J. Sivic, and I. Laptev. Pose Estimation and Segmentation of People in 3D Movies. In International Conference on Computer Vision, 2013.


  • H. Azizpour and I. Laptev. Object Detection Using Strongly-Supervised Deformable Part Models. In European Conference on Computer Vision, 2012.


  • V. Delaitre, D. Fouhey, I. Laptev, J. Sivic, A. Gupta, and A. Efros. Scene semantics from long-term observation of people. In European Conference on Computer Vision, 2012.


  • D. Fouhey, V. Delaitre, A. Gupta, A. Efros, I. Laptev, and J. Sivic. People Watching: Human Actions as a Cue for Single-View Geometry. In European Conference on Computer Vision, 2012.


  • M. Ullah and I. Laptev. Actlets: A novel local representation for human action recognition in video. In International Conference on Image Processing, 2012.


  • V. Delaitre, J. Sivic, and I. Laptev. Learning person-object interactions for action recognition in still images. In Advances in Neural Information Processing Systems, 2011.


  • J. Lezama, K. Alahari, J. Sivic, and I. Laptev. Track to the Future: Spatio-temporal Video Segmentation with Long-range Motion Cues. In IEEE Conference on Computer Vision and Pattern Recognition, 2011.


  • K. Raja, I. Laptev, P. Pérez, and L. Oisel. Joint pose estimation and action recognition in image graphs. In International Conference on Image Processing, 2011.


  • M. Rodriguez, I. Laptev, J. Sivic, and J.-Y. Audibert. Density-aware person detection and tracking in crowds. In International Conference on Computer Vision, 2011.


  • M. Rodriguez, J. Sivic, I. Laptev, and J.-Y. Audibert. Data-driven Crowd Analysis. In International Conference on Computer Vision, 2011.


  • N. Cherniavsky, I. Laptev, J. Sivic, and A. Zisserman. Semi-supervised learning of facial attributes in video. In The first international workshop on parts and attributes (in conjunction with ECCV 2010), 2010.


  • V. Delaitre, I. Laptev, and J. Sivic. Recognizing human actions in still images: a study of bag-of-features and part-based representations. In British Machine Vision Conference, 2010. Note: Updated version, available at http://www.di.ens.fr/willow/research/stillactions/.


  • M.M. Ullah, S.N. Parizi, and I. Laptev. Improving Bag-of-Features Action Recognition with Non-Local Cues. In British Machine Vision Conference, 2010.


  • E. Dexter, P. Pérez, and I. Laptev. Multi-View Synchronization of Human Actions and Dynamic Scenes. In British Machine Vision Conference, 2009.


  • O. Duchenne, I. Laptev, J. Sivic, F. Bach, and J. Ponce. Automatic Annotation of Human Actions in Video. In International Conference on Computer Vision, 2009.


  • M. Marszałek, I. Laptev, and C. Schmid. Actions in Context. In IEEE Conference on Computer Vision and Pattern Recognition, 2009.


  • H. Wang, M.M. Ullah, A. Kläser, I. Laptev, and C. Schmid. Evaluation of local spatio-temporal features for action recognition. In British Machine Vision Conference, 2009.







Disclaimer:

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.




This document was translated from BibTeX by bibtex2html