Publications


Publications of Josef Sivic

Articles in journals or book chapters

  • A. Yang, A. Miech, J. Sivic, I. Laptev, and C. Schmid. Learning to Answer Visual Questions from Web Videos. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.


  • Y. Labbé, S. Zagoruyko, I. Kalevatykh, I. Laptev, J. Carpentier, M. Aubry, and J. Sivic. Monte-Carlo Tree Search for Efficient Visually Guided Rearrangement Planning. IEEE Robotics and Automation Letters, 2020.


  • J.-B. Alayrac, P. Bojanowski, N. Agrawal, I. Laptev, J. Sivic, and S. Lacoste-Julien. Learning from Narrated Instruction Videos. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(9):2194-2208, 2018.


  • I. Rocco, R. Arandjelovic, and J. Sivic. Convolutional neural network architecture for geometric matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.


  • M. Aubry, B. Russell, and J. Sivic. Visual Analysis and Geolocalization of Large-Scale Imagery, chapter Visual geo-localization of non-photographic depictions via 2D-3D alignment. Springer, 2015.


  • G. Seguin, K. Alahari, J. Sivic, and I. Laptev. Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(8):1643-1655, 2015.


  • A. Torii, J. Sivic, T. Pajdla, and M. Okutomi. Visual place recognition with repetitive structures. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(11):2346-2359, 2015.


  • M. Aubry, B. Russell, and J. Sivic. Painting-to-3D Model Alignment Via Discriminative Visual Elements. ACM Transactions on Graphics, 33(2):14:1-14:14, 2014.


  • M. Rodriguez, J. Sivic, and I. Laptev. Analysis of Crowded Scenes in Video. In Intelligent Video Surveillance Systems, pages 251-272. Wiley, 2012.


  • Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros. What Makes Paris Look like Paris? ACM Transactions on Graphics (SIGGRAPH), 31(4):101:1-101:9, 2012.


  • O. Whyte, J. Sivic, A. Zisserman, and J. Ponce. Non-uniform Deblurring for Shaken Images. International Journal of Computer Vision, 98(2):168-186, 2012.


  • B. Kaneva, J. Sivic, A. Torralba, S. Avidan, and W. T. Freeman. Infinite Images: Creating and Exploring a Large Photorealistic Virtual Space. Proceedings of the IEEE, 98(8):1391-1407, 2010.


  • M. Everingham, J. Sivic, and A. Zisserman. Taking the Bite out of Automated Naming of Characters in TV Video. Image and Vision Computing, 27(5):545-559, 2009.


  • J. Sivic and A. Zisserman. Efficient visual search of videos cast as text retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(4):591-606, 2009.


  • J. Sivic and A. Zisserman. Efficient Visual Search for Objects in Videos. Proceedings of the IEEE, 96(4):548-566, 2008.


Conference articles

  • Antoine Yang, Arsha Nagrani, Ivan Laptev, Josef Sivic, and Cordelia Schmid. VidChapters-7M: Video Chapters at Scale. In Advances in Neural Information Processing Systems, 2023.


  • Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, Antoine Miech, Jordi Pont-Tuset, Ivan Laptev, Josef Sivic, and Cordelia Schmid. Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning. In IEEE Conference on Computer Vision and Pattern Recognition, 2023.


  • L. Montaut, Q. Le Lidec, V. Petrik, J. Sivic, and J. Carpentier. Collision Detection Accelerated: An Optimization Perspective. In Robotics: Science and Systems, 2022.


  • T. Soucek, J.-B. Alayrac, A. Miech, I. Laptev, and J. Sivic. Look for the Change: Learning Object States and State-Modifying Actions From Untrimmed Web Videos. In IEEE Conference on Computer Vision and Pattern Recognition, 2022.


  • A. Yang, A. Miech, J. Sivic, I. Laptev, and C. Schmid. TubeDETR: Spatio-Temporal Video Grounding with Transformers. In IEEE Conference on Computer Vision and Pattern Recognition, 2022.


  • A. Yang, A. Miech, J. Sivic, I. Laptev, and C. Schmid. Zero-Shot Video Question Answering via Frozen Bidirectional Language Models. In Advances in Neural Information Processing Systems, 2022.


  • Y. Labbé, J. Carpentier, M. Aubry, and J. Sivic. Single-view robot pose and joint angle estimation via render & compare. In IEEE Conference on Computer Vision and Pattern Recognition, 2021.


  • A. Yang, A. Miech, J. Sivic, I. Laptev, and C. Schmid. Just Ask: Learning to Answer Questions from Millions of Narrated Videos. In International Conference on Computer Vision, 2021.


  • Y. Labbé, J. Carpentier, M. Aubry, and J. Sivic. CosyPose: Consistent multi-view multi-object 6D pose estimation. In European Conference on Computer Vision, 2020.


  • Antoine Miech, Jean-Baptiste Alayrac, Lucas Smaira, Ivan Laptev, Josef Sivic, and Andrew Zisserman. End-to-end learning of visual representations from uncurated instructional videos. In IEEE Conference on Computer Vision and Pattern Recognition, 2020.


  • Ignacio Rocco, Relja Arandjelovic, and Josef Sivic. Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions. In European Conference on Computer Vision, 2020.


  • Robin Strudel, Alexander Pashevich, Igor Kalevatykh, Ivan Laptev, Josef Sivic, and Cordelia Schmid. Learning to combine primitive skills: A step towards versatile robotic manipulation. In International Conference on Robotics and Automation, 2020.


  • D. Zhukov, J.-B. Alayrac, I. Laptev, and J. Sivic. Learning Actionness via Long-range Temporal Order Verification. In European Conference on Computer Vision, 2020.


  • M. Dusmanu, I. Rocco, T. Pajdla, M. Pollefeys, J. Sivic, A. Torii, and T. Sattler. D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.


  • Z. Li, J. Sedlar, J. Carpentier, I. Laptev, N. Mansard, and J. Sivic. Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.


  • Antoine Miech, Dimitri Zhukov, Jean-Baptiste Alayrac, Makarand Tapaswi, Ivan Laptev, and Josef Sivic. HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips. In International Conference on Computer Vision, 2019.


  • D. Zhukov, J.-B. Alayrac, R.G. Cinbis, D. Fouhey, I. Laptev, and J. Sivic. Cross-task weakly supervised learning from instructional videos. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.


  • I. Rocco, R. Arandjelovic, and J. Sivic. End-to-end weakly-supervised semantic alignment. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.


  • I. Rocco, M. Cimpoi, R. Arandjelovic, A. Torii, T. Pajdla, and J. Sivic. Neighbourhood Consensus Networks. In Advances in Neural Information Processing Systems, 2018.


  • T. Sattler, W. Maddern, C. Toft, A. Torii, L. Hammarstrand, E. Stenborg, D. Safari, M. Okutomi, M. Pollefeys, J. Sivic, F. Kahl, and T. Pajdla. Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.


  • H. Taira, M. Okutomi, T. Sattler, M. Cimpoi, M. Pollefeys, J. Sivic, T. Pajdla, and A. Torii. InLoc: Indoor Visual Localization with Dense Matching and View Synthesis. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.


  • J.-B. Alayrac, J. Sivic, I. Laptev, and S. Lacoste-Julien. Joint Discovery of Object States and Manipulation Actions. In International Conference on Computer Vision, 2017.


  • A. Miech, J.-B. Alayrac, P. Bojanowski, I. Laptev, and J. Sivic. Learning from Video and Text via Large-Scale Discriminative Clustering. In International Conference on Computer Vision, 2017.


  • J. Peyre, I. Laptev, C. Schmid, and J. Sivic. Weakly-supervised learning of visual relations. In International Conference on Computer Vision, 2017.


  • I. Rocco, R. Arandjelovic, and J. Sivic. Convolutional neural network architecture for geometric matching. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.


  • J.-B. Alayrac, P. Bojanowski, N. Agrawal, I. Laptev, J. Sivic, and S. Lacoste-Julien. Unsupervised Learning from Narrated Instruction Videos. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.


  • R. Arandjelovic, P. Gronat, A. Torii, T. Pajdla, and J. Sivic. NetVLAD: CNN architecture for weakly supervised place recognition. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.


  • V. Chari, S. Lacoste-Julien, I. Laptev, and J. Sivic. On Pairwise Costs for Network Flow Multi-Object Tracking. In IEEE Conference on Computer Vision and Pattern Recognition, 2015.


  • S. Lee, N. Maisonneuve, D. Crandall, A. Efros, and J. Sivic. Linking Past to Present: Discovering Style in Two Centuries of Architecture. In International Conference on Computational Photography, 2015.


  • M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Is object localization for free? -- Weakly-supervised learning with convolutional neural networks. In IEEE Conference on Computer Vision and Pattern Recognition, 2015.


  • A. Torii, R. Arandjelovic, J. Sivic, T. Pajdla, and M. Okutomi. 24/7 place recognition by view synthesis. In IEEE Conference on Computer Vision and Pattern Recognition, 2015.


  • M. Aubry, D. Maturana, A. Efros, B. Russell, and J. Sivic. Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset of CAD models. In IEEE Conference on Computer Vision and Pattern Recognition, 2014.


  • P. Bojanowski, R. Lajugie, F. Bach, I. Laptev, J. Ponce, C. Schmid, and J. Sivic. Weakly Supervised Action Labeling in Videos Under Ordering Constraints. In European Conference on Computer Vision, 2014.


  • M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks. In IEEE Conference on Computer Vision and Pattern Recognition, 2014.


  • T.-H. Vu, C. Olsson, I. Laptev, A. Oliva, and J. Sivic. Predicting Actions from Static Scenes. In European Conference on Computer Vision, 2014.


  • K. Alahari, G. Seguin, J. Sivic, and I. Laptev. Pose Estimation and Segmentation of People in 3D Movies. In International Conference on Computer Vision, 2013.


  • P. Bojanowski, F. Bach, I. Laptev, J. Ponce, C. Schmid, and J. Sivic. Finding Actors and Actions in Movies. In International Conference on Computer Vision, 2013.


  • P. Gronat, G. Obozinski, J. Sivic, and T. Pajdla. Learning per-location classifiers for visual place recognition. In IEEE Conference on Computer Vision and Pattern Recognition, 2013.


  • A. Torii, J. Sivic, T. Pajdla, and M. Okutomi. Visual Place Recognition with Repetitive Structures. In IEEE Conference on Computer Vision and Pattern Recognition, 2013.


  • V. Delaitre, D. Fouhey, I. Laptev, J. Sivic, A. Gupta, and A. Efros. Scene semantics from long-term observation of people. In European Conference on Computer Vision, 2012.


  • D. Fouhey, V. Delaitre, A. Gupta, A. Efros, I. Laptev, and J. Sivic. People Watching: Human Actions as a Cue for Single-View Geometry. In European Conference on Computer Vision, 2012.


  • V. Delaitre, J. Sivic, and I. Laptev. Learning person-object interactions for action recognition in still images. In Advances in Neural Information Processing Systems, 2011.


  • J. Lezama, K. Alahari, J. Sivic, and I. Laptev. Track to the Future: Spatio-temporal Video Segmentation with Long-range Motion Cues. In IEEE Conference on Computer Vision and Pattern Recognition, 2011.


  • M. Rodriguez, I. Laptev, J. Sivic, and J.-Y. Audibert. Density-aware person detection and tracking in crowds. In International Conference on Computer Vision, 2011.


  • M. Rodriguez, J. Sivic, I. Laptev, and J.-Y. Audibert. Data-driven Crowd Analysis. In International Conference on Computer Vision, 2011.


  • B. C. Russell, J. Sivic, J. Ponce, and H. Dessales. Automatic Alignment of Paintings and Photographs Depicting a 3D Scene. In 3rd International IEEE Workshop on 3D Representation for Recognition (3dRR-11), with ICCV 2011, 2011.


  • A. Torii, J. Sivic, and T. Pajdla. Visual localization by linear combination of image descriptors. In Proceedings of the 2nd IEEE Workshop on Mobile Vision, with ICCV 2011, 2011.


  • O. Whyte, J. Sivic, and A. Zisserman. Deblurring Shaken and Partially Saturated Images. In Proceedings of the IEEE Workshop on Color and Photometry in Computer Vision, with ICCV 2011, 2011.


  • N. Cherniavsky, I. Laptev, J. Sivic, and A. Zisserman. Semi-supervised learning of facial attributes in video. In The first international workshop on parts and attributes (in conjunction with ECCV 2010), 2010.


  • V. Delaitre, I. Laptev, and J. Sivic. Recognizing human actions in still images: a study of bag-of-features and part-based representations. In British Machine Vision Conference, 2010. Note: Updated version, available at http://www.di.ens.fr/willow/research/stillactions/.


  • B. Kaneva, J. Sivic, A. Torralba, S. Avidan, and W. T. Freeman. Matching and Predicting Street Level Images. In ECCV 2010 Workshop on Vision for Cognitive Tasks, 2010.


  • J. Knopp, J. Sivic, and T. Pajdla. Avoiding confusing features in place recognition. In European Conference on Computer Vision, 2010.


  • J. Philbin, M. Isard, J. Sivic, and A. Zisserman. Descriptor learning for efficient retrieval. In European Conference on Computer Vision, 2010.


  • O. Whyte, J. Sivic, A. Zisserman, and J. Ponce. Non-uniform Deblurring for Shaken Images. In IEEE Conference on Computer Vision and Pattern Recognition, 2010.


  • O. Duchenne, I. Laptev, J. Sivic, F. Bach, and J. Ponce. Automatic Annotation of Human Actions in Video. In International Conference on Computer Vision, 2009.


  • B.C. Russell, A. Efros, J. Sivic, W.T. Freeman, and A. Zisserman. Segmenting Scenes by Matching Image Composites. 2009.


  • J. Sivic, M. Everingham, and A. Zisserman. "Who are you?": Learning person specific classifiers from video. In IEEE Conference on Computer Vision and Pattern Recognition, 2009.


  • O. Whyte, J. Sivic, and A. Zisserman. Get Out of my Picture! Internet-based Inpainting. In British Machine Vision Conference, 2009.


  • C. Liu, J. Yuen, A. Torralba, J. Sivic, and W. T. Freeman. SIFT Flow: Dense Correspondence across Different Scenes. In European Conference on Computer Vision, 2008.


  • James Philbin, Ondrej Chum, Michael Isard, J. Sivic, and A. Zisserman. Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases. In IEEE Conference on Computer Vision and Pattern Recognition, 2008.


  • James Philbin, J. Sivic, and A. Zisserman. Geometric LDA: A Generative Model for Particular Object Discovery. In British Machine Vision Conference, 2008.


  • J. Sivic, Biliana Kaneva, A. Torralba, Shai Avidan, and William T. Freeman. Creating and Exploring a Large Photorealistic Virtual Space. In Proceedings of the First IEEE Workshop on Internet Vision, 2008.


  • J. Sivic, Bryan C. Russell, A. Zisserman, William T. Freeman, and Alyosha A. Efros. Unsupervised discovery of visual object class hierarchies. In IEEE Conference on Computer Vision and Pattern Recognition, 2008.







Disclaimer:

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.



