Ivan Laptev

Professor at MBZUAI
on leave from INRIA Paris
Email: Ivan.Laptev -at- mbzuai.ac.ae



Short Bio:
Ivan Laptev is a professor at MBZUAI, on leave from INRIA Paris, and head of research at VisionLabs. He received a PhD degree in Computer Science from the Royal Institute of Technology in 2004 and a Habilitation degree from Ecole Normale Superieure in 2013. Ivan's main research interests include visual recognition of human actions, objects and interactions and, more recently, robotics. He has published over 150 technical papers, most of which appeared in international journals and at major peer-reviewed conferences in the field. He has served as an associate editor of IJCV and TPAMI and as a program chair for ICCV'23 and CVPR'18, will serve as a program chair for ACCV'24, and is a regular area chair for CVPR, ICCV and ECCV. He has co-organized several tutorials, workshops and challenges at major computer vision conferences, as well as a series of INRIA summer schools on computer vision and machine learning (2010-2013) and the Machines Can See summits (2017-2023). He received an ERC Starting Grant in 2012 and was awarded a Helmholtz Prize in 2017.



News





Students



Alumni



Publications

VidChapters-7M: Video Chapters at Scale (2023),
A. Yang, A. Nagrani, I. Laptev, J. Sivic and C. Schmid;
in Proc NeurIPS'23.
Project page


PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation (2023),
S. Chen, R. Garcia, C. Schmid and I. Laptev;
in Proc CoRL'23.
Project page


Object Goal Navigation with Recursive Implicit Maps (2023),
S. Chen, T. Chabal, I. Laptev and C. Schmid;
In Proc. IROS'23.
Project page


Robust visual sim-to-real transfer for robotic manipulation (2023),
R. Garcia, R. Strudel, S. Chen, E. Arlaud, I. Laptev and C. Schmid;
In Proc. IROS'23.
Project page


Tackling ambiguity with images: Improved multimodal machine translation and contrastive evaluation (2023),
M. Futeral, C. Schmid, I. Laptev, B. Sagot and R. Bawden;
in Proc ACL'23.
Project page, CoMMuTE dataset


Vid2Seq: Large-scale pretraining of a visual language model for dense video captioning (2023),
A. Yang, A. Nagrani, P.H. Seo, A. Miech, J. Pont-Tuset, I. Laptev, J. Sivic and C. Schmid;
in Proc CVPR'23.
Project page


gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction (2023),
Z. Chen, S. Chen, C. Schmid and I. Laptev;
in Proc CVPR'23.
Project page


Learning Video-Conditioned Policies for Unseen Manipulation Tasks (2023),
E. Chane-Sane, C. Schmid and I. Laptev;
in Proc. ICRA'23.
Project page


Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control (2023),
Q. Le Lidec, W. Jallet, I. Laptev, C. Schmid and J. Carpentier;
in Proc. ICRA'23.


Language Conditioned Spatial Relation Reasoning for 3D Object Grounding (2022),
S. Chen, P.-L. Guhur, M. Tapaswi, C. Schmid and I. Laptev;
in Proc NeurIPS'22.
Project page


Zero-Shot Video Question Answering via Frozen Bidirectional Language Models (2022),
A. Yang, A. Miech, J. Sivic, I. Laptev and C. Schmid;
in Proc NeurIPS'22.
Project page


Instruction-driven history-aware policies for robotic manipulations (2022),
P.-L. Guhur, S. Chen, R. Garcia, M. Tapaswi, I. Laptev and C. Schmid;
in Proc CoRL'22.
Project page


AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction (2022),
Z. Chen, Y. Hasson, C. Schmid and I. Laptev;
in Proc ECCV'22.
Project page


Learning from Unlabeled 3D Environments for Vision-and-Language Navigation (2022),
S. Chen, P.-L. Guhur, M. Tapaswi, C. Schmid and I. Laptev;
in Proc ECCV'22.
Project page


Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation (2022),
S. Chen, P.-L. Guhur, M. Tapaswi, C. Schmid and I. Laptev;
in Proc CVPR'22.
Project page


TubeDETR: Spatio-Temporal Video Grounding with Transformers (2022),
A. Yang, A. Miech, J. Sivic, I. Laptev and C. Schmid;
in Proc CVPR'22.
Project page


Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos (2022),
T. Souček, J.-B. Alayrac, A. Miech, I. Laptev and J. Sivic;
in Proc CVPR'22.
Project page


Estimating 3D Motion and Forces of Human-Object Interactions from Internet Videos (2022),
Z. Li, J. Sedlar, J. Carpentier, I. Laptev, N. Mansard and J. Sivic;
in IJCV 2022.
Project page


Towards unconstrained joint hand-object reconstruction from RGB videos (2021),
Y. Hasson, G. Varol, I. Laptev and C. Schmid;
In Proc. 3DV, virtual.
Project page


History Aware Multimodal Transformer for Vision-and-Language Navigation (2021),
S. Chen, P.-L. Guhur, C. Schmid and I. Laptev;
in Proc. NeurIPS'21, virtual.
Project page


Differentiable rendering with perturbed optimizers (2021),
Q. Le Lidec, I. Laptev, C. Schmid and J. Carpentier;
in Proc. NeurIPS'21, virtual.


XCiT: Cross-Covariance Image Transformers (2021),
A. El-Nouby, H. Touvron, M. Caron, P. Bojanowski, M. Douze, A. Joulin, I. Laptev, N. Neverova, G. Synnaeve, J. Verbeek and H. Jégou;
in Proc. NeurIPS'21, virtual.
Project page


Segmenter: Transformer for Semantic Segmentation (2021),
R. Strudel, R. Garcia, I. Laptev and C. Schmid;
in Proc. ICCV'21, virtual.
Project page


Airbert: In-domain Pretraining for Vision-and-Language Navigation (2021),
P.-L. Guhur, M. Tapaswi, S. Chen, I. Laptev and C. Schmid;
in Proc. ICCV'21, virtual.
Project page


Just Ask: Learning to Answer Questions from Millions of Narrated Videos (2021),
A. Yang, A. Miech, J. Sivic, I. Laptev and C. Schmid;
in Proc. ICCV'21, virtual.
Project page


Goal-Conditioned Reinforcement Learning with Imagined Subgoals (2021),
E. Chane-Sane, C. Schmid and I. Laptev;
in Proc ICML'21.
Project page


Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers (2021),
A. Miech, J.-B. Alayrac, I. Laptev, J. Sivic and A. Zisserman;
in Proc CVPR'21.


Differentiable simulation for physical system identification (2021),
Q. Le Lidec, I. Kalevatykh, I. Laptev, C. Schmid, and J. Carpentier;
in IEEE RAL 2021.


Synthetic Humans for Action Recognition from Unseen Viewpoints (2021),
G. Varol, I. Laptev, C. Schmid and A. Zisserman;
in IJCV 2021.
Project page


Long term spatio-temporal modeling for action detection (2021),
M. Tapaswi, V. Kumar and I. Laptev;
in CVIU 2021.


Learning Obstacle Representations for Neural Motion Planning (2020),
R. Strudel, R. Garcia, J. Carpentier, J.-P. Laumond, I. Laptev and C. Schmid;
In Proc. CoRL'20.
Project page


Learning Object Manipulation Skills via Approximate State Estimation from Real Videos (2020),
V. Petrík, M. Tapaswi, I. Laptev and J. Sivic;
In Proc. CoRL'20.



Learning visual policies for building 3D shape categories (2020),
A. Pashevich*, I. Kalevatykh*, I. Laptev and C. Schmid;
In Proc. IROS'20, Las Vegas, NV, USA.
Project page


Learning Actionness via Long-range Temporal Order Verification (2020),
D. Zhukov, J.-B. Alayrac, I. Laptev and J. Sivic;
In Proc. ECCV'20, Glasgow, UK.
Project page


End-to-End Learning of Visual Representations from Uncurated Instructional Videos (2020),
A. Miech*, J.-B. Alayrac*, L. Smaira, I. Laptev, J. Sivic and A. Zisserman;
In Proc. CVPR'20, Seattle, WA, USA.
Project page, YouCook2 zero-shot search demo, I3D model, S3D model.


Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction (2020),
Y. Hasson, B. Tekin, F. Bogo, I. Laptev, M. Pollefeys and C. Schmid;
In Proc. CVPR'20, Seattle, WA, USA.
Project page


Learning Interactions and Relationships between Movie Characters (2020),
A. Kukleva, M. Tapaswi and I. Laptev;
In Proc. CVPR'20, Seattle, WA, USA.


Action Modifiers: Learning from Adverbs in Instructional Videos (2020),
H. Doughty, I. Laptev, W. Mayol-Cuevas and D. Damen;
In Proc. CVPR'20, Seattle, WA, USA.
Project page


Learning to combine primitive skills: A step towards versatile robotic manipulation (2020),
R. Strudel*, A. Pashevich*, I. Kalevatykh, I. Laptev, J. Sivic and C. Schmid;
In Proc. ICRA'20, Paris, France.


Monte-Carlo Tree Search for Efficient Visually Guided Rearrangement Planning (2020),
Y. Labbé, S. Zagoruyko, I. Kalevatykh, I. Laptev, J. Carpentier, M. Aubry and J. Sivic;
In IEEE Robotics and Automation Letters, Vol. 5, No. 2, April 2020.


HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips (2019),
A. Miech, D. Zhukov, J.-B. Alayrac, M. Tapaswi, I. Laptev and J. Sivic;
in Proc. ICCV'19, Seoul, South Korea.
Project page


Detecting unseen visual relations using analogies (2019),
J. Peyre, J. Sivic, I. Laptev and C. Schmid;
in Proc. ICCV'19, Seoul, South Korea.


Learning to Augment Synthetic Images for Sim2Real Policy Transfer (2019),
A. Pashevich, R. Strudel, I. Kalevatykh, I. Laptev and C. Schmid;
in Proc. IROS'19, Macau, China.
Project page


Learning joint reconstruction of hands and manipulated objects (2019),
Y. Hasson, G. Varol, D. Tzionas, I. Kalevatykh, M. Black, I. Laptev and C. Schmid;
in Proc. CVPR'19, Long Beach, CA, USA.
Project page


Cross-task weakly supervised learning from instructional videos (2019),
D. Zhukov, J.-B. Alayrac, R.G. Cinbis, D. Fouhey, I. Laptev and J. Sivic;
in Proc. CVPR'19, Long Beach, CA, USA.
Project page


Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video (2019),
Z. Li, J. Sedlar, J. Carpentier, I. Laptev, N. Mansard and J. Sivic;
in Proc. CVPR'19, Long Beach, CA, USA.
Project page


Deep Metric Learning Beyond Binary Supervision (2019),
S. Kim, M. Seo, I. Laptev, M. Cho and S. Kwak;
in Proc. CVPR'19, Long Beach, CA, USA.


A Flexible Model for Training Action Localization with Varying Levels of Supervision (2018),
G. Chéron*, J.-B. Alayrac*, I. Laptev and C. Schmid; in Proc. NIPS'18, Montreal, Canada. (* indicates equal contribution)
Project page (coming soon)


Learning a Text-Video Embedding from Incomplete and Heterogeneous Data (2018),
A. Miech, I. Laptev and J. Sivic; arXiv preprint arXiv:1806.11328.
Project page


BodyNet: Volumetric Inference of 3D Human Body Shapes (2018),
G. Varol, D. Ceylan, B. Russell, J. Yang, E. Yumer, I. Laptev and C. Schmid; in Proc. ECCV'18, Munich, Germany.
Project page


Joint Discovery of Object States and Manipulation Actions (2017),
J.-B. Alayrac, J. Sivic, I. Laptev and S. Lacoste-Julien; in Proc. ICCV'17, Venice, Italy.
Project page


Weakly-Supervised Learning of Visual Relations (2017),
J. Peyre, J. Sivic, I. Laptev and C. Schmid; in Proc. ICCV'17, Venice, Italy.
Project page


Learning From Video and Text via Large-Scale Discriminative Clustering (2017),
A. Miech, J.-B. Alayrac, P. Bojanowski, I. Laptev and J. Sivic; in Proc. ICCV'17, Venice, Italy.
Project page


Learning from Synthetic Humans (2017),
G. Varol, J. Romero, X. Martin, N. Mahmood, M.J. Black, I. Laptev and C. Schmid; in Proc. CVPR'17, Honolulu, Hawaii.
Project page


Learnable pooling with Context Gating for video classification (2017),
A. Miech, I. Laptev and J. Sivic; arXiv preprint arXiv:1706.06905.
Code


Long-term Temporal Convolutions for Action Recognition (2017),
G. Varol, I. Laptev and C. Schmid; in IEEE Trans. on Pattern Analysis and Machine Intelligence 2017.
Project page


The THUMOS Challenge on Action Recognition for Videos "in the Wild" (2017),
H. Idrees, A.R. Zamir, Y.-G. Jiang, A. Gorban, I. Laptev, R. Sukthankar and M. Shah; in Computer Vision and Image Understanding, 155, pp.1-23.
Thumos Challenge


ContextLocNet: Context-aware deep network models for weakly supervised localization (2016),
V. Kantorov, M. Oquab, M. Cho and I. Laptev; in Proc. ECCV'16, Amsterdam, The Netherlands.
Project page


Hollywood in homes: Crowdsourcing data collection for activity understanding (2016),
G. Sigurdsson, G. Varol, X. Wang, A. Farhadi, I. Laptev and A. Gupta; in Proc. ECCV'16, Amsterdam, The Netherlands.
Project page


Much ado about time: Exhaustive annotation of temporal data (2016),
G. Sigurdsson, O. Russakovsky, A. Farhadi, I. Laptev and A. Gupta; in Proc. HCOMP'16, Austin, TX, USA.
Project page


Unsupervised learning from narrated instruction videos (2016),
J.-B. Alayrac, P. Bojanowski, N. Agrawal, J. Sivic, I. Laptev and S. Lacoste-Julien; in Proc. CVPR'16, Las Vegas, USA.
Project page
Extended version:
Learning from Narrated Instruction Videos (2017),
J.-B. Alayrac, P. Bojanowski, N. Agrawal, J. Sivic, I. Laptev and S. Lacoste-Julien; in IEEE Trans. on Pattern Analysis and Machine Intelligence.


Instance-level video segmentation from object tracks (2016),
G. Seguin, P. Bojanowski, R. Lajugie and I. Laptev; in Proc. CVPR'16, Las Vegas, USA.
Project page


Thin-slicing for pose: Learning to understand pose without explicit pose estimation (2016),
S. Kwak, M. Cho, I. Laptev; in Proc. CVPR'16, Las Vegas, USA.


P-CNN: Pose-based CNN Features for Action Recognition (2015), Project page
G. Chéron, I. Laptev and C. Schmid; in Proc. ICCV'15, Santiago, Chile.

Context-aware CNNs for person head detection (2015), Project page
T.-H. Vu, A. Osokin and I. Laptev; in Proc. ICCV'15, Santiago, Chile.

Weakly-Supervised Alignment of Video With Text (2015),
P. Bojanowski, R. Lajugie, E. Grave, F. Bach, I. Laptev, J. Ponce and C. Schmid; in Proc. ICCV'15, Santiago, Chile.

Unsupervised Object Discovery and Tracking in Video Collections (2015),
S. Kwak, M. Cho, I. Laptev, J. Ponce, and C. Schmid; in Proc. ICCV'15, Santiago, Chile.

Is object localization for free? - Weakly-supervised learning with convolutional neural networks (2015), Project page
M. Oquab, L. Bottou, I. Laptev and J. Sivic; in Proc. CVPR'15, Boston, Massachusetts, USA.

On Pairwise Costs for Network Flow Multi-Object Tracking (2015), Project page
V. Chari, S. Lacoste-Julien, I. Laptev and J. Sivic; in Proc. CVPR'15, Boston, Massachusetts, USA.

Predicting Actions from Static Scenes (2014), Project page
T.-H. Vu, C. Olsson, I. Laptev, A. Oliva and J. Sivic; in Proc. ECCV'14, Zurich, Switzerland.

Weakly supervised action labeling in videos under ordering constraints (2014), Project page
P. Bojanowski, R. Lajugie, F. Bach, I. Laptev, J. Ponce, C. Schmid and J. Sivic; in Proc. ECCV'14, Zurich, Switzerland.

Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks (2014), Project page
M. Oquab, L. Bottou, I. Laptev and J. Sivic; in Proc. CVPR'14, Columbus, Ohio, USA.
Earlier version: Technical Report HAL-00911179, Nov. 2013.

Efficient feature extraction, encoding and classification for action recognition (2014), Project page
V. Kantorov and I. Laptev; in Proc. CVPR'14, Columbus, Ohio, USA.

Finding Actors and Actions in Movies (2013), Project page
P. Bojanowski, F. Bach, I. Laptev, J. Ponce, C. Schmid and J. Sivic; in Proc. ICCV'13, Sydney, Australia.

Pose Estimation and Segmentation of People in 3D Movies (2013), Project page
G. Seguin, K. Alahari, J. Sivic and I. Laptev; in Proc. ICCV'13, Sydney, Australia.
Extended version:
Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies (2015),
G. Seguin, K. Alahari, J. Sivic and I. Laptev; in IEEE Trans. on Pattern Analysis and Machine Intelligence, 37(8):1643-1655.

Scene semantics from long-term observation of people (2012), Project page
V. Delaitre, D.F. Fouhey, I. Laptev, J. Sivic, A. Gupta and A.A. Efros; in Proc. ECCV'12, Florence, Italy.

People Watching: Human Actions as a Cue for Single-View Geometry (2012), Project page
D.F. Fouhey, V. Delaitre, A. Gupta, A.A. Efros, I. Laptev and J. Sivic; in Proc. ECCV'12, Florence, Italy.
Extended version:
People Watching: Human Actions as a Cue for Single View Geometry (2014),
D. Fouhey, V. Delaitre, A. Gupta, A. Efros, I. Laptev and J. Sivic; in International Journal of Computer Vision, 110(3):259-274.

Object Detection Using Strongly-Supervised Deformable Part Models (2012), Project page
H. Azizpour and I. Laptev; in Proc. ECCV'12, Florence, Italy.

Actlets: A novel local representation for human action recognition in video (2012),
M.M. Ullah and I. Laptev; in Proc. ICIP'12, Orlando, Florida, USA.

Learning person-object interactions for action recognition in still images (2011)
V. Delaitre, J. Sivic and I. Laptev; in Proc. NIPS'11, Granada, Spain.

Density-aware person detection and tracking in crowds (2011), Video, Project page
M. Rodriguez, I. Laptev, J. Sivic and J.-Y. Audibert; in Proc. ICCV'11, Barcelona, Spain.

Data-driven Crowd Analysis in Videos (2011), Video, Project page
M. Rodriguez, J. Sivic, I. Laptev and J.-Y. Audibert; in Proc. ICCV'11, Barcelona, Spain.

Track to the Future: Spatio-temporal Video Segmentation with Long-range Motion Cues (2011), Project page
J. Lezama, K. Alahari, J. Sivic and I. Laptev; in Proc. CVPR'11, Colorado, US.

Semi-supervised learning of facial attributes in video (2010), Project page
N. Cherniavsky, I. Laptev, J. Sivic and A. Zisserman; in The First Int. Workshop on Parts and Attributes (in conjunction with ECCV 2010), Greece.

Recognizing human actions in still images: a study of bag-of-features and part-based representations (2010), Project page
V. Delaitre, I. Laptev and J. Sivic; in Proc. BMVC'10, Aberystwyth, UK.

Improving Bag-of-Features Action Recognition with Non-local Cues (2010),
M.M. Ullah, S.N. Parizi and I. Laptev; in Proc. BMVC'10, Aberystwyth, UK.

Automatic Annotation of Human Actions in Video (2009),
O. Duchenne, I. Laptev, J. Sivic, F. Bach and J. Ponce; in Proc. ICCV'09, Kyoto, Japan.

Evaluation of local spatio-temporal features for action recognition (2009),
H. Wang, M. M. Ullah, A. Klaser, I. Laptev and C. Schmid; in Proc. BMVC'09, London, UK.

Multi-View Synchronization of Human Actions and Dynamic Scenes (2009),
E. Dexter, P. Perez and I. Laptev; in Proc. BMVC'09, London, UK.

Actions in Context (2009),
M. Marszałek, I. Laptev and C. Schmid; in Proc. CVPR'09, Miami, US.

Modeling Image Context using Object Centered Grids (2009),
S.N. Parizi, I. Laptev and A.T. Targhi; in Proc. DICTA'09, Melbourne, Australia.

Cross-View Action Recognition from Temporal Self-Similarities (2008),
I. Junejo, E. Dexter, I. Laptev and P. Perez; in Proc. ECCV'08, Marseille, France.
Extended version:
View-Independent Action Recognition from Temporal Self-Similarities (2010),
I. Junejo, E. Dexter, I. Laptev and P. Perez; in IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(1):172-185.

Learning realistic human actions from movies (2008),
I. Laptev, M. Marszałek, C. Schmid and B. Rozenfeld; in Proc. CVPR'08, Anchorage, US.

Retrieving actions in movies (2007),
I. Laptev and P. Perez; in Proc. ICCV'07, Rio de Janeiro, Brazil.

Video Copy Detection: a Comparative Study (2007),
J. Law-To, L. Chen, A. Joly, I. Laptev, O. Buisson, V. Gouet-Brunet, N. Boujemaa and F.I. Stentiford; in Proc. CIVR'07, Amsterdam, The Netherlands, pp. 371-378.

Improvements of Object Detection Using Boosted Histograms (2006),
I. Laptev; in Proc. BMVC'06, Edinburgh, UK, pp. III:949-958.
Extended version:
Improving Object Detection with Boosted Histograms (2009),
I. Laptev; in Image and Vision Computing, vol. 27, issue 5, pp. 535-544.

Periodic Motion Detection and Segmentation via Approximate Sequence Alignment (2005),
I. Laptev, S.J. Belongie, P. Perez and J. Wills; in Proc. ICCV'05, Beijing, China, pp. I:816-823.

Local Descriptors for Spatio-Temporal Recognition (2004),
I. Laptev and T. Lindeberg; in ECCV Workshop "Spatial Coherence for Visual Motion Analysis", Springer LNCS Vol.3667, pp. 91-103.
Extended version:
Local Velocity-Adapted Motion Events for Spatio-Temporal Recognition (2007),
I. Laptev, B. Caputo, C. Schuldt and T. Lindeberg; in Computer Vision and Image Understanding, 108:207-229.

Velocity adaptation of space-time interest points (2004),
I. Laptev and T. Lindeberg; in Proc. ICPR'04, Cambridge, UK, pp.I:52-56.

Galilean-diagonalized spatio-temporal interest operators (2004),
T. Lindeberg, A. Akbarzadeh and I. Laptev; in Proc. ICPR'04, Cambridge, UK, pp.I:57-62.

Recognizing Human Actions: A Local SVM Approach (2004),
C. Schuldt, I. Laptev and B. Caputo; in Proc. ICPR'04, Cambridge, UK, pp.III:32-36.

Space-Time Interest Points (2003),
I. Laptev and T. Lindeberg; in Proc. ICCV'03, Nice, France, pp.I:432-439.
Extended version:
On Space-Time Interest Points (2005),
I. Laptev; in International Journal of Computer Vision, vol 64, number 2/3, pp.107-123.

Interest point detection and scale selection in space-time (2003),
I. Laptev and T. Lindeberg; in Proc. Scale Space Methods in Computer Vision, Isle of Skye, UK, Springer LNCS vol.2695, pp.372-387.

Velocity-adaptation of spatio-temporal receptive fields for direct recognition of activities: An experimental study (2002),
I. Laptev and T. Lindeberg; in Proc. ECCV'02 Workshop on Statistical Methods in Video Processing, pp.61-66.
Extended version:
Velocity-adaptation of spatio-temporal receptive fields for direct recognition of activities: An experimental study (2004),
I. Laptev and T. Lindeberg; in Image and Vision Computing 22:105-116.

Hand gesture recognition using multi-scale colour features, hierarchical models and particle filtering (2002),
L. Bretzner, I. Laptev and T. Lindeberg; in Proc. 5th IEEE International Conference on Automatic Face and Gesture Recognition, Washington D.C., May, pp.423-428.

Extraction of linear objects from interferometric SAR data (2002),
O. Hellwich, I. Laptev and H. Mayer; in Int. J. Remote Sensing 23(3):461-475, 2002.

A multi-scale feature likelihood map for direct evaluation of object hypotheses (2001),
I. Laptev and T. Lindeberg; in Proc. IEEE Workshop on Scale-Space and Morphology, Vancouver, Canada, Springer LNCS vol.2106, pp.98-110.
Extended version:
A distance measure and a feature likelihood map concept for scale-invariant model matching (2003),
I. Laptev and T. Lindeberg; in International Journal of Computer Vision, vol 52, number 2/3, pp 97-120.

Tracking of multi-state hand models using particle filtering and a hierarchy of multi-scale image features (2001),
I. Laptev and T. Lindeberg; in Proc. IEEE Workshop on Scale-Space and Morphology, Vancouver, Canada, Springer LNCS vol.2106, pp.98-110.

Agilo robocuppers: Robocup team description (1999),
M. Klupsch, M. Luckenhaus, C. Zierl, I. Laptev, T. Bandlow, M. Grimme, K. Kellerer, and F. Schwarzer; in RoboCup-98: Robot Soccer World Cup II, Springer LNCS vol.1604, pp.446-451.

Multi-Scale and Snakes for Automatic Road Extraction (1998),
H. Mayer, I. Laptev and A. Baumgartner; in Proc. ECCV'98, Freiburg, Germany, Springer LNCS vol.1406, pp.720-733.
Extended version:
Automatic extraction of roads from aerial images based on scale-space and snakes (2000),
I. Laptev, H. Mayer, T. Lindeberg, W. Eckstein, C. Steger, and A. Baumgartner; in Machine Vision and Applications 12(1):23-31.

Automatic Road Extraction Based on Multi-Scale Modelling, Context, and Snakes (1997),
H. Mayer, I. Laptev, A. Baumgartner and C. Steger; in International Archives of Photogrammetry and Remote Sensing, (32) 3-2W3, pp.106-113.