Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’19)
In this paper, we introduce a new method to automatically reconstruct the 3D motion of a person interacting with an object from a single RGB video. Our method estimates the 3D poses of the person and the object, the contact positions and forces, and the torques actuated by the human limbs. The main contributions of this work are three-fold. First, we propose an approach to jointly estimate the motion and the actuation forces of the person on the manipulated object by modeling contacts and the dynamics of their interactions. This is cast as a large-scale trajectory optimization problem. Second, we propose a method to automatically recognize from the input video the position and timing of contacts between the person and the object or the ground, thereby significantly reducing the complexity of the optimization. Third, we validate our approach on a recent MoCap dataset with ground truth contact forces, and demonstrate its performance on a new dataset of Internet videos showing people manipulating a variety of tools in unconstrained indoor and outdoor environments.
We warmly thank Bruno Watier (Université Paul Sabatier and LAAS-CNRS) and Galo Maldonado (ENSAM ParisTech) for setting up the Parkour dataset. This work was partly supported by the ERC grant LEAP (No. 336845), the H2020 Memmo project, the CIFAR Learning in Machines & Brains program, and the European Regional Development Fund under the project IMPACT (reg. no. CZ.02.1.01/0.0/0.0/15_003/0000468).
The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright.