Seminars | Willow team

Igniting the Real Robot Revolution Requires Closing the “Data Gap”

June 27, 2025

Ken Goldberg

UC Berkeley, USA

AI is rapidly advancing the way we think but we live in a material world. We still need to move things, make things, and maintain things. Imagine a future where AI-driven robots handle billions of items to support an aging population that doesn’t have enough human workers. Several unicorn startups emerged in the past year to develop humanoid robots but to ignite the real robot revolution we need to close a 100,000x “Data Gap” between large vision-language models and current robot models. I propose stepping stones that will lead to general-purpose humanoid robots and review four options for generating robot data, including the most practical -- collecting data from real robots operating in real environments. I'll describe how Ambi Robotics has collected 200,000 hours of real robot data from their award-winning robot systems that have sorted 100 million real consumer packages. This data allows Ambi Robotics to close the data gap for a practical subclass of robot skills to enable a new generation of real industrial robots.

Short bio: Ken Goldberg is co-founder of Ambi Robotics and Jacobi Robotics and William S. Floyd Distinguished Chair of Engineering at UC Berkeley, where he leads research in robotics and automation: grasping, manipulation, and learning for applications in industry, homes, agriculture, and robot-assisted surgery. Ken is President of the Robot Learning Foundation and Chair of the Berkeley AI Research (BAIR) Lab Steering Committee. http://goldberg.berkeley.edu

Factor Graphs FTW: From SLAM to Control to Robot Dynamics

June 18, 2025

Frank Dellaert

Georgia Tech, USA

Factor graphs have proven highly effective in robot state estimation, especially in pose graph optimization (PGO), simultaneous localization and mapping (SLAM), and GPS-denied navigation. Less widely known is their utility as a general modeling tool for a variety of robotics problems, ranging from discrete planning and control to geometric algorithms for articulated robots. In this talk, I will give a tutorial introduction to factor graphs from a SLAM perspective, briefly explore their role in planning and control, and conclude with their application to robot dynamics.

Short bio: Frank Dellaert is a Professor in the School of Interactive Computing at Georgia Tech, where he has been on the faculty since 2001 after earning his Ph.D. from Carnegie Mellon. His research lies at the intersection of robotics and computer vision, with a focus on large-scale mapping, 3D reconstruction, and model-predictive control, often using graphical models. He leads development of the GTSAM toolbox (gtsam.org). Outside academia, Frank has held senior technical roles at several companies: Chief Scientist at Skydio (2015–2016), Technical Project Lead at Facebook’s Building 8 (2016–2018), Research Scientist at Google AI (2020–2022), and CTO at Verdant Robotics (2022–2024), where he now serves part-time as Chief AI Officer.

Robot Learning for Open-World Autonomy

June 4, 2025

Abhinav Valada

University of Freiburg, Germany

A long-standing goal has been to create intelligent robots that can learn from the world around them to assist humans in everyday tasks, from domestic chores to transportation. However, most robots today are still tailored for specific tasks and controlled environments. Achieving truly ubiquitous robot autonomy requires learning methods that move beyond closed datasets and fixed policies, enabling robots to generalize and adapt online in open-world settings. In this talk, I will present our efforts toward learning open-world robot autonomy, from representations to actions, that enable robots to perform everyday tasks in real open-world environments. I will discuss our recent advancements in leveraging foundation models and continual online learning for common-sense reasoning through language and vision. Finally, I will highlight our ongoing work on fairness in robot learning to ensure safe, trustworthy, and responsible innovation, which is crucial for the open world and for fostering acceptance in society.

Short bio: Abhinav Valada is a Full Professor at the University of Freiburg, where he directs the Robot Learning Lab. He is a member of the Department of Computer Science and the BrainLinks-BrainTools center, and a founding faculty member of the ELLIS Unit Freiburg. Abhinav is a DFG Emmy Noether AI Fellow, Scholar of the ELLIS Society, and Chair of the IEEE Robotics and Automation Society Technical Committee on Robot Learning. He received his Ph.D. with distinction from the University of Freiburg and his M.S. in Robotics from The Robotics Institute of Carnegie Mellon University. Abhinav’s research lies at the intersection of robotics, machine learning, and computer vision with a focus on tackling fundamental robot perception, state estimation, and planning problems to enable robots to operate reliably in complex and diverse domains. For his research, he received the IEEE RAS Early Career Award in Robotics and Automation, the IROS Toshio Fukuda Young Professional Award, and the NVIDIA Research Award, among others. Many aspects of his research have been prominently featured in wider media such as the Discovery Channel, NBC News, Business Times, and The Economic Times.

Spatial AI and emerging reasoning in end-to-end trained robotic navigation

June 4, 2025

Christian Wolf

Naver Labs Europe, France

An important subgoal of AI is the creation of intelligent agents, which require high-level reasoning capabilities, situation awareness, awareness of the dynamics of the environment, and the capacity to robustly make the right decisions at the right moments. In this talk we will cover the automatic learning of reasoning capabilities through large-scale training of deep neural networks from data, targeting different tasks involving fast, precise and smooth navigation of terrestrial robots. We will present solutions and describe key features: reinforcement learning, identifying accurate dynamical models for usage in simulation, and the inclusion of geometric foundation models. We also present an in-depth analysis of the type of reasoning emerging in end-to-end trained agents. In particular, we study the presence of realistic dynamics which the agents learned for open-loop forecasting, and their interplay with sensing; the way the agents use latent memory to hold elements of the scene structure; and finally, their planning capabilities. Put together, we present experiments which paint a new picture of how tools from computer vision and sequential decision making have led to new capabilities in robotics and control. We will also showcase the fleet of autonomous robots operated by Naver Labs Korea in Seoul in the world's first robot-friendly building.

Short bio: Christian Wolf is Principal Scientist at Naver Labs Europe, where he leads the Spatial AI team. He is interested in AI for Robotics, in particular machine learning and embodied computer vision; large-scale learning of the capacity to perform high-level reasoning from visual observations; and the connections between machine learning and control. He is a member of the directing committee of GDR ISIS and co-leader of its topic Machine Learning. He has supervised 18 defended PhD theses, is an associate editor of IEEE Transactions on PAMI and area chair of NeurIPS (2020, 2021, 2023, 2024, 2025), ICLR (2021, 2023, 2024, 2025), ICML (2021, 2022), CVPR (2020, 2025), ICCV (2021, 2023) and ECCV (2022, 2024). From 2005 to 2021 he was associate professor (Maitre de Conferences, HDR) at INSA de Lyon and LIRIS, a CNRS laboratory, where he was also the head of the AI chair / chair in Artificial Intelligence (the group). He received his MSc in computer science from TU Vienna, Austria, in 2000, and a PhD in computer science from INSA de Lyon, France, in 2003. In 2012 he obtained the habilitation diploma, also from INSA de Lyon. In the past he was also a member of the scientific committee of GDR IA, a member of the board of AI experts at the French national supercomputing cluster GENCI, and a member of the ANR evaluation committees Artificial Intelligence (2019-2021) and Interaction and Robotics (2016-2018).

The geometries of Lagrangian dynamics

April 17, 2025

Noémie Jaquier

KTH, Sweden

Lagrangian mechanics provides a powerful framework for modeling the dynamics of physical systems by inferring their motions based on energy conservation. This talk will explore recent advances in applying geometric perspectives, particularly Riemannian geometry, to Lagrangian principles for predicting and optimizing motion dynamics. First, I will discuss how the dynamic properties of humans and robots are straightforwardly accounted for by considering geometric configuration spaces. Second, I will show how this geometric approach can be extended to generate dynamic-aware, collision-free robot motions by modifying the underlying Riemannian metric. Finally, I will consider the problem of learning unknown high-dimensional Lagrangian dynamics. I will present a geometric architecture to learn physically-consistent and interpretable reduced-order dynamic parameters that accurately capture the behavior of the original system.

Short bio: Noémie Jaquier is an assistant professor at the Division of Robotics, Perception and Learning at the KTH Royal Institute of Technology. She received her PhD degree from the Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland in 2020. Prior to joining KTH, she was a postdoctoral researcher in the High Performance Humanoid Technologies Lab (H²T) at the Karlsruhe Institute of Technology (KIT) and a visiting postdoctoral scholar at the Stanford Robotics Lab. Her research investigates data-efficient and theoretically-sound learning algorithms that leverage differential geometry- and physics-based inductive bias to endow robots with close-to-human learning and adaptation capabilities.

RRTs and the Path to Minimalism: What Belongs in a Robot’s Brain?

December 3, 2024

Steve LaValle

University of Oulu, Finland

Imagine building a robot to accomplish one or more tasks, such as vacuuming, patrolling, or exploration. This talk considers an egocentric or situated view of theoretical robot development that takes into account its space of possible environments and specific tasks. How much does a robot need to sense and remember to successfully interact with its environment? This question is fundamental to robotics and distinguishes it from other fields such as computer science or control theory. If machine learning is the goal, then the question becomes what are the minimal, ideal structures that could possibly be learned? Thus, emphasis in this talk is placed on determining the minimal amount of information necessary to solve tasks, thereby giving the robot the smallest possible "brain". At one extreme, strong geometric information is sensed and encoded, leading to problems such as classical motion planning. On the path to minimalism, weak geometric information is considered in the form of combinatorial or relational sensing and filtering. Eventually, topological and set-based representations are considered at the minimalist extreme.

Minimal Supervision, Maximal Adaptation: Vision-Driven Learning for Scene Understanding and Action

October 23, 2024

Pia Bideau

Inria Grenoble (Thoth team), France

Finding the right abstractions of a complex, high-dimensional sensory input allows humans to flexibly adapt their behavior to changing environments. Effects can be seen in social cognition, e.g. context-dependent perception of facial expressions, as well as in physical interactions, such as object manipulation, where an agent’s movement has to constantly adapt to changing environments. The ability to adapt is a core faculty of human development; however, it remains an open challenge for AI agents, because they must first learn how to interact and then seek the information they lack to adapt their behavior accordingly. My work aims at discovering and making use of information that arises by coupling vision and (inter)action. This opens a rich source of information, ranging from discovering physical relations between motion and image formation to learning abstract representations in interaction with one's environment. I believe information lying at the edge of computer vision and robotics will not only help to develop more robust artificial vision systems but also allow learning with only minimal human supervision. In this talk I will lay out my research, which may be categorized along these broad themes: (1) achieving scene understanding from motion, and (2) learning abstract representations to synthesize behavior (e.g. movement trajectories) in a context-dependent manner.

Geometric Learning: Leveraging Differential Geometry for Learning and Control

October 21, 2024

Bernardo Fichera

EPFL, Switzerland

In this presentation, I will discuss two aspects of my Ph.D. research that apply differential geometry tools to enhance control strategies and probabilistic learning models. The first part introduces a novel method for learning non-linear dynamical systems for robotics control. Unlike traditional models where non-linearity stems from external forces, our approach derives non-linearity from the intrinsic curvature of the space itself. By learning the manifold’s (d+1)-dimensional Euclidean embedded representation, our method encodes the non-linearity within the curvature, preserving asymptotic stability to an equilibrium point regardless of spatial curvature. This geometry-based method not only improves learning efficiency and convergence speed but also sets the foundation for advanced configuration space learning and intrinsic robot dynamics. The second part focuses on extending Gaussian process regression to non-Euclidean spaces. Gaussian process regression is widely used because of its ability to provide well-calibrated uncertainty estimates and handle small or sparse datasets. However, it struggles with high-dimensional data. One possible way to scale this technique to higher dimensions is to leverage the implicit low-dimensional manifold upon which the data actually lies, as postulated by the manifold hypothesis. We propose a Gaussian process regression technique capable of inferring implicit structure directly from data (labeled and unlabeled) in a fully differentiable way. Our technique scales up to hundreds of thousands of data points, and improves the predictive performance and calibration of the standard Gaussian process regression in high dimensional settings as well as complex non-Euclidean spaces.

Towards Fast and Certifiable Nonconvex Optimal Control with Sparse Moment-SOS Relaxations

October 4, 2024

Heng Yang

Harvard, USA

Direct methods for optimal control, also known as trajectory optimization, are a workhorse for optimization-based control in robotics and beyond. Nonlinear programming with engineered initializations has been the de facto approach for trajectory optimization, which, however, can suffer from undesired local optimality. In this talk, I will first show that, using the machinery of sparse moment and sums-of-squares (SOS) relaxations, many nonconvex trajectory optimization problems can be solved to certifiable global optimality. That is, globally optimal solutions of the original nonconvex problems can be computed by solving convex semidefinite programs (SDPs) together with optimality certificates. I will then present a specialized SDP solver, implemented in CUDA (C++) and running on GPUs, that exploits the structure of the problems to solve the convex SDPs at a scale far beyond existing solvers. Lastly, I will discuss several ongoing efforts in our group towards deploying the certifiable optimal control algorithms on real-world robots.

Short bio: Heng Yang is an Assistant Professor of Electrical Engineering in the School of Engineering and Applied Sciences (SEAS) at Harvard University. He directs the Computational Robotics Group, which focuses on the intersection of theory and practice, particularly in developing robust and efficient computational algorithms that enhance the performance of next-generation intelligent systems. Heng obtained his Ph.D. in Robotics from the Massachusetts Institute of Technology, where he collaborated with Luca Carlone in the Laboratory for Information and Decision Systems. He is also a part-time research scientist at NVIDIA Research.

Theoretical Understanding of Self-Supervised Learning

September 17, 2024

Yisen Wang

Peking University, China

Self-supervised learning (SSL) is an unsupervised approach for representation learning without relying on human-provided labels. It creates auxiliary tasks on unlabeled input data and learns representations by solving these tasks. SSL has demonstrated great success on various tasks. The existing SSL research mostly focuses on improving the empirical performance without a theoretical foundation. While the proposed SSL approaches are empirically effective on benchmarks, they are not well understood from a theoretical perspective. In this talk, I will introduce a series of our recent work on theoretical understanding of SSL, particularly on contrastive learning, masked autoencoders and autoregressive learning.

Short bio: Yisen Wang is an assistant professor at Peking University. His research interests include machine learning theory and algorithms, focusing on theoretical and algorithmic approaches for Large Language Models (Self-Supervised/Weakly-Supervised Learning, In-context Learning, Length Generalization). He has published more than 50 papers at top machine learning venues, including ICML, NeurIPS, and ICLR, many of which were selected as Oral or Spotlight presentations. He won the ECML 2021 Best Paper Award.

Constrained Structured Optimization: Formulations and Algorithms

September 11, 2024

Alberto De Marchi

UniBw Munich, Germany

Mathematical and computational tools are ubiquitous nowadays, and optimization in various disciplines leads to problems in very different settings. In this talk we discuss finite-dimensional constrained structured programming in the fully nonconvex setting, capturing a variety of problems that include nonsmooth objectives, disjunctive structures, and nonlinear, set-membership constraints. The augmented Lagrangian framework is extended to cover this broad problem class, with established asymptotic properties and convergence guarantees under mild assumptions. Then, we illustrate a technique to avoid slack variables, treating them as implicit variables and taking advantage of certain oracles available. Finally, we will indicate the theoretical challenges that arise in this unexplored territory.

Short bio: Alberto De Marchi is a Postdoctoral Research Associate at the Institute of Applied Mathematics and Scientific Computing at the University of the Bundeswehr Munich, Germany, where he received his doctorate (Dr.rer.nat.) in 2021. In Fall 2022, he was a Visiting Research Associate at Curtin University, WA, Australia. He holds a master's degree in Mechatronics Engineering (2016) and a bachelor's degree in Industrial Engineering (2014) from the University of Trento, Italy. His scientific activity revolves around computational optimization, and focuses on the design, convergence analysis, numerical properties, and implementation of algorithms for mathematical optimization.

Leveraging morphological symmetries for robot dynamics modeling and control

September 6, 2024

Daniel Ordoñez

Istituto Italiano di Tecnologia (IIT), Italy

In this presentation, we explore the implications of morphological symmetries for modeling and controlling robotic systems using analytical and data-driven methods. These symmetries refer to structural properties of a robot's morphology, which introduce relevant geometric and algebraic biases that can be leveraged in optimization and machine learning. By employing group and representation theory, we will demonstrate how these symmetries influence the system's state space, equations of motion, generalized mass matrix, and optimal control policies, as well as proprioceptive and exteroceptive sensor data measurements. Lastly, we cover theoretical and practical applications of these concepts in robotics, including supervised, unsupervised, and reinforcement learning; legged locomotion and manipulation control; and Koopman operator-based dynamics modeling.

Learning through Extreme Visual Recovery

August 26, 2024

Chang Xu

University of Sydney, Australia

Generative AI has transformed the creation of high-quality text, images, and videos, delivering impressive results. However, a decade ago, inferring data from its limited observations, such as inpainting an image using only 1% of its pixels, was a challenge. This presentation begins with classical matrix decomposition-based image recovery, leading to the idea of extreme visual recovery. We highlight the connection to masked image modeling, which can be viewed as an extreme visual recovery task due to its high level of image data obscuration. We propose a generalized approach involving noise addition and denoising, supporting unsupervised visual pre-training and aligning with diffusion model principles. This presentation will also cover our recent works on diffusion models and their intriguing synergy with reinforcement learning.

Short bio: Chang Xu is an Associate Professor and Australian Research Council Future Fellow at the School of Computer Science, University of Sydney. He received the New South Wales (NSW) Government Premier's Prize for Early Career Researcher of the Year. His research interests lie in machine learning algorithms and their applications in computer vision. He has published over 100 papers in prestigious journals and top-tier conferences and has received several paper awards, including the Distinguished Paper Award at AAAI 2023 and IJCAI 2018. He has served as an area chair at NeurIPS, ICML, ICLR, KDD, CVPR, and MM, as well as an Associate Editor at IEEE T-PAMI, IEEE T-IP, and TMLR. Additionally, he was named a Top Ten Distinguished Senior PC Member at IJCAI 2017 and an Outstanding Associate Editor at IEEE T-MM in 2022.
