Robotic Manipulation

Perception, Planning, and Control

Russ Tedrake

© Russ Tedrake, 2020-2024
Last modified .
How to cite these notes, use annotations, and give feedback.

Note: These are working notes used for a course being taught at MIT. They will be updated throughout the Fall 2024 semester.

PDF version of the notes

You can also download a PDF version of these notes (updated much less frequently) from here.

The PDF version of these notes are autogenerated from the HTML version. There are a few conversion/formatting artifacts that are easy to fix (please feel free to point them out). But there are also interactive elements in the HTML version are not easy to put into the PDF. When possible, I try to provide a link. But I consider the online HTML version to be the main version.

Table of Contents

You can find documentation for the source code supporting these notes here.

Preface

I've always loved robots, but it's only relatively recently that I've turned my attention to robotic manipulation. I particularly like the challenge of building robots that can master physics to achieve human/animal-like dexterity and agility. It was passive dynamic walkers and the beautiful analysis that accompanies them that first helped cement this centrality of dynamics in my view of the world and my approach to robotics. From there I became fascinated with (experimental) fluid dynamics, and the idea that birds with articulated wings actually "manipulate" the air to achieve incredible efficiency and agility. Humanoid robots and fast-flying aerial vehicles in clutter forced me to start thinking more deeply about the role of perception in dynamics and control. Now I believe that this interplay between perception and dynamics is truly fundamental, and I am passionate about the observation that relatively "simple" problems in manipulation (how do I button up my dress shirt?) expose the problem beautifully.

My approach to programming robots has always been very computational/algorithmic. I started out using tools primarily from machine learning (especially reinforcement learning) to develop the control systems for simple walking machines; but as the robots and tasks got more complex I turned to more sophisticated tools from model-based planning and optimization-based control. In my view, no other discipline has thought so deeply about dynamics as has control theory, and the algorithmic efficiency and guaranteed performance/robustness that can be obtained by the best model-based control algorithms far surpasses what we can do today with learning control. Unfortunately, the mathematical maturity of controls-related research has also led the field to be relatively conservative in their assumptions and problem formulations; the requirements for robotic manipulation break these assumptions. For example, robust control typically assumes dynamics that are (nearly) smooth and uncertainty that can be represented by simple distributions or simple sets; but in robotic manipulation, we must deal with the non-smooth mechanics of contact and uncertainty that comes from varied lighting conditions, and different numbers of objects with unknown geometry and dynamics. In practice, no state-of-the-art robotic manipulation system to date (that I know of) uses rigorous control theory to design even the low-level feedback that determines when a robot makes and breaks contact with the objects it is manipulating. An explicit goal of these notes is to try to change that.

In the past few years, deep learning has had an unquestionable impact on robotic perception, unblocking some of the most daunting challenges in performing manipulation outside of a laboratory or factory environment. We will discuss relevant tools from deep learning for object recognition, segmentation, pose/keypoint estimation, shape completion, etc. Now relatively old approaches to learning control are also enjoying an incredible surge in popularity, fueled in part by massive computing power and increasingly available robot hardware and simulators. Unlike learning for perception, learning control algorithms are still far from a technology, with some of the most impressive looking results still being hard to understand and to reproduce. But the recent work in this area has unquestionably highlighted the pitfalls of the conservatism taken by the controls community. Learning researchers are boldly formulating much more aggressive and exciting problems for robotic manipulation than we have seen before -- in many cases we are realizing that some manipulation tasks are actually quite easy, but in other cases we are finding problems that are still fundamentally hard.

Finally, it feels that the time is ripe for robotic manipulation to have a real and dramatic impact in the world, in fields from logistics to home robots. Over the last few years, we've seen UAVs/drones transition from academic curiosities into consumer products. Even more recently, autonomous driving has transitioned from academic research to industry, at least in terms of dollars invested. Manipulation feels like the next big thing that will make the leap from robotic research to practice. It's still a bit risky for a venture capitalist to invest in, but nobody doubts the size of the market once we have the technology. How lucky are we to potentially be able to play a role in that transition?

So this is where the notes begin... we are at an incredible crossroads between learning and control and robotics with an opportunity to have immediate impact in industrial and consumer applications and potentially even to forge entirely new eras for systems theory and controls. I'm just trying to hold on and to enjoy the ride.

A manipulation toolbox

Another explicit goal of these lecture notes is to provide high-quality implementations of the most useful tools in a manipulation scientist's toolbox. When I am forced to choose between mathematical clarity and runtime performance, the clear formulation is always my first priority; I will try to include a performant formulation, too, if possible or try to give pointers to alternatives. Manipulation research is moving quickly, and I aim to evolve these notes to keep pace. I hope that the software components provided in and in these notes can be directly useful to you in your own work.

If you would like to replicate any or all of the hardware that we use for these notes, you can find information and instructions in the appendix.

As you use the code, please consider contributing back (especially to the mature code in ). Even questions/bug reports can be important contributions. If you have questions/find issues with these notes, please submit them here.

First chapter