Our current research focuses primarily on deep learning for robotics, where learning could be from demonstrations (apprenticeship learning) or through the robot's own trial and error (reinforcement learning). Targeted application domains include autonomous manipulation, flight, locomotion, and driving.


As more data is produced and more computational power becomes available, an important opportunity lies in harnessing data towards autonomy. In recent years computer vision and speech recognition have not only made significant leaps forward, but also rapidly increased their rate of progress, largely thanks to developments in deep learning [Nature 2015, Deep Learning]. While deep learning isn't necessarily the only way to harness the ever-growing amounts of data, it is the first, and currently only, machine learning approach that has demonstrated the ability to continue to improve performance on real-world problems as more data and more compute cycles are made available. More traditional approaches, by contrast, have tended to saturate at some level of performance. The amount of data these deep neural nets are trained with is very large. For example, the landmark [Krizhevsky et al. 2012] paper (not even three years old, and already cited over 1700 times), which was the first to demonstrate deep learning outperforming more traditional approaches to computer vision, and significantly so, processed 200 billion images during training (amplified by shifting and re-coloring from an original labeled data set of 1.2 million images).
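To make the "shifting and re-coloring" amplification concrete, here is a minimal sketch in plain Python. It is an illustration only, not the scheme used in [Krizhevsky et al. 2012]: the function name `augment`, the crop size, and the per-channel brightness scaling are all assumptions chosen for brevity.

```python
import random

def augment(image, crop, rng):
    """Return one randomly shifted and re-colored copy of `image`.

    `image` is a list of rows; each row is a list of (r, g, b) tuples
    with integer values in 0..255. All names here are hypothetical.
    """
    height, width = len(image), len(image[0])
    # Shifting: pick a random crop window inside the original image.
    top = rng.randint(0, height - crop)
    left = rng.randint(0, width - crop)
    # Re-coloring (simplified assumption): scale each channel slightly.
    scale = [rng.uniform(0.9, 1.1) for _ in range(3)]
    return [
        [tuple(min(255, max(0, round(c * s))) for c, s in zip(pixel, scale))
         for pixel in row[left:left + crop]]
        for row in image[top:top + crop]
    ]

# A tiny 8x8 synthetic image; each call yields a different 6x6 variant.
rng = random.Random(0)
image = [[(10 * x, 10 * y, 128) for x in range(8)] for y in range(8)]
variants = [augment(image, crop=6, rng=rng) for _ in range(4)]
```

Because every call draws a new window and new color scales, one labeled image can stand in for many distinct training inputs, all sharing the original label.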

Thus far the impact of deep learning has largely been in so-called supervised learning, of which image recognition and speech recognition are examples. In supervised learning one receives example inputs (e.g., images) and corresponding labels (e.g., 'cat', 'dog', etc., depending on what's in the image). The system is then supposed to learn to make correct label predictions on future inputs. Autonomous systems aren't simply presented with a set of inputs for which they need to predict a label. Rather, they are presented with inputs and, based on those inputs, need to take actions, which will in turn affect the next inputs encountered, and so forth. In this process the autonomous system will typically be expected to optimize some performance metric, yet when starting out it will not yet know the best strategy to do so. It might be expected to figure out this strategy on its own through trial and error, or through collecting data that illustrates solutions, e.g., from how-to videos on YouTube or from watching a live human demonstration.
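The interaction loop described above, in which actions change the inputs that come next and a strategy is found by trial and error, can be sketched on a toy problem. This is an illustrative sketch only: `ChainEnv`, `train`, and `greedy_policy` are hypothetical names, and tabular Q-learning is used here as a standard textbook stand-in, not as the method used in our work.

```python
import random

class ChainEnv:
    """Hypothetical toy world: positions 0..n-1, reward at the right end."""
    def __init__(self, n=5):
        self.n = n
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action 0 moves left, action 1 moves right; the action taken
        # determines the next input (position) the learner will see.
        move = 1 if action == 1 else -1
        self.pos = min(self.n - 1, max(0, self.pos + move))
        done = self.pos == self.n - 1
        return self.pos, (1.0 if done else 0.0), done

def train(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
          seed=0, max_steps=100):
    """Trial-and-error learning with tabular Q-learning (epsilon-greedy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Explore at random sometimes, and whenever estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go right)."""
    return [0 if left > right else 1 for left, right in q]

policy = greedy_policy(train(ChainEnv()))
```

Unlike a supervised learner, this system is never told the correct action for a given state; it only observes a reward after acting, and the data it collects depends on its own earlier choices.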

While many advances and discoveries will need to be made, it is far from inconceivable that advancing deep learning to make it applicable to autonomous robotics could prove as transformative as it has been for computer vision and speech recognition. Preliminary results on learning to play Atari games at human level [Nature 2015, DeepMind] and learning real-world visuomotor control policies [ICRA 2015, late-breaking] further reinforce this outlook.


Deep Learning, Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Nature, May 2015.

ImageNet classification with deep convolutional neural networks, Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. Advances in Neural Information Processing Systems, 2012.

Human-level control through deep reinforcement learning, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis. Nature, February 2015.

End-to-End Training of Deep Visuomotor Policies, Sergey Levine*, Chelsea Finn*, Trevor Darrell, Pieter Abbeel. Presented at the IEEE International Conference on Robotics and Automation (ICRA) 2015 Late Breaking Results Session, May 2015.


A recent talk by my post-doc Sergey Levine, which covers some of the more recent work in my group (UW, 2015/3/18).

Here is a talk by me covering some of our older/earlier work (CMU, 2013/10/18).