University of California at Berkeley
Dept of Electrical Engineering & Computer Sciences

CS 287: Advanced Robotics, Fall 2009

Instructor: Pieter Abbeel
Lectures: Tuesdays and Thursdays, 12:30pm-2:00pm, 405 Soda Hall
Office Hours: Thursdays 2:00-3:00pm (and by email arrangement) in 746 Sutardja Dai Hall



Course description

A tentative list of topics includes (subject to substantial change!):


Prerequisites

Familiarity with mathematical proofs, probability, algorithms, linear algebra; ability to implement algorithmic ideas in code.

Consent of instructor required for undergraduate students.


Assignment policy

Syllabus and materials

  1. AM: Astrom and Murray, Feedback Systems, pdf
  2. T: Tedrake, Underactuated Robotics, Course Notes for MIT 6.832. August 2009 snapshot (referenced on this page and in the slides) [Here is a link to the current working draft]
  3. SB: Sutton and Barto, Reinforcement Learning, html
  4. BT: Bertsekas and Tsitsiklis, Neuro-dynamic programming
  5. TBF: Thrun, Burgard, Fox, Probabilistic Robotics
  6. SL [optional readings]: Slotine and Li, Applied Nonlinear Control --- great read on the topic
Lecture Topic Notes Readings Optional/Additional Readings Videos
Th Aug 27 Course introduction. 2pp 6pp
Tu Sep 1 Feedforward, Feedback, PID, Control of fully actuated systems 2pp 6pp code AM 10.3, 10.4; T 1.2
Th Sep 3 Lyapunov, (Energy pumping) 2pp 6pp T Ch. 3; SL, Example 3.21
Tu Sep 8 Optimal Control, HJB, Discretization 2pp 6pp T Ch. 6; SB Ch. 4; Munos and Moore, MLJ 2001 pp.1-9 pdf; Chow and Tsitsiklis, 1991 pdf Kushner and Dupuis, 1992/2001
Th Sep 10 Dynamic programming with function approximation 2pp 6pp Gordon, 1995 pdf; Tsitsiklis and Van Roy, 1996 pdf
Tu Sep 15 No lecture.
Th Sep 17 Dynamic programming with function approximation; speed-ups/tweaks 2pp 6pp Gordon, 1995 pdf; Tsitsiklis and Van Roy, 1996 pdf BT 6.5, 6.1; Moore and Atkeson, Prioritized sweeping pdf
Tu Sep 22 LQR + variations 2pp 6pp Tedrake, LQR trees pdf, Atkeson and Stephens pdf, Todorov, 2005 pdf, Anderson and Moore, Optimal Control: Linear Quadratic Methods
Th Sep 24 MPC, feedback linearization draft slides: 2pp 6pp Tedrake Ch. 9 and App. A Diehl +al., MPC overview pdf; John T. Betts, "Practical Methods for Optimal Control Using Nonlinear Programming," 2001; Slotine and Li Chapter 6; Isidori, "Nonlinear control systems," 1989.
Tu Sep 29 Bandits draft notes: pdf, draft slides: 2pp 6pp Regret-based approaches: Lai and Robbins, 1985; Auer +al, UCB algorithm 1998 pdf; papers on Bayesian exploration in MDPs: Poupart+al, Asmuth+al, Kolter+Ng
Thu Oct 1 Policy iteration. To read ahead for this and the next few lectures, see Chapters 4, 6, 7, 8, 11.1 of SB. SB Chapters 4, 5, 6, 7, 8, 11.1 html
Tue Oct 6 Example MDPs, Recap some exact methods for MDPs 2pp 6pp SB Chapter 4
Thu Oct 8 Linear programming; Model-free methods (TD) 2pp 6pp LP notes SB Chapter 6
Tu Oct 13 TD, Sarsa, Q-learning, TD(\lambda) 2pp 6pp SB Chapters 6, 7
Thu Oct 15 TD with function approximation, TD Gammon, other examples 2pp 6pp SB Chapters 8, 11.1 Tsitsiklis and Van Roy, 1997 pdf
Tu Oct 20 LSTD, LSPI, RLSTD, behavioral cloning 2pp 6pp LSPI pdf, Bradtke and Barto, 1996, LSTD pdf, Kolter and Ng, Feature selection in LSTD pdf
Thu Oct 22 Behavioral cloning, Inverse RL 2pp 6pp More complete slides on Inverse RL from Robot Learning Summer School, 2009 pdf
Tu Oct 27 Inverse RL wrap-up, Policy search 2pp 6pp
Thu Oct 29 Policy search & Actor-Critic 2pp 6pp Peters and Schaal, IROS 2006, Policy gradient methods for robotics; Ng and Jordan UAI 2000, an analysis of fixing the random seed in policy search
Tu Nov 3 An application of likelihood ratio methods: Learning to walk 2pp 6pp Lecture by Russ Tedrake on learning to walk; Tedrake, Zhang and Seung, Learning to walk in 20 minutes; toddler, tinker-toy, Cornell kneed-walker
Thu Nov 5 Natural gradient; briefs on various topics incl. approximate LP, POMDPs, reward shaping, exploration vs. exploitation, hierarchical methods 2pp 6pp Kakade, NIPS 2002 Natural policy gradient; Peters and Schaal natural actor critic; Calafiore and Campi constraint sampling; de Farias and Van Roy constraint sampling ALP; Kearns and Singh E3; Ng, Harada and Russell reward shaping; Wiewiora reward shaping equivalence with V, Q initialization; Marthi, Russell and Andre hierarchical Q decomposition; Kober, Peters ball in a cup (NIPS 2008) ball-in-a-cup-video
Tue Nov 10 State estimation: HMM, KF 2pp 6pp Probabilistic Robotics Chapters 1, 2, 3 From Gauss to Kalman
Thu Nov 12 State estimation: KF, EKF, UKF 2pp 6pp Probabilistic Robotics Chapter 3 (Gaussian filters), 10 (EKF SLAM), Julier and Uhlmann, the UKF
Tu Nov 17 State estimation: UKF, particle filter, mapping, localization, SLAM 2pp 6pp Probabilistic Robotics Chapter 4 (particle filters), 6 (robot perception), 8 (localization); particle filter tutorial
Thu Nov 19 State estimation: mapping, SLAM 2pp 6pp Probabilistic Robotics Chapters 9 (occupancy grid mapping), 13 (fastSLAM), 11 (graphSLAM) Bailey and Durrant-Whyte SLAM tutorial: part 1; part 2
Tue Nov 24 Quadruped locomotion --- Guest Lecturer: J. Zico Kolter speaker bio 2009 slides 2008 slides
Thu Nov 26 Happy Thanksgiving!
Tue Dec 1 Project presentations. Barron, Smith, Liu, Kolev, Lin, Strausser, Chang-Siu, Moldovan, Soerensen
Thu Dec 3 Project presentations. Hoburg, Weekly, Song, Swift, Hunter, Javdani, Berg Kirkpatrick, Singh+Tang, Maitin-Shepard

Related courses