CS 294-125, Spring 2016: Human-compatible AI
Reading list

This list is still under construction. An empty bullet item indicates more readings to come for that week.
Books

Artificial Intelligence: A Modern Approach, 3rd edition by Stuart Russell and Peter Norvig. Pearson, 2010. (a.k.a. R+N)

Week 1 (1/26): Markov decision processes
TODO

R+N, 2, 17.1-17.4
Ch. 2 explores the range of environment types, agent types, and agent-environment relationships. Ch. 17 deals with MDPs and solution algorithms.

Week 2 (2/2): Reinforcement learning, multi-attribute utility theory, preference elicitation

TODO
R+N 16.1-16.5
(Multi-attribute) Utility Theory and Influence Diagrams
R+N 21.1-21.5
Background on reinforcement learning

Week 3 (2/9): Goal inference

Meltzoff, Andrew N. "Understanding the intentions of others: re-enactment of intended acts by 18-month-old children." Developmental psychology 31.5 (1995): 838.

Baker, Chris L., Joshua B. Tenenbaum, and Rebecca R. Saxe. "Goal inference as inverse planning." Proceedings of the 29th Annual Meeting of the Cognitive Science Society. 2007.

Ziebart, Brian D., et al. "Planning-based prediction for pedestrians." In Proc. IROS 2009.

Week 4 (2/16): Human preferences

Kushnir, Tamar, Fei Xu, and Henry M. Wellman. "Young children use statistical sampling to infer the preferences of other people." Psychological Science 21.8 (2010): 1134-1140.
Lucas, C. G., Griffiths, T. L., Xu, F., Fawcett, C., Gopnik, A., Kushnir, T., Markson, L., & Hu, J. (2014). "The child as econometrician: A rational model of preference understanding in children." PLOS One, 9(3), e92160.
Daniel Kahneman and Amos Tversky (1979). Prospect Theory: An Analysis of Decision under Risk. Econometrica, 47(2), pp. 263-291.

Week 5 (2/23): Collaborative systems

Fern, A., Natarajan, S., Tadepalli, P., and Judah, K. (2014). A Decision-Theoretic Model of Assistance. Journal of Artificial Intelligence Research, 50.

Dragan, Anca D., and Siddhartha S. Srinivasa. "Formalizing assistive teleoperation." MIT Press, July, 2012.
Liu, C. et al. "A Framework for Autonomous Vehicles With Goal Inference and Task Allocation Capabilities to Support Peer Collaboration With Human Agents." ASME 2014 Dynamic Systems and Control Conference. American Society of Mechanical Engineers, 2014.

Week 6 (3/1): Psychology of moral decisions

Cushman, F. "Action, outcome, and value: A dual-system framework for morality." Personality and social psychology review 17.3 (2013): 273-292.
Greene, Joshua D. "Beyond point-and-shoot morality: Why Cognitive (Neuro)Science Matters for Ethics." Ethics 124(4), 695-726, 2014.

[Optional extra: Greene, Joshua D., et al. "An fMRI investigation of emotional engagement in moral judgment." Science 293.5537 (2001): 2105-2108.]

[Optional extra: Lieder, F., et al. "Algorithm selection by rational metareasoning as a model of human strategy selection." In NIPS-14]

Week 7 (3/8): Inverse reinforcement learning

Andrew Ng and Stuart Russell. "Algorithms for Inverse Reinforcement Learning." In International Conference on Machine Learning, 2000.
Ratliff, Nathan D., J. Andrew Bagnell, and Martin A. Zinkevich. "Maximum margin planning." Proceedings of the 23rd international conference on Machine learning. ACM, 2006.

Week 8 (3/15): Inverse reinforcement learning (cont'd)

Ziebart, Brian D., et al. "Maximum Entropy Inverse Reinforcement Learning." AAAI. 2008.

Deepak Ramachandran and Eyal Amir, Bayesian Inverse Reinforcement Learning. IJCAI 2007.

(Optional) David Silver et al., Mastering the game of Go with deep neural networks and tree search. Nature, 529, 484-489, 2016.

Week 9 (3/22):
Spring Break

Week 10 (3/29): Multiagent Sequential Decision Making

Oliehoek, Frans A. and Amato, Christopher. "Dec-POMDPs as Non-Observable MDPs". IAS Technical Report, 2014.

Boutilier, Craig "Planning, Learning and Coordination in Multiagent Decision Processes". Theoretical Aspects of Rationality and Knowledge, 1996.

(Optional) Boutilier, Craig "Sequential Optimality and Coordination in Multiagent Systems". IJCAI, 1999.
This goes into more detail on solution algorithms for MMDPs that track the coordination state. This is related to the Dec-POMDP solution algorithms.

(optional) Dibangoye, Jilles S., et al. "Optimally Solving Dec-POMDPs as Continuous-State MDPs". JAIR, 2016.
In-depth writeup of state-of-the-art Dec-POMDP algorithms. Long, but quite thorough.

Week 11 (4/5): Game theory

R+N 17.5, 17.6
Introduction to game theory (written for AI community), mechanism design

Gibbons, Robert "An Introduction to Applicable Game Theory". Journal of Economic Perspectives, 1997.
Introduction to game theory (written for Econ community).

Gibbons, Robert "Lecture Note 1: Agency Theory" MIT 15.903 2010.
Introduction to principal-agent models from economics.

Week 12 (4/12): Inverse games

Kevin Waugh, Brian D. Ziebart, and J. Andrew Bagnell, Computational Rationalization: The Inverse Equilibrium Problem. In Proc. ICML, 2011.

(Optional) Bestick, Aaron, et al. "An inverse correlated equilibrium framework for utility learning in multiplayer, noncooperative settings." Proceedings of the 2nd ACM international conference on High confidence networked systems. ACM, 2013.

Volodymyr Kuleshov and Okke Schrijvers, "Inverse game theory." Web and Internet Economics, 2015.

Week 13 (4/19): Embedded reinforcement learning, Baldwinian evolution

Mark Ring and Laurent Orseau, "Delusion, Survival, and Intelligent Agents." In Proc. AGI, 2011.
Describes a possible difficulty with reward-based agents, wherein the agent builds a delusion box that produces fake rewards that make it happy.
(optional) Daniel Dewey, "Learning What to Value." In Proc. AGI, 2011.
Argues that wireheading arises from RL formulations and proposes instead an approach based on learning an initially unknown utility function.
(optional) Bill Hibbard, "Model-based Utility Functions." JAGI, 3(1), 1-24, 2012.
Proposes and analyzes a solution to the wireheading problem based on utility functions that depend on unobserved state variables whose values the agent must infer.
(optional) Laurent Orseau and Mark Ring, "Space-Time Embedded Intelligence." Proc. AGI, 2012.
Defines a very general notion of rationality for agents whose computational substrate is part of the environment they inhabit.

David Ackley and Michael Littman, Interactions between learning and evolution. In Proc. Artificial Life II, 1991.
Discusses the origin of reward functions and how learning speeds up evolution, clarifying the Baldwin effect first proposed in 1896.

Week 14 (4/26): Corrigibility

Soares, Nate, et al. "Corrigibility." Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.

Week 15:
Reading/Review/Recitation
