\documentclass[twoside]{article}
\setlength{\oddsidemargin}{0.25 in}
\setlength{\evensidemargin}{-0.25 in}
\setlength{\topmargin}{-0.6 in}
\setlength{\textwidth}{6.5 in}
\setlength{\textheight}{8.5 in}
\setlength{\headsep}{0.75 in}
\setlength{\parindent}{0 in}
\setlength{\parskip}{0.1 in}

%
% The following commands set up the lecnum (lecture number)
% counter and make various numbering schemes work relative
% to the lecture number.
%
\newcounter{lecnum}
\renewcommand{\thepage}{\thelecnum-\arabic{page}}
\renewcommand{\thesection}{\thelecnum.\arabic{section}}
\renewcommand{\theequation}{\thelecnum.\arabic{equation}}
\renewcommand{\thefigure}{\thelecnum.\arabic{figure}}
\renewcommand{\thetable}{\thelecnum.\arabic{table}}

%
% The following macro is used to generate the header.
%
\newcommand{\lecture}[4]{
   \pagestyle{myheadings}
   \thispagestyle{plain}
   \newpage
   \setcounter{lecnum}{#1}
   \setcounter{page}{1}
   \noindent
   \begin{center}
   \framebox{
      \vbox{\vspace{2mm}
        \hbox to 6.28in { {\bf CS294~Game Theory and the Internet \hfill Fall 2003} }
        \vspace{4mm}
        \hbox to 6.28in { {\Large \hfill Lecture #1: #2 \hfill} }
        \vspace{2mm}
        \hbox to 6.28in { {\it Lecturer: #3 \hfill Scribe: #4} }
        \vspace{2mm}}
   }
   \end{center}
   \markboth{Lecture #1: #2}{Lecture #1: #2}
   {\bf Disclaimer}: {\it These notes have not been subjected to the
   usual scrutiny reserved for formal publications. They may be distributed
   outside this class only with the permission of the Instructor.}
   \vspace*{4mm}
}

%
% Convention for citations is authors' initials followed by the year.
% For example, to cite a paper by Leighton and Maggs you would type
% \cite{LM89}, and to cite a paper by Strassen you would type \cite{S69}.
% (To avoid bibliography problems, for now we redefine the \cite command.)
% Also commands that create a suitable format for the reference list.
%\renewcommand{\cite}[1]{[#1]}
\def\beginrefs{\begin{list}%
        {[\arabic{equation}]}{\usecounter{equation}
         \setlength{\leftmargin}{2.0truecm}\setlength{\labelsep}{0.4truecm}%
         \setlength{\labelwidth}{1.6truecm}}}
\def\endrefs{\end{list}}
\def\bibentry#1{\item[\hbox{[#1]}]}

%Use this command for a figure; it puts a figure in wherever you want it.
%usage: \fig{NUMBER}{SPACE-IN-INCHES}{CAPTION}
\newcommand{\fig}[3]{
   \vspace{#2}
   \begin{center}
   Figure \thelecnum.#1:~#3
   \end{center}
}

% Use these for theorems, lemmas, proofs, etc.
\newtheorem{theorem}{Theorem}[lecnum]
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{definition}[theorem]{Definition}
\newenvironment{proof}{{\bf Proof:}}{\hfill\rule{2mm}{2mm}}

% **** IF YOU WANT TO DEFINE ADDITIONAL MACROS FOR YOURSELF, PUT THEM HERE:
\newcommand{\expect}[1]{E \!\left[\,#1\right]}
\newcommand{\prob}[2]{\Pr_{#1}\left[#2\right]}
\newcommand{\st}{\hbox{ s.t. }}
\newcommand{\elt}{\in}

%% use epsf for putting pictures into the scribe notes
%% note that your latex will need the correct path to find this package
\usepackage{epsf}
\usepackage{graphicx}

\begin{document}
%FILL IN THE RIGHT INFO.
%\lecture{**LECTURE-NUMBER**}{**DATE**}{**LECTURER**}{**SCRIBE**}
\lecture{4}{9.8.03}{Christos Papadimitriou}{Brian Milch}
%% start here!

\section{Nash Equilibria in Zero-Sum Games}

\subsection{Minimax equality}

Recall Nash's theorem, which states that every finite game has a Nash equilibrium, possibly in mixed strategies. If a game has two players, we can represent it using a payoff matrix $A$ for player 1 and a payoff matrix $B$ for player 2. Nash's theorem implies the following corollary for the special case of 2-person \emph{zero-sum games}, where $B = -A$.
\begin{corollary}[Minimax equality]
\label{cor:minimax}
In any 2-person zero-sum game:
\begin{equation}
\label{eq:minimax}
\max_x \min_y\, xAy = \min_y \max_x\, xAy
\end{equation}
where $x$ and $y$ are probability distributions over actions for players 1 and 2, respectively.
\end{corollary}

To understand this, remember that $xAy$ is player 1's expected payoff when the strategy profile is $(x, y)$. We obtain the left-hand side of Eq.~\ref{eq:minimax} as follows: if player 1 chooses any (possibly mixed) strategy $x$, player 2 will choose a strategy to give player 1 the lowest expected payoff, $\min_y xAy$. So player 1 must choose a strategy to maximize this value. We get the right-hand side by thinking about the players in reverse order; the corollary says we get the same result either way.

\begin{proof}
By Nash's theorem, the game has a Nash equilibrium; call it $(x^*, y^*)$. Because $x^*$ must be a best response to $y^*$ for player 1, we have:
\begin{equation}
\label{eq:max}
x^* A y^* = \max_x xAy^* \geq \max_x \min_y xAy \geq \min_y x^*Ay
\end{equation}
The first inequality holds because $\min_y xAy \leq xAy^*$ for every $x$, and the second because a maximum over $x$ is at least the value at $x^*$. We also know that $y^*$ maximizes $x^*By$. But since the game is zero-sum, this means $y^*$ minimizes $x^*Ay$. Thus, by the symmetric argument:
\begin{equation}
\label{eq:min}
x^* A y^* = \min_y x^*Ay \leq \min_y \max_x xAy \leq \max_x xAy^*
\end{equation}
Thus, both $\max_x \min_y xAy$ and $\min_y \max_x xAy$ are lower-bounded by $\min_y x^*Ay$ and upper-bounded by $\max_x xAy^*$. But these lower and upper bounds both equal $x^* A y^*$, so we have shown the desired equality.
\end{proof}

\pagebreak

\subsection{Derivation from LP duality}

An example of a zero-sum game is rock-paper-scissors, where player 1's payoffs are:
\[
\left[ \begin{array}{rrr}
0 & 1 & -1 \\
-1 & 0 & 1 \\
1 & -1 & 0
\end{array} \right]
\]
Let $z$ be the expected payoff of player 1's best move against player 2's mixed strategy $y$.
Then player 2 faces the following optimization problem:
\[
\begin{array}{ll}
\min_y z & \st \\
 & \begin{array}{rcrcrl}
     &   & y_2 & - & y_3 & \leq z \\
-y_1 &   &     & + & y_3 & \leq z \\
 y_1 & - & y_2 &   &     & \leq z \\
 y_1 & + & y_2 & + & y_3 & = 1 \\
 y_1 & , & y_2 & , & y_3 & \geq 0
\end{array}
\end{array}
\]
The first three constraints say that none of player 1's moves yields an expected payoff greater than $z$; the remaining constraints say that $y$ is a probability distribution over player 2's moves. So we have a linear program. The dual of this LP is:
\[
\begin{array}{ll}
\max_x w & \st \\
 & \begin{array}{rcrcrl}
     &   & -x_2 & + & x_3 & \geq w \\
 x_1 &   &      & - & x_3 & \geq w \\
-x_1 & + & x_2  &   &     & \geq w \\
 x_1 & + & x_2  & + & x_3 & = 1 \\
 x_1 & , & x_2  & , & x_3 & \geq 0
\end{array}
\end{array}
\]
We can interpret $w$ as player 1's expected payoff when player 2 plays his best move against $x$. By LP duality, the optimum value of this dual LP is the same as that of the primal LP. The same construction works for any payoff matrix $A$, so we see that Corollary~\ref{cor:minimax} follows from LP duality; we don't need to use Nash's theorem for the proof. In fact, the minimax equality was proven by von Neumann in 1928, decades before Nash's work.

\section{Correlated Equilibria}

\subsection{The chicken game and joint randomization}

In the \emph{chicken game}, each player decides whether to chicken out or dare to keep going. If one player dares and the other chickens, the daring one wins. But if both players dare, the outcome is disastrous for both, whereas if both chicken, the outcome is OK.
Specifically, the payoff matrix is:
\[
\begin{array}{r|c|c|}
 & \textrm{chicken} & \textrm{dare} \\ \hline
\textrm{chicken} & 4,4 & 1,5 \\ \hline
\textrm{dare} & 5,1 & 0,0 \\ \hline
\end{array}
\]
This game has three Nash equilibria:
\begin{itemize}
\item player 1 dares and player 2 chickens: payoffs 5,1
\item player 2 dares and player 1 chickens: payoffs 1,5
\item $(\frac{1}{2}, \frac{1}{2}) \times (\frac{1}{2}, \frac{1}{2})$, where each player chickens or dares with probability $\frac{1}{2}$: payoffs 2.5, 2.5
\end{itemize}
Note that with this payoff matrix, daring is not very lucrative: compared with both players chickening, it gives the winner a bonus of only 1.

Suppose the players cannot sign a binding contract not to dare. However, there is a publicly observable random event, such as a coin toss, that comes up ``heads'' 50\% of the time and ``tails'' 50\% of the time. The players can agree that player 1 will dare if it's ``heads'', player 2 will dare if it's ``tails'', and the other player will chicken in each case. This agreement can hold with no external enforcement: no player has an incentive to dare when he should chicken (because the payoff for chickening when the other player dares is greater than the payoff for both daring), and no player has an incentive to chicken when he should dare (because $5 > 4$). With this agreement, the expected payoffs are $3,3$.

Can we obtain an even better pair of payoffs? Suppose a third party chooses a cell in the game matrix according to the distribution:
\[
\begin{array}{r|c|c|}
 & \textrm{chicken} & \textrm{dare} \\ \hline
\textrm{chicken} & 1/3 & 1/3 \\ \hline
\textrm{dare} & 1/3 & 0 \\ \hline
\end{array}
\]
The third party then tells each player what to play, without revealing what cell was chosen. Then no player has an incentive to disobey the third party: a player told to dare knows the other player was told to chicken, and $5 > 4$; a player told to chicken believes the other player chickens or dares with probability $\frac{1}{2}$ each, so chickening and daring both have expected payoff $2.5$. The expected payoffs are $3 \frac{1}{3}, 3 \frac{1}{3}$.

\subsection{Correlated equilibrium}

These agreements are examples of correlated equilibria. In a correlated equilibrium, someone tosses a coin (or rolls a die, etc.) and gives each player partial information about the result, in the form of a recommendation of how to play.
This is an equilibrium in the sense that no player has an incentive to disobey the recommendation.\footnote{It can be shown that if there are at least 4 players, ``private lines'' are not necessary: the result of the coin toss can be made public.} More formally:

\begin{definition}
A \emph{correlated equilibrium} $p$ is a distribution on the joint strategy set $S_1 \times S_2 \times \cdots \times S_n$ such that $\forall i$, $\forall t \neq t' \in S_i$:
\[
\sum_{\vec{s}_{-i}} p(\vec{s}_{-i}, t) \cdot u_i(\vec{s}_{-i}, t) \geq \sum_{\vec{s}_{-i}} p(\vec{s}_{-i}, t) \cdot u_i(\vec{s}_{-i}, t')
\]
That is, for each recommendation $t$, player $i$ has no incentive to deviate to any $t'$.
\end{definition}

A Nash equilibrium is simply a correlated equilibrium where $p$ is a product of independent distributions, one for each player. So Nash's theorem implies that a correlated equilibrium exists in every game. Moreover, it turns out that we can find a correlated equilibrium in polynomial time, because the constraints in the definition are linear in the entries of $p$. For example, in the chicken game, let $(x,y,z,w)$ be the entries in the probability distribution $p$:
\[
\begin{array}{r|c|c|}
 & \textrm{chicken} & \textrm{dare} \\ \hline
\textrm{chicken} & 4,4 \quad x & 1,5 \quad y \\ \hline
\textrm{dare} & 5,1 \quad z & 0,0 \quad w \\ \hline
\end{array}
\]
If we want to maximize, for example, the sum of expected payoffs to all players, we get a linear program:
\[
\begin{array}{ll}
\max_{x,y,z,w} 8x + 6y + 6z & \st \\
 & \begin{array}{l}
4x + 1y \geq 5x + 0y \\
\vdots \\
\textrm{(one such linear inequality for each $i$ and each $t \neq t' \in S_i$)} \\
x + y + z + w = 1 \\
x, y, z, w \geq 0
\end{array}
\end{array}
\]
The inequality shown says that player 1, when told to chicken, does not gain by daring. Solving this LP shows that $(1/3, 1/3, 1/3, 0)$ is the best correlated equilibrium according to this criterion.

The prisoners' dilemma game has the matrix:
\[
\begin{array}{r|c|c|}
 & \textrm{cooperate} & \textrm{defect} \\ \hline
\textrm{cooperate} & 4,4 & 1,5 \\ \hline
\textrm{defect} & 5,1 & 2,2 \\ \hline
\end{array}
\]
Note that the payoff for cooperating when the other player defects is now \emph{lower} than the payoff for both defecting.
In this game, the only correlated equilibrium is the Nash equilibrium (defect, defect).

\subsection{Binding contracts}

What if the players can make binding contracts? We can model these by adding new pure strategies to the game. For example, the matrix below shows what happens when we add to the prisoners' dilemma a contract that says both players will cooperate:
\[
\begin{array}{r|c|c|c|}
 & \textrm{cooperate} & \textrm{defect} & \textrm{sign} \\ \hline
\textrm{cooperate} & 4,4 & 1,5 & 1,5 \\ \hline
\textrm{defect} & 5,1 & 2,2 & 2,2 \\ \hline
\textrm{sign} & 5,1 & 2,2 & 4,4 \\ \hline
\end{array}
\]
The contract has force only if both players sign; if one signs and the other doesn't, then the signer gets to observe the other player's move and choose his best response. Note that signing is a (weakly) dominant strategy for each player in this new game. But the payoffs 4,4 were not achieved by any equilibrium (even a correlated one) in the original game, so binding contracts are strictly more powerful than at-will coordination.

In general, a contract specifies a rule, ``if this subset of the players signs, then they play this (possibly mixed) strategy vector'', for each possible subset of the players. So in principle, we can construct a game with a pure strategy for signing each of the infinitely many possible contracts.

The complete set of expected payoff vectors achievable by mixed strategies (not necessarily equilibria) in the prisoners' dilemma is outlined by the dotted diamond in Fig.~\ref{fig:payoffs}. The vertical and horizontal lines show the payoffs that players 1 and 2 can guarantee for themselves without a contract (by always defecting). Using contracts, we can obtain an equilibrium at any point in the shaded polygon: these points are called \emph{individually rational outcomes}. The idea is that accepting a contract is rational for a player if it gives him a better expected payoff than he could guarantee himself on his own.
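
As a rough check of these guarantee levels, here is a short calculation based on the prisoners' dilemma matrix given earlier. The symbols $p$, $q$, $u_1$, $u_2$ and $v_1$ are introduced only for this sketch and are not part of the lecture's notation. Suppose player 1 cooperates with probability $p$ and player 2 cooperates with probability $q$, independently. Then the expected payoffs are:
\begin{eqnarray*}
u_1(p,q) & = & 4pq + 1 \cdot p(1-q) + 5(1-p)q + 2(1-p)(1-q) \;=\; 2 - p + 3q \\
u_2(p,q) & = & 4pq + 5 \cdot p(1-q) + 1 \cdot (1-p)q + 2(1-p)(1-q) \;=\; 2 + 3p - q
\end{eqnarray*}
The payoff that player 1 can guarantee himself, whatever player 2 does, is therefore:
\[
v_1 = \max_{p \in [0,1]} \min_{q \in [0,1]} (2 - p + 3q) = \max_{p \in [0,1]} (2 - p) = 2
\]
which is attained at $p = 0$, that is, by always defecting. By symmetry, player 2 can also guarantee himself a payoff of 2. Under this reading, the individually rational outcomes would be the achievable payoff vectors in which each player gets at least 2: for example, $(4,4)$ qualifies, while $(1,5)$ does not, since player 1 does better on his own.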
\begin{figure}[b]
\centerline{ \epsfbox{coord.eps} }
\caption{Payoff vectors for the prisoners' dilemma.}
\label{fig:payoffs}
\end{figure}

\end{document}