Program equilibrium: Difference between revisions

From testwiki
Jump to navigation Jump to search
imported>LR.127
Changing short description from "Program equilibrium" to "Game theory"
 
(No difference)

Latest revision as of 01:28, 5 September 2024

Template:Short description

Program equilibrium is a game-theoretic solution concept for a scenario in which players submit computer programs to play the game on their behalf and the programs can read each other's source code. The term was introduced by Moshe Tennenholtz in 2004.[1] The same setting had previously been studied by R. Preston McAfee,[2] J. V. Howard[3] and Ariel Rubinstein.[4]

Setting and definition

The program equilibrium literature considers the following setting. Consider a normal-form game as a base game. For simplicity, consider a two-player game in which S1 and S2 are the sets of available strategies and u1 and u2 are the players' utility functions. Then we construct a new (normal-form) program game in which each player i chooses a computer program pi. The payoff (utility) for the players is then determined as follows. Each player's program pi is run with the other program pi as input and outputs a strategy si for Player i. For convenience one also often imagines that programs can access their own source code.Template:Refn Finally, the utilities for the players are given by ui(s1,s2) for i=1,2, i.e., by applying the utility functions for the base game to the chosen strategies.

One has to further deal with the possibility that one of the programs pi doesn't halt. One way to deal with this is to restrict both players' sets of available programs to prevent non-halting programs.[1][5]

A program equilibrium is a pair of programs (p1,p2) that constitute a Nash equilibrium of the program game. In other words, (p1,p2) is a program equilibrium if neither player i can deviate to an alternative program pi such that their utility is higher in (pi,pi) than in (p1,p2).

Instead of programs, some authors have the players submit other kinds of objects, such as logical formulas specifying what action to play depending on an encoding of the logical formula submitted by the opponent.[6][7]

Different mechanisms for achieving cooperative program equilibrium in the Prisoner's Dilemma

Various authors have proposed ways to achieve cooperative program equilibrium in the Prisoner's Dilemma.

Cooperation based on syntactic comparison

Multiple authors have independently proposed the following program for the Prisoner's Dilemma:[1][3][2]

algorithm CliqueBot(opponent_program):
    if opponent_program == this_program then
        return Cooperate
    else
        return Defect

If both players submit this program, then the if-clause will resolve to true in the execution of both programs. As a result, both programs will cooperate. Moreover, (CliqueBot,CliqueBot) is an equilibrium. If either player deviates to some other program pi that is different from CliqueBot, then the opponent will defect. Therefore, deviating to pi can at best result in the payoff of mutual defection, which is worse than the payoff of mutual cooperation.

This approach has been criticized for being fragile.[5] If the players fail to coordinate on the exact source code they submit (for example, if one player adds an extra space character), both programs will defect. The development of the techniques below is in part motivated by this fragility issue.

Proof-based cooperation

Another approach is based on letting each player's program try to prove something about the opponent's program or about how the two programs relate.[6][8][9][10] One example of such a program is the following:

algorithm FairBot(opponent_program):
    if there is a proof that opponent_program(this_program) = Cooperate then
        return Cooperate
    else
        return Defect

Using Löb's theorem it can be shown that when both players submit this program, they cooperate against each other.[8][9][10] Moreover, if one player were to instead submit a program that defects against the above program, then (assuming consistency of the proof system is used) the if-condition would resolve to false and the above program would defect. Therefore, (FairBot,FairBot) is a program equilibrium as well.

Cooperating with ε-grounded simulation

Another proposed program is the following:[5][11]

algorithm ϵGroundedFairBot(opponent_program):
    With probability ϵ:
        return Cooperate
    return opponent_program(this_program)

Here ϵ is a small positive number.

If both players submit this program, then they terminate almost surely and cooperate. The expected number of steps to termination is given by the geometric series. Moreover, if both players submit this program, neither can profitably deviate, assuming ϵ is sufficiently small, because defecting with probability Δ would cause the opponent to defect with probability (1ϵ)Δ.

Folk theorem

We here give a theorem that characterizes what payoffs can be achieved in program equilibrium.

The theorem uses the following terminology: A pair of payoffs (v1,v2) is called feasible if there is a pair of (potentially mixed) strategies (s1,s2) such that ui(s1,s2)=vi for both players i. That is, a pair of payoffs is called feasible if it is achieved in some strategy profile. A payoff vi is called individually rational if it is better than that player's minimax payoff; that is, if viminσimaxsiui(σi,si), where the minimum is over all mixed strategies for Player i.Template:Refn

Theorem (folk theorem for program equilibrium):[4][1] Let G be a base game. Let (v1,v2) be a pair of real-valued payoffs. Then the following two claims are equivalent:

  • The payoffs (v1,v2) are feasible and individually rational.
  • There is a program equilibrium (p1,p2) that achieves payoffs (v1,v2).

The result is referred to as a folk theorem in reference to the so-called folk theorems (game theory) for repeated games, which use the same conditions on equilibrium payoffs (v1,v2).

See also

Notes

Template:Reflist

References

Template:Reflist

  1. 1.0 1.1 1.2 1.3 Cite error: Invalid <ref> tag; no text was provided for refs named Tennenholtz2004
  2. 2.0 2.1 Cite error: Invalid <ref> tag; no text was provided for refs named McAfee1984
  3. 3.0 3.1 Cite error: Invalid <ref> tag; no text was provided for refs named Howard1988
  4. 4.0 4.1 Cite error: Invalid <ref> tag; no text was provided for refs named Rubinstein1998
  5. 5.0 5.1 5.2 Cite error: Invalid <ref> tag; no text was provided for refs named Oesterheld2019
  6. 6.0 6.1 Cite error: Invalid <ref> tag; no text was provided for refs named Hoek2013
  7. Cite error: Invalid <ref> tag; no text was provided for refs named PetersSzentes2012
  8. 8.0 8.1 Cite error: Invalid <ref> tag; no text was provided for refs named Barasz2014
  9. 9.0 9.1 Cite error: Invalid <ref> tag; no text was provided for refs named Critch2019
  10. 10.0 10.1 Cite error: Invalid <ref> tag; no text was provided for refs named Critch2022
  11. Cite error: Invalid <ref> tag; no text was provided for refs named DiGiovanni2023