Collective Strategies With a Master-Slave Mechanism Dominate in Spatial-Iterated Prisoner's Dilemma

Jiawei Li, Robert Duncan, Jingpeng Li, Ruibin Bai
Copyright: © 2021 | Pages: 12
DOI: 10.4018/IJSIR.2021100103

Abstract

How cooperation emerges and persists in a population of selfish agents is a fundamental question in evolutionary game theory. This research shows that collective strategies with a master-slave mechanism (CSMSM) defeat tit-for-tat and other well-known strategies in the spatial iterated prisoner's dilemma (IPD). A CSMSM identifies kin members by means of a handshaking mechanism. If the opponent is identified as non-kin, a CSMSM always defects. When two CSMSMs meet, they take on master and slave roles: the master defects and the slave cooperates in order to maximize the master's payoff. CSMSM outperforms non-collective strategies in spatial IPD even if there is only a small cluster of CSMSMs in the population. The existence and performance of CSMSM in the spatial iterated prisoner's dilemma suggest that cooperation first appears and persists in a group of collective agents.

1. Introduction

The prisoner’s dilemma is a two-player non-zero-sum game in which each player tries to maximize their own payoff by either cooperating with or betraying the other player. In the classical version of the prisoner’s dilemma, each player chooses between two strategies, Cooperate (C) and Defect (D). Their payoffs can be represented by the matrix shown in Figure 1.

Figure 1. Payoff matrix of a prisoner’s dilemma game. The payoffs satisfy T > R > P > S.

When both players are rational and make their choices independently, the theoretical outcome of the game is a Nash equilibrium in which both players choose to defect and each receives payoff P. This is worse for each player than the outcome they would have received if both had cooperated (Nash 1951, Chong et al. 2007).
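The equilibrium argument can be checked directly with a minimal sketch. The numeric payoffs T=5, R=3, P=1, S=0 are an assumption chosen for illustration; the text only requires T > R > P > S. The names `PAYOFF`, `is_nash`, and `nash_equilibria` are hypothetical helpers, not taken from the paper.

```python
from itertools import product

# Assumed illustrative payoffs satisfying T > R > P > S:
# T = temptation to defect, R = reward for mutual cooperation,
# P = punishment for mutual defection, S = sucker's payoff.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(my_move, opponent_move)] -> my payoff
PAYOFF = {
    ("C", "C"): R,
    ("C", "D"): S,
    ("D", "C"): T,
    ("D", "D"): P,
}

def is_nash(a, b):
    """True if neither player gains by unilaterally switching moves."""
    a_ok = all(PAYOFF[(a, b)] >= PAYOFF[(alt, b)] for alt in "CD")
    b_ok = all(PAYOFF[(b, a)] >= PAYOFF[(alt, a)] for alt in "CD")
    return a_ok and b_ok

def nash_equilibria():
    """List all pure-strategy Nash equilibria of the one-shot game."""
    return [(a, b) for a, b in product("CD", repeat=2) if is_nash(a, b)]

print(nash_equilibria())  # [('D', 'D')]
```

With these assumed values, mutual defection is the only pure-strategy equilibrium even though both players would earn more (R instead of P) under mutual cooperation.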

In the Iterated Prisoner's Dilemma (IPD), two players choose their moves repeatedly, and each remembers both their own previous moves and those of the opponent. The payoffs also satisfy R > (S + T)/2, which removes any incentive to alternate between cooperation and defection. IPD is considered an ideal experimental platform for studying the evolution of cooperation among selfish individuals, and it has attracted wide interest since Robert Axelrod’s IPD tournaments and his book ‘The Evolution of Cooperation’ (Axelrod 1984).
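A short worked example illustrates the condition R > (S + T)/2. It compares the average per-round payoff of sustained mutual cooperation with that of a player who alternately exploits and is exploited. The payoff values are assumed for illustration, and `average_payoffs` is a hypothetical helper.

```python
# Assumed illustrative payoffs satisfying T > R > P > S and R > (S + T) / 2.
T, R, P, S = 5, 3, 1, 0

def average_payoffs(rounds=100):
    """Per-round payoff of mutual cooperation vs. taking turns to exploit."""
    # Both players cooperate in every round.
    cooperate = sum(R for _ in range(rounds)) / rounds
    # The player receives T in even rounds (exploiting) and S in odd rounds.
    alternate = sum(T if i % 2 == 0 else S for i in range(rounds)) / rounds
    return cooperate, alternate

print(average_payoffs(100))  # (3.0, 2.5)
```

Under these assumed values, alternation averages (S + T)/2 = 2.5 per round, below the 3.0 earned by steady cooperation, so taking turns does not pay.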

If the precise length of an IPD is known to the players, the best strategy for both players is to defect in every move. This conclusion follows from backward induction: both players will choose to defect in the final iteration because the opponent cannot subsequently punish them. Given mutual defection in the final iteration, the optimal strategy in the penultimate iteration is also defection for both players, and so on, back to the initial iteration. If the length of an IPD is infinite or unknown, mutual cooperation can also be an equilibrium.
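The backward-induction argument can be sketched as follows. This is a simplified illustration rather than a general game solver: it assumes the payoff values shown, and it relies on the fact that play in later rounds is already fixed when an earlier round is analysed. The function name `backward_induction` is hypothetical.

```python
# Assumed illustrative payoffs satisfying T > R > P > S.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}

def backward_induction(n_rounds):
    """Solve a fixed-length IPD from the last round back to the first.

    Returns the equilibrium move for each round and the total payoff
    each player receives.
    """
    plan = []
    continuation = 0  # equilibrium payoff from all later rounds
    for _ in range(n_rounds):
        # Later play is already fixed, so `continuation` is the same
        # constant whichever move is chosen now.
        replies = {
            opp: max("CD", key=lambda me: PAYOFF[(me, opp)] + continuation)
            for opp in "CD"
        }
        if len(set(replies.values())) != 1:
            raise ValueError("no dominant move in this round")
        move = replies["C"]  # the same best reply to C and to D
        plan.append(move)
        continuation += PAYOFF[(move, move)]
    return plan[::-1], continuation

print(backward_induction(5))  # (['D', 'D', 'D', 'D', 'D'], 5)
```

Because the continuation payoff does not depend on the current move, each round reduces to the one-shot game, in which defection is the best reply to either opponent move.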

Axelrod was the first to study efficient IPD strategies by means of competitions (Axelrod 1980a, 1980b). The winning strategy, Tit For Tat (TFT), always cooperates in the first move and then mimics whatever the opponent did in the previous move. According to Axelrod, several characteristics make TFT successful: TFT is Nice, Retaliating and Forgiving. However, TFT is not a Nash equilibrium, and there is always a sub-game perfect equilibrium that dominates TFT, according to the Folk Theorem in game theory (Binmore 1992, Embrey, Fréchette and Yuksel 2018). Whether or not TFT is the most efficient strategy in IPD is still unclear, and some strategies perform better than TFT in specific environments (Nowak and Sigmund 1993, Beaufils, Delahaye and Mathieu 1996).
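The TFT rule described above is simple enough to state in a few lines. The following is a minimal sketch, assuming illustrative payoff values and a hypothetical `play` helper for a fixed-length match; it is not the tournament code used by Axelrod.

```python
# Assumed illustrative payoffs satisfying T > R > P > S.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}

def tit_for_tat(my_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]

def always_defect(my_history, opp_history):
    """Defect unconditionally."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play a fixed-length IPD and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30)
print(play(tit_for_tat, always_defect))  # (9, 14)
```

In this sketch TFT is exploited only in the first round against an unconditional defector and retaliates from then on, while two TFT players cooperate throughout.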

In recent IPD competitions, strategies with identification mechanisms have appeared. Using a rule-based identification mechanism, a strategy called APavlov won competition four of the 2005 IPD competition (Li 2007). Furthermore, many of the top-ranked strategies probe the opponent by using simple mechanisms (Hingston et al. 2007, Li and Kendall 2009). This suggests that strategies that explore and then exploit the opponent can outperform any single non-group strategy in round-robin IPD tournaments.
