Mitigating Generative Agent Social Dilemmas

¹Massachusetts Institute of Technology, ²UC Berkeley

Abstract

In social dilemmas, individuals would be better off cooperating but fail to do so due to conflicting interests that discourage cooperation. Existing work on social dilemmas in AI has focused on standard agent design paradigms, most recently in the context of multi-agent reinforcement learning (MARL). However, with the rise of large language models (LLMs), a new design paradigm for AI systems has started to emerge: generative agents, in which actions performed by agents are chosen by prompting LLMs. This paradigm has seen recent success, such as Voyager, a highly capable Minecraft agent. In this work, we perform an initial study of outcomes that arise when deploying generative agents in social dilemmas. To do this, we build a multi-agent Voyager framework with a contracting and judgement mechanism based on formal contracting, which has been effective in mitigating social dilemmas in MARL. We then construct social dilemmas in Minecraft as the testbed for our open-source framework. Finally, we conduct preliminary experiments using our framework to provide evidence that contracting helps improve outcomes for generative agents in social dilemmas.


The Multi-Agent Voyager Contracting framework consists of three components: 1) contract negotiation, 2) contract judgement, and 3) the Voyager iterative prompting mechanism. We modify the original Voyager agent by removing the Automatic Curriculum and Skill Library and adding components 1 and 2. The resulting framework lets agents reach contractual agreements and iteratively adapt their policies to maximize reward, using both their own feedback and the judge's.
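As a rough illustration of how the three components could fit together, the sketch below uses plain Python callables in place of LLM-prompted agents. Every name in it (`negotiate`, `run_with_feedback`, `scripted`, the reply format) is hypothetical and is not taken from the framework's code; the alternating-turn protocol and the fixed number of feedback iterations are assumptions made only for this example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical reply format: ("propose", contract_text) or ("accept", None).
Reply = Tuple[str, Optional[str]]
Agent = Callable[[Optional[str]], Reply]


def negotiate(agents: List[Agent], max_rounds: int = 6) -> Optional[str]:
    """Agents take turns; each sees the contract currently on the table and
    either counter-proposes or accepts. Returns the agreed contract or None."""
    on_table: Optional[str] = None
    for turn in range(max_rounds):
        kind, text = agents[turn % len(agents)](on_table)
        if kind == "accept" and on_table is not None:
            return on_table
        if kind == "propose":
            on_table = text
    return None  # no agreement within the round limit


def run_with_feedback(contract, act, judge, iterations: int = 2):
    """Act, let the judge assess the outcome against the contract, and pass
    the judge's feedback into the next attempt."""
    feedback, history = None, []
    for _ in range(iterations):
        outcome = act(contract, feedback)
        feedback = judge(contract, outcome)
        history.append((outcome, feedback))
    return history


def scripted(replies: List[Reply]) -> Agent:
    """Stand-in for an LLM agent: replays a fixed list of replies."""
    it = iter(replies)
    return lambda on_table: next(it)


# Scripted stand-ins that replay the offers from the Figure 1 transcript.
gizmo = scripted([("propose", "transfer 10"), ("propose", "transfer 9"), ("accept", None)])
glitch = scripted([("propose", "transfer 14"), ("propose", "transfer 11")])
agreed = negotiate([gizmo, glitch])

# Toy behaviour and judge: the first attempt violates the contract, the second complies.
act = lambda contract, fb: "mined iron" if fb else "mined diamond"
judge = lambda contract, outcome: None if outcome == "mined iron" else "violation: mined diamond"
history = run_with_feedback(agreed, act, judge)
print(agreed, history)
```

In the actual framework each of these callables would be backed by LLM prompts and Minecraft actions; the sketch only shows the control flow implied by the description above.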

Meet the Voyagers:

Gizmo
Favorite color: blue
Favorite mob: squid
Likes: piña coladas


Glitch
Favorite color: green
Favorite mob: creeper
Likes: getting caught in the rain

Double-Vein Domain

Players get points for mining ore. One player has an iron pickaxe, which can mine diamond ore; the other has a stone pickaxe, which cannot. Diamond and iron ores are worth different amounts to different players.


Introducing contracting in Double-Vein allows players to improve on the baseline no-contracting behavior. Both contracts shown reach the socially optimal result by allocating iron and diamond to the agent who values each most. The baseline is socially suboptimal: its resource distribution is consistent with agents that each selfishly maximize their own reward.

Cleanup Domain

Players collect mushrooms from one of three giant mushrooms. Mushrooms only regrow if the slime in the river is below a threshold of 7 blocks.
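The regrowth rule can be written as a one-line predicate. This is a minimal sketch that reads "below a threshold of 7 blocks" as strictly fewer than 7; the function and constant names are hypothetical, and how the environment actually counts slime blocks is not specified here.

```python
SLIME_THRESHOLD = 7  # threshold from the domain description


def mushrooms_regrow(slime_blocks: int, threshold: int = SLIME_THRESHOLD) -> bool:
    """Return True when the river is clean enough for mushrooms to regrow.

    Assumption: "below" means strictly fewer slime blocks than the threshold.
    """
    return slime_blocks < threshold


# Regrowth is allowed at 6 slime blocks but stops at 7 or more.
for n in (6, 7, 8):
    print(n, mushrooms_regrow(n))
```

The dilemma comes from this coupling: cleaning slime earns no mushrooms directly, yet without it nobody's mushrooms regrow.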


Generated contracts in Cleanup allow players to reach higher social welfare than the baseline. Cleanup-Contract-1 performs best because it enforces full specialization with no task switching, the most efficient strategy found. Cleanup-Contract-2, by contrast, enforces specialization with regular task switching. We observe that the switching comes at a cost: it achieves less social welfare than Cleanup-Contract-1.

Gizmo

(Thought): I should aim to mine the raw iron, as each piece is worth more to me than to Glitch. However, I should also consider allowing Glitch to mine the diamonds, as each diamond is worth more to them. A potential strategy could be to propose a contract where I mine the raw iron and Glitch mines the diamonds. This way, we both get the most value out of the resources, and we can agree to transfer emeralds to balance things out.
(Message): Hello Glitch! I propose the following contract:

1. Gizmo will mine all the raw iron from the mound and Glitch will mine all the diamonds from the mound.
2. At the end of the scenario, Gizmo will transfer 10 emeralds to Glitch.

This way, we both get to mine the resources that are worth the most to each of us, and the emerald transfer will balance things out. What do you think?

Glitch

(Thought): This contract is beneficial to me as my diamonds are worth more than Gizmo's raw iron. However, the emerald transfer isn't enough to make it fair. Gizmo will end up with 48 emeralds (12 raw iron * 4) and I will end up with 40 emeralds (6 diamonds * 5 + 10 emeralds). I need to negotiate a higher emerald transfer.
(Message): Hello Gizmo! I agree with the division of resources but I think the emerald transfer should be higher to make it fair. I propose the following contract:

1. Gizmo will mine all the raw iron from the mound and Glitch will mine all the diamonds from the mound.
2. At the end of the scenario, Gizmo will transfer 14 emeralds to Glitch.

This way, we both end up with equal amounts of emeralds. What do you think?

Gizmo

(Thought): With the proposed contract, I would mine 12 iron ore which would give me 48 emeralds. Glitch would mine 6 diamonds worth 30 emeralds to them. If I transfer 14 emeralds to Glitch, I would have 34 emeralds and they would have 44 emeralds. This doesn't seem fair to me. I would like to propose a lower emerald transfer to balance things out.
(Message): I understand your point, Glitch, but let's consider the total value we each get from mining. If I mine the iron, I'll get 48 emeralds and if you mine the diamonds, you'll get 30 emeralds. The difference is 18 emeralds. If we split this difference, I should transfer 9 emeralds to you. This way, we'll each end up with 39 emeralds. Does that sound fair to you? Here's my revised contract proposal:

1. Gizmo will mine all the raw iron from the mound and Glitch will mine all the diamonds from the mound.
2. At the end of the scenario, Gizmo will transfer 9 emeralds to Glitch.

Glitch

(Thought): Gizmo's proposal does seem fair when considering the total value we each get from mining. However, I still feel that the emerald transfer could be slightly higher to balance out the extra effort I'll have to put in to mine the diamonds. I think a transfer of 11 emeralds would be more reasonable.
(Message): I see your point, Gizmo, but considering the extra effort required to mine diamonds, I believe a slightly higher emerald transfer would be fair. Here's my revised contract proposal:

1. Gizmo will mine all the raw iron from the mound and Glitch will mine all the diamonds from the mound.
2. At the end of the scenario, Gizmo will transfer 11 emeralds to Glitch.

This way, we both end up with similar amounts of emeralds, considering both the value and the effort put into mining. What do you think?

Gizmo

(Thought): Glitch's proposed contract would still leave me with 37 emeralds and Glitch with 41 emeralds, which is closer to an even split. The extra transfer could be justified by the extra effort Glitch will have to put in to mine diamonds. This seems like a fair compromise.
(Message): I agree with your logic, Glitch. The extra effort to mine diamonds should indeed be rewarded. I'm happy to accept your proposed contract.
[accept]

Figure 1: Negotiation transcript for Double-Vein-Contract-1
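The arithmetic behind the four offers in the transcript can be checked with a short script. It uses only the figures the agents themselves state (12 raw iron worth 4 emeralds each to Gizmo, 6 diamonds worth 5 emeralds each to Glitch); the function and constant names are hypothetical.

```python
# Figures quoted by the agents in the transcript.
IRON_COUNT, IRON_VALUE_TO_GIZMO = 12, 4
DIAMOND_COUNT, DIAMOND_VALUE_TO_GLITCH = 6, 5


def payoffs(transfer: int) -> tuple:
    """Net emeralds for (Gizmo, Glitch) when Gizmo mines all the iron,
    Glitch mines all the diamonds, and Gizmo pays `transfer` to Glitch."""
    gizmo = IRON_COUNT * IRON_VALUE_TO_GIZMO - transfer
    glitch = DIAMOND_COUNT * DIAMOND_VALUE_TO_GLITCH + transfer
    return gizmo, glitch


# Offers in the order they were proposed: 10, 14, 9, then the accepted 11.
for transfer in (10, 14, 9, 11):
    gizmo, glitch = payoffs(transfer)
    print(f"transfer={transfer:2d}  Gizmo={gizmo}  Glitch={glitch}  total={gizmo + glitch}")
```

The total is 78 emeralds for every offer, since the transfer only redistributes value between the two agents. Under these figures the opening 10-emerald offer nets Gizmo 38 emeralds, not the 48 quoted in Glitch's first reply, which counts Gizmo's mining value before the transfer is deducted. The accepted 11-emerald transfer gives 37 and 41, matching Gizmo's final reasoning.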

BibTeX

@inproceedings{yocum2023mitigating,
  title={Mitigating Generative Agent Social Dilemmas},
  author={Julian Yocum and Phillip Christoffersen and Mehul Damani and Justin Svegliato and Dylan Hadfield-Menell and Stuart Russell},
  booktitle={NeurIPS 2023 Foundation Models for Decision Making Workshop},
  year={2023},
  url={https://openreview.net/forum?id=5TIdOk7XQ6}
}