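Atari

The Atari environments are based on the Arcade Learning Environment (ALE). The ALE was instrumental in the development of modern reinforcement learning, so we hope that our multi-agent version of it will be useful for the development of multi-agent reinforcement learning.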

Installation

The dependencies specific to this set of environments can be installed via:

pip install 'pettingzoo[atari]'

Install the ROMs using AutoROM, or specify the path to your Atari ROMs using the auto_rom_install_path argument (see Common Parameters).

Usage

To launch a Space Invaders environment with random agents:

from pettingzoo.atari import space_invaders_v2

env = space_invaders_v2.env(render_mode="human")
env.reset(seed=42)

for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        action = env.action_space(agent).sample() # this is where you would insert your policy

    env.step(action)
env.close()

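Games Overview

Most games have two players, with the exception of Warlords and a couple of Pong variations, which have four players.

Environment Details

The ALE has been studied extensively, and a few notable problems have been identified:

Determinism: The Atari console is deterministic, so agents can in theory memorize a precise sequence of actions that maximizes the final score. This is not ideal, so we encourage the use of SuperSuit's sticky_actions wrapper (see the preprocessing example below). This is the approach recommended by Machado et al. (2018), "Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents".
Frame flickering: Because of hardware restrictions, Atari games often do not render every sprite in every frame. Instead, sprites (such as the knights in Joust) are sometimes rendered every other frame or even (in Wizard of Wor) every third frame. The standard way of handling this is to compute the pixel-wise maximum of the previous two observations (see the preprocessing example below for an implementation).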
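
The pixel-wise maximum can be illustrated without any Atari dependencies. The following is a minimal sketch, not SuperSuit's implementation: the helper name pixelwise_max and the tiny 2x3 grayscale frames are made up for illustration, and a real pipeline would more likely operate on NumPy arrays.

```python
def pixelwise_max(frame_a, frame_b):
    """Return a frame whose every pixel is the larger of the two inputs."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same height")
    return [
        [max(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


# Hypothetical 2x3 grayscale frames: 0 is background, 255 is a sprite pixel.
# One sprite is drawn only in the first frame and the other only in the
# second, mimicking flicker.
frame_t0 = [[0, 255, 0],
            [0, 0, 0]]
frame_t1 = [[0, 0, 0],
            [0, 0, 255]]

merged = pixelwise_max(frame_t0, frame_t1)
print(merged)  # both sprites are visible in the merged frame
```

Taking the maximum rather than the mean keeps a flickering sprite at full brightness in the merged frame.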

Preprocessing

We encourage the use of the supersuit library for preprocessing. It can be installed via:

pip install supersuit

Here is some example usage of the Atari preprocessing wrappers:

import supersuit
from pettingzoo.atari import space_invaders_v2

env = space_invaders_v2.env()

# as per OpenAI Baselines' MaxAndSkip wrapper, take the max over the last 2 frames
# to deal with frame flickering
env = supersuit.max_observation_v0(env, 2)

# repeat_action_probability is set to 0.25 to introduce non-determinism to the system
env = supersuit.sticky_actions_v0(env, repeat_action_probability=0.25)

# skip frames for faster processing and less control
# to be compatible with gym, use frame_skip_v0(env, (2, 5))
env = supersuit.frame_skip_v0(env, 4)

# downscale observation for faster processing
env = supersuit.resize_v1(env, 84, 84)

# allow agent to see everything on the screen despite Atari's flickering screen problem
env = supersuit.frame_stack_v1(env, 4)
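
To make the effect of repeat_action_probability concrete, here is a dependency-free sketch of the sticky-action idea: at each step, with probability p, the previously executed action is repeated instead of the newly requested one. The function apply_sticky_actions is hypothetical and is not SuperSuit's implementation; it only illustrates the mechanism on a list of requested actions.

```python
import random


def apply_sticky_actions(requested, repeat_action_probability=0.25, seed=None):
    """Return the actions actually executed under the sticky-action rule."""
    rng = random.Random(seed)
    executed = []
    previous = None
    for action in requested:
        # With probability p, ignore the new request and repeat the last
        # executed action. The first step has nothing to repeat.
        if previous is not None and rng.random() < repeat_action_probability:
            action = previous
        executed.append(action)
        previous = action
    return executed


requested = [0, 1, 2, 3, 4, 5, 6, 7]
print(apply_sticky_actions(requested, repeat_action_probability=0.25, seed=0))

# Edge cases: p=0 leaves the sequence untouched, p=1 repeats the first action.
assert apply_sticky_actions(requested, 0.0) == requested
assert apply_sticky_actions(requested, 1.0) == [0] * len(requested)
```

Because the agent cannot know in advance which requests will be overridden, a memorized open-loop action sequence no longer reproduces the same trajectory.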

Common Parameters

All the Atari environments have the following environment parameters:

# using space invaders as an example, but replace with any atari game
from pettingzoo.atari import space_invaders_v2

space_invaders_v2.env(obs_type='rgb_image', full_action_space=True, max_cycles=100000, auto_rom_install_path=None)

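obs_type: There are three possible values for this parameter:

'rgb_image' (default) - produces an RGB image like the one a human player would see.
'grayscale_image' - produces a grayscale image.
'ram' - produces an observation of the 1024 bits that make up the RAM of the Atari console.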
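
The 'ram' figure follows from simple arithmetic: 1024 bits is 128 bytes. The sketch below assumes the observation arrives as 128 byte values (an assumption about the layout, not something stated above) and shows how such a vector could be expanded into individual bits. The helper bytes_to_bits and the sample bytes are invented for illustration.

```python
RAM_BITS = 1024
RAM_BYTES = RAM_BITS // 8  # 128


def bytes_to_bits(ram):
    """Expand byte values (0-255) into a flat list of bits, MSB first."""
    return [(byte >> shift) & 1 for byte in ram for shift in range(7, -1, -1)]


# Hypothetical RAM snapshot: all zeros except the first two bytes.
ram = [0] * RAM_BYTES
ram[0] = 0b10000001
ram[1] = 255

bits = bytes_to_bits(ram)
print(len(bits))  # 1024
print(bits[:16])
```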

full_action_space: Setting this to True sets the action space to the full space of 18 actions. Setting it to False (default) removes duplicated actions and leaves only the unique ones.
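
One way to see where the number 18 comes from: a joystick has nine positions (centered plus eight directions) and a fire button that is either pressed or not, giving 9 x 2 = 18 combinations. The sketch below enumerates them. The names and ordering are illustrative and are not the ALE's action indices, and the "game that ignores the fire button" is a hypothetical case used to show how duplicates arise.

```python
from itertools import product

DIRECTIONS = ["CENTER", "UP", "DOWN", "LEFT", "RIGHT",
              "UPLEFT", "UPRIGHT", "DOWNLEFT", "DOWNRIGHT"]
FIRE = [False, True]

# Every (direction, fire) combination of joystick position and button state.
full_action_space = [
    (direction, fire) for fire, direction in product(FIRE, DIRECTIONS)
]
print(len(full_action_space))  # 18

# Hypothetical game that ignores the fire button: actions that differ only in
# `fire` are duplicates, so only the 9 joystick positions remain unique.
unique_actions = sorted({direction for direction, _ in full_action_space})
print(len(unique_actions))  # 9
```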

max_cycles: The number of frames (the number of steps that each agent can take) until the game terminates.

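auto_rom_install_path: The path to your AutoROM installation, as installed with the Farama-Foundation/AutoROM tool. This is the path you specified when installing AutoROM. For example, if you are using the boxing Atari environment, the library will look for the ROM at /auto_rom_install_path/ROM/boxing/boxing.bin. If this path is not specified (the value is None), the library looks for ROMs at the default AutoROM path.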
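
The lookup described above amounts to joining the install path with ROM/<game>/<game>.bin. A minimal sketch with pathlib, where expected_rom_path is a hypothetical helper (not part of PettingZoo) and /auto_rom_install_path is the placeholder from the example above:

```python
from pathlib import PurePosixPath


def expected_rom_path(auto_rom_install_path, game):
    """Build the ROM location following the pattern described in the text."""
    return PurePosixPath(auto_rom_install_path) / "ROM" / game / f"{game}.bin"


path = expected_rom_path("/auto_rom_install_path", "boxing")
print(path)  # /auto_rom_install_path/ROM/boxing/boxing.bin
```

Checking that a file exists at this location before constructing the environment can help when debugging a ROM that is not found.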

Citation

Multiplayer games within the Arcade Learning Environment were introduced in:

@article{terry2020arcade,
  title = {Multiplayer Support for the Arcade Learning Environment},
  author = {Terry, J K and Black, Benjamin},
  journal = {arXiv preprint arXiv:2009.09341},
  year = {2020}
}

The Arcade Learning Environment was originally introduced in:

@Article{bellemare13arcade,
  author = { {Bellemare}, M.~G. and {Naddaf}, Y. and {Veness}, J. and {Bowling}, M.},
  title = {The Arcade Learning Environment: An Evaluation Platform for General Agents},
  journal = {Journal of Artificial Intelligence Research},
  year = "2013",
  month = "jun",
  volume = "47",
  pages = "253--279",
}

Various extensions to the Arcade Learning Environment were introduced in:

@article{machado2018revisiting,
  title={Revisiting the arcade learning environment: Evaluation protocols and open problems for general agents},
  author={Machado, Marlos C and Bellemare, Marc G and Talvitie, Erik and Veness, Joel and Hausknecht, Matthew and Bowling, Michael},
  journal={Journal of Artificial Intelligence Research},
  volume={61},
  pages={523--562},
  year={2018}
}