Tianshou: Basic API Usage¶
This tutorial is a simple example of how to use Tianshou with a PettingZoo environment.
It demonstrates a game between two random-policy agents in the Rock Paper Scissors environment.
Environment Setup¶
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a freshly created virtual environment to avoid dependency conflicts.
numpy<2.0.0
pettingzoo[classic]>=1.23.0
packaging>=21.3
tianshou==0.5.0
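These dependencies can be installed, for example, by saving them to a requirements file and installing it with pip (the filename `requirements.txt` is just a convention, not mandated by the tutorial):

```shell
# Write the pinned dependencies to a file, then install them with pip.
cat > requirements.txt <<'EOF'
numpy<2.0.0
pettingzoo[classic]>=1.23.0
packaging>=21.3
tianshou==0.5.0
EOF
pip install -r requirements.txt
```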
Code¶
The following code should run without issues. The comments are designed to help you understand how to use Tianshou with PettingZoo. If you have any questions, please feel free to ask in the Discord server.
"""This is a minimal example to show how to use Tianshou with a PettingZoo environment. No training of agents is done here.
Author: Will (https://github.com/WillDudley)
Python version used: 3.8.10
Requirements:
pettingzoo == 1.22.0
git+https://github.com/thu-ml/tianshou
"""
from tianshou.data import Collector
from tianshou.env import DummyVectorEnv, PettingZooEnv
from tianshou.policy import MultiAgentPolicyManager, RandomPolicy
from pettingzoo.classic import rps_v2


if __name__ == "__main__":
# Step 1: Load the PettingZoo environment
env = rps_v2.env(render_mode="human")
# Step 2: Wrap the environment for Tianshou interfacing
env = PettingZooEnv(env)
# Step 3: Define policies for each agent
policies = MultiAgentPolicyManager([RandomPolicy(), RandomPolicy()], env)
# Step 4: Convert the env to vector format
env = DummyVectorEnv([lambda: env])
# Step 5: Construct the Collector, which interfaces the policies with the vectorised environment
collector = Collector(policies, env)
# Step 6: Execute the environment with the agents playing for 1 episode, and render a frame every 0.1 seconds
result = collector.collect(n_episode=1, render=0.1)
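Conceptually, the collector loop above reduces to two random policies each picking a move every round, with the environment scoring the pair of moves. The stdlib-only sketch below illustrates that idea; the names `random_policy` and `play_episode` are invented for this sketch and are not part of the Tianshou or PettingZoo APIs:

```python
import random

# The three legal moves, and a map saying which move each move defeats.
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def random_policy(_observation=None):
    """Ignore the observation and pick a uniformly random move,
    mirroring what Tianshou's RandomPolicy does for this environment."""
    return random.choice(MOVES)


def play_episode():
    """Play one round and return per-agent rewards (+1 win, -1 loss, 0 tie)."""
    a, b = random_policy(), random_policy()
    if a == b:
        return {"player_0": 0, "player_1": 0}
    if BEATS[a] == b:
        return {"player_0": 1, "player_1": -1}
    return {"player_0": -1, "player_1": 1}


rewards = play_episode()
print(rewards)
```

Because Rock Paper Scissors is zero-sum, the two rewards always sum to zero, which is why neither random agent can gain an edge over many episodes.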