Installing and using the Gym package (old vs. new versions, Atari game support)

Key concepts
gym
① The gym library is a collection of test problems — environments — that you can use to work out your reinforcement learning algorithms. These environments have a shared interface, allowing you to write general algorithms.
② In other words, it provides a single, unified interface for calling environments.
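③ It ships with a number of basic games (also called environments), grouped into categories such as classic_control, box2d, mujoco, and toy_text. To see the specific game names, look in the corresponding subfolders under …\Lib\site-packages\gym\envs, or read the register(...) calls in …\Lib\site-packages\gym\envs\__init__.py (registration.py in the same folder contains the registry code itself).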

ALE

The corresponding package is ale-py
- It replaces atari-py

Installation

pip install gym

Installs the base version of gym.
Successfully installed cloudpickle-2.0.0 gym-0.23.1 gym-notices-0.0.6 numpy-1.22.3
The envs folder then contains 5 classic_control games, 4 box2d games, 12+ mujoco games, and 5 toy_text games.
pip install gym[atari]

Besides gym itself, this also installs ale-py.
pip install gym[all]

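This installs many other packages, including box2d-py, opencv-python, ale-py, mujoco-py, and pygame. Running it directly fails with a message that mujoco-py must be installed separately. Installing mujoco-py is somewhat involved, so it is not explored further here.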
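
Because gym[all] can stop partway, it helps to check which of the optional packages actually ended up importable. The following is a minimal sketch using only the standard library. The helper name available_extras and the OPTIONAL_MODULES table are made up for illustration, and the import names on the right-hand side are the ones these packages are normally imported under.

```python
from importlib.util import find_spec

# pip package name -> module name used in `import ...`
# (the packages are the ones listed above; the table itself is illustrative)
OPTIONAL_MODULES = {
    "box2d-py": "Box2D",
    "opencv-python": "cv2",
    "ale-py": "ale_py",
    "mujoco-py": "mujoco_py",
    "pygame": "pygame",
}


def available_extras():
    """Map each pip package name to True if its module can be located."""
    return {
        pkg: find_spec(module) is not None
        for pkg, module in OPTIONAL_MODULES.items()
    }


for pkg, found in available_extras().items():
    print(f"{pkg:15s} {'found' if found else 'missing'}")
```

find_spec only locates a module without importing it, so a half-installed mujoco-py is reported without triggering its build step.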

Hello world

import gym

env = gym.make('CartPole-v0')
env.reset()
for _ in range(1000):
    env.render()
    env.step(env.action_space.sample())  # take a random action
env.close()
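
The loop above keeps stepping after an episode has ended, which gym warns about. A slightly more careful version resets whenever step() reports done. The sketch below assumes the older step() signature that returns (observation, reward, done, info). StubEnv is a stand-in invented here so the example runs without gym installed; it is not part of gym.

```python
import random


def run_random_steps(env, num_steps):
    """Step `env` with random actions, resetting whenever an episode ends.

    Assumes step() returns (observation, reward, done, info).
    Returns the total reward of each completed episode.
    """
    episode_returns = []
    total = 0.0
    env.reset()
    for _ in range(num_steps):
        _, reward, done, _ = env.step(env.action_space.sample())
        total += reward
        if done:
            episode_returns.append(total)
            total = 0.0
            env.reset()
    env.close()
    return episode_returns


class _StubSpace:
    """Stand-in for env.action_space with two discrete actions."""

    def sample(self):
        return random.choice([0, 1])


class StubEnv:
    """Stand-in environment: each episode lasts `length` steps, reward 1 per step."""

    action_space = _StubSpace()

    def __init__(self, length=3):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.length, {}

    def close(self):
        pass


print(run_random_steps(StubEnv(length=3), 7))  # prints [3.0, 3.0]
```

With gym installed, the same function should work as run_random_steps(gym.make('CartPole-v0'), 1000); add env.render() inside the loop if a window is wanted.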

Environments

The id passed to gym.make(id) follows the naming pattern *[username/](env-name)-v(version)*
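
To make the naming pattern concrete, here is a small sketch that splits an id into its three parts. The regular expression is an approximation written for this post, not gym's own parser, and parse_env_id is a made-up helper name.

```python
import re

# [username/](env-name)-v(version), written as a regular expression
ENV_ID_RE = re.compile(
    r"^(?:(?P<namespace>[\w:-]+)/)?(?P<name>[\w:.-]+)-v(?P<version>\d+)$"
)


def parse_env_id(env_id):
    """Split an environment id into (namespace or None, name, version)."""
    match = ENV_ID_RE.match(env_id)
    if match is None:
        raise ValueError(f"malformed environment id: {env_id!r}")
    return match.group("namespace"), match.group("name"), int(match.group("version"))


print(parse_env_id("CartPole-v0"))      # (None, 'CartPole', 0)
print(parse_env_id("ALE/Breakout-v5"))  # ('ALE', 'Breakout', 5)
```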

from gym import envs
for env in envs.registry.all():
    print(env.id)

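Output
Not every id in the output can actually be used (mainly the Atari-related environments, whose names are registered even when the package behind them is missing).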
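
One way to find out which registered ids really work is to try constructing each one and record the failures. The helper below is a sketch under that assumption. split_usable is a made-up name, and it takes the constructor as a parameter so that it can be tried out without gym.

```python
def split_usable(env_ids, make):
    """Try to build every id with `make`; return (usable, broken).

    `usable` is a list of ids; `broken` is a list of (id, error message) pairs.
    """
    usable, broken = [], []
    for env_id in env_ids:
        try:
            env = make(env_id)
        except Exception as exc:  # e.g. missing package or missing ROM
            broken.append((env_id, f"{type(exc).__name__}: {exc}"))
        else:
            env.close()
            usable.append(env_id)
    return usable, broken


# With gym installed (registry API as used in this post):
#   import gym
#   ids = [spec.id for spec in gym.envs.registry.all()]
#   usable, broken = split_usable(ids, gym.make)
#   print(len(usable), "usable,", len(broken), "broken")
```

Constructing every environment can be slow, so it may be worth filtering the id list first, for example to the Atari names only.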

As of Gym v0.20 and onwards all Atari environments are provided via ale-py. We do recommend using the new v5 environments in the ALE namespace:

import gym

env = gym.make('ALE/Breakout-v5')

Older versions of gym

Starting with 0.20, gym switched to ale-py. Here we test how gym behaved in the 0.19 era.

pip install gym==0.19.0
pip install atari_py==0.2.6

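gym 0.19 does not differ much from the latest version.

Installing atari_py 0.2.6 places the required ROMs in the corresponding directory.

However, running a test still raises an error.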

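Summary

Overall, the main differences between the old gym + atari-py combination and the new gym + ale-py combination are:

- With the new combination, you have to download the Atari ROMs yourself.
- With the new gym, every Atari game, whether v5 or not, must follow the render modes defined by ale-py: specify render_mode when creating the environment, and do not call render() afterwards.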

# New version
import gym

env = gym.make('Breakout-v0', render_mode='human')
env.reset()
for _ in range(10000):
    result = env.step(env.action_space.sample())  # take a random action
env.close()

# Old version
import gym

env = gym.make('Breakout-v0')
env.reset()
for _ in range(10000):
    env.render()
    result = env.step(env.action_space.sample())  # take a random action
env.close()
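
If one script has to run under both combinations, the choice can be made from the gym version string. The sketch below covers only the versions discussed in this post (0.19 and the 0.20 to 0.23 line). All function names are made up for illustration, and the 0.20 cutoff is the one stated earlier in this post.

```python
def version_tuple(version):
    """'0.23.1' -> (0, 23, 1); only plain numeric versions are handled."""
    return tuple(int(part) for part in version.split("."))


def atari_backend(gym_version):
    """Name the Atari package that pairs with a gym version (switch at 0.20)."""
    return "ale-py" if version_tuple(gym_version) >= (0, 20) else "atari-py"


def atari_make_kwargs(gym_version, render_mode="human"):
    """Keyword arguments for gym.make: render_mode only on the ale-py side."""
    if atari_backend(gym_version) == "ale-py":
        return {"render_mode": render_mode}
    return {}  # old combination: call env.render() inside the loop instead


for v in ("0.19.0", "0.23.1"):
    print(v, atari_backend(v), atari_make_kwargs(v))

# Possible use with gym installed:
#   from importlib import metadata
#   kwargs = atari_make_kwargs(metadata.version("gym"))
#   env = gym.make('Breakout-v0', **kwargs)
```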

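Further reading: ALE

Developing reinforcement learning code with ALE alone

The following example, taken from the documentation mentioned above, shows how ale-py is used on its own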

import sys
from random import randrange
from ale_py import ALEInterface

def main(rom_file):
    ale = ALEInterface()
    ale.setInt('random_seed', 123)
    ale.loadROM(rom_file)

    # Get the list of legal actions
    legal_actions = ale.getLegalActionSet()
    num_actions = len(legal_actions)

    total_reward = 0
    while not ale.game_over():
        a = legal_actions[randrange(num_actions)]
        reward = ale.act(a)
        total_reward += reward

    print(f'Episode ended with score: {total_reward}')

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print(f"Usage: {sys.argv[0]} rom_file")
        sys.exit()

    rom_file = sys.argv[1]
    main(rom_file)

Using it together with gym
The ALE now natively supports OpenAI Gym.

Although you could continue using the legacy environments as is we recommend using the new v5 environments

import gym
import ale_py

env = gym.make('ALE/Breakout-v5')

Calling render() is not recommended. Instead, specify render_mode when creating the environment, as follows

import gym

env = gym.make('Breakout-v0', render_mode='rgb_array')
env.reset()
_, _, _, metadata = env.step(0)
assert 'rgb_array' in metadata

The render_mode argument supports either human | rgb_array. If rgb_array is specified we’ll return the full RGB observation in the metadata dictionary returned after an agent step.

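For environment ids, the traditional style of a suffix plus a version number is still supported. The new version no longer uses suffixes; whatever the suffixes used to express should be passed as keyword arguments instead.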

The legacy game IDs, environment suffixes -NoFrameskip, -Deterministic, and versioning -v0, -v4 remain unchanged.

We do suggest that users transition to the -v5 versioning which is contained in the ALE
namespace.

With the new -v5 versioning we don’t support any ID suffixes such as -NoFrameskip
or -Deterministic, instead you should configure the environment through keyword arguments as such:

import gym

env = gym.make('ALE/Breakout-v5',
    obs_type='rgb',                   # ram | rgb | grayscale
    frameskip=5,                      # frame skip
    mode=0,                           # game mode, see Machado et al. 2018
    difficulty=0,                     # game difficulty, see Machado et al. 2018
    repeat_action_probability=0.25,   # Sticky action probability
    full_action_space=True,           # Use all actions
    render_mode=None                  # None | human | rgb_array
)

Accepted ids include
- Pong-v0
  - PongNoFrameskip-v0
  - PongDeterministic-v0
- Pong-v4
  - PongNoFrameskip-v4
  - PongDeterministic-v4
- ALE/Pong-v5

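Installing ROMs

The games supported by ale-py are listed in the documentation mentioned above.

Installing ale-py brings one bundled game with it, Tetris. The following code, together with pygame, draws the game on screen.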

import gym

env = gym.make('ALE/Tetris-v5', render_mode='human')
env.reset()
for _ in range(1000):
    result = env.step(env.action_space.sample())  # take a random action
env.close()