mbbl
Off-by-one error in `gym_pendulum`
Hi, just a minor issue here. The `gym_pendulum` environment here runs for 201 steps instead of the 200 steps that the original gym pendulum runs for. I think it's because this line should be an inequality.
Code to reproduce this issue
#!/usr/bin/env python3
import gym
from mbbl.env.env_register import make_env
env = gym.make('Pendulum-v0')
env.reset()
done = False
t = 0
while not done:
    obs, reward, done, _ = env.step(env.action_space.sample())
    t += 1
print(f'Gym Pendulum-v0: {t} steps')
env, _ = make_env('gym_pendulum', rand_seed=0)
env.reset()
done = False
t = 0
while not done:
    obs, reward, done, _ = env.step(env._env.action_space.sample())
    t += 1
print(f'mbbl pendulum: {t} steps')
Output
Gym Pendulum-v0: 200 steps
mbbl pendulum: 201 steps
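
To illustrate the kind of off-by-one that would produce these counts, here is a minimal sketch. It is not the mbbl source: `count_steps`, `max_length`, and the `strict` flag are hypothetical names, and the sketch assumes the environment increments a step counter and then compares it against a maximum episode length.

```python
def count_steps(max_length, strict):
    """Count how many steps run before the episode is flagged done.

    Hypothetical stand-in for an environment's step counter; not mbbl code.
    """
    current_step = 0
    steps_taken = 0
    done = False
    while not done:
        current_step += 1
        steps_taken += 1
        if strict:
            # Strict comparison: the episode only ends one step past the limit.
            done = current_step > max_length
        else:
            # Non-strict comparison: the episode ends exactly at the limit.
            done = current_step >= max_length
    return steps_taken


print(count_steps(200, strict=True))   # 201
print(count_steps(200, strict=False))  # 200
```

Under these assumptions, the strict comparison reproduces the 201-step count and the non-strict one gives the expected 200.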