
Dopamine duplicates Gym functionality

Open · jarlva opened this issue 6 years ago • 9 comments

I spent a lot of time trying to understand the colab cartpole Gym example in order to apply it to a custom discrete Gym environment, which is similar to the cartpole environment and works fine with a Keras RL agent. I noticed that dopamine uses gym_lib.py in addition to the actual gym environment. For example, gym_lib.py contains variables that are already defined in the gym cartpole environment file, such as:

dopamine/discrete_domains/gym_lib.py:

CARTPOLE_MIN_VALS = np.array([-2.4, -5., -math.pi/12., -math.pi*2.])
CARTPOLE_MAX_VALS = np.array([2.4, 5., math.pi/12., math.pi*2.])
gin.constant('gym_lib.CARTPOLE_OBSERVATION_SHAPE', (4, 1))
gin.constant('gym_lib.CARTPOLE_OBSERVATION_DTYPE', tf.float32)

gym/envs/classic_control/cartpole.py:

self.x_threshold = 2.4
self.observation_space = spaces.Box(-high, high, dtype=np.float32)

This is confusing. OpenAI Gym encompasses all the code necessary to create a complete environment object, with all the necessary plumbing/functions/variables, etc. This approach is difficult to understand and build on. Would it be possible to make Gym a drop-in for dopamine? This would greatly simplify and speed up dopamine adoption.
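
For illustration, here's a minimal sketch (hypothetical, just to make the point) of pulling the same values straight from Gym. Note that the bounds Gym reports for CartPole are looser than the CARTPOLE_*_VALS above, so they are not interchangeable one-for-one:

import gym

env = gym.make('CartPole-v0')
space = env.observation_space
print(space.low, space.high)  # bounds (looser than dopamine's training bounds)
print(space.shape)            # (4,)
print(space.dtype)            # float32
print(env.action_space.n)     # 2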

jarlva · Feb 12 '19 11:02

thanks for your feedback! when we originally built dopamine for atari, we wrote the AtariPreprocessing class so that everything related to atari preprocessing is contained in one place. there are a few things that are standard when training atari agents (see https://github.com/google/dopamine/blob/master/dopamine/discrete_domains/atari_lib.py#L240).

for consistency with that we created GymPreprocessing. we'll see if we can simplify the interface with non-atari gym environments so that it's easier to use.
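
for reference, GymPreprocessing is essentially a thin pass-through wrapper around the env. a simplified sketch (details elided; see the actual class in gym_lib.py):

class GymPreprocessing(object):
  """Thin wrapper exposing a gym env to dopamine's runner."""

  def __init__(self, environment):
    self.environment = environment
    self.game_over = False

  @property
  def observation_space(self):
    return self.environment.observation_space

  @property
  def action_space(self):
    return self.environment.action_space

  def reset(self):
    return self.environment.reset()

  def step(self, action):
    observation, reward, game_over, info = self.environment.step(action)
    self.game_over = game_over
    return observation, reward, game_over, info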

psc-g · Feb 12 '19 13:02

That would be very helpful. Thanks!

jarlva · Feb 12 '19 13:02

Hello, is there a plan to remove the duplicate settings to make this more streamlined and easier to use with Gym?

jarlva · Apr 24 '19 10:04

I'm doing the same thing as @jheffez and I find it difficult to adapt the code with all these duplicate settings. I've successfully adapted my custom discrete gym environment to the rllib library and it works fine; now I'm trying to do the same for dopamine, struggling a little bit.

fbbfnc · Apr 30 '19 10:04

Echoing @psc-g's earlier comment, I agree GymPreprocessing needs some streamlining. This is on our plate, but if you have a solution ready for GymPreprocessing in particular, a PR is also welcome.

mgbellemare · May 01 '19 21:05

@mgbellemare, thanks for the update! Ideally, GymPreprocessing will make Dopamine a drop-in for Gym.

Can you please share your roadmap/plans with us? For example, adding the latest RL techniques (like SimPLe), and whether Dopamine and tensorflow/agents can benefit from synergy?

jarlva · May 02 '19 07:05

hi, i've started looking into this. a few points:

  1. we could inherit things like min_vals and max_vals from gym to avoid the redundancy. is that mostly what you're after? (a rough sketch follows this list.)
  2. the observation shapes and dtypes are a little trickier, as they need to be defined as gin constants so that we can inject them via the respective gin config files. i think there's a way to get this to fetch the values from gym, but it would add indirection, potentially at the risk of clarity.
  3. with regards to roadmap/plans, we have lots we'd like to do, but limited time :)
  4. we are in discussions with tensorflow/agents to see how dopamine and tf/agents could be more compatible, stay tuned!
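
to make point 1 concrete, here's a rough, hypothetical sketch (not committed code) of how create_gym_environment could derive the env-specific settings from gym itself rather than from gin constants:

import gym
import tensorflow as tf

def create_gym_environment(environment_name='CartPole', version='v0'):
  # hypothetical variant: read settings off the env instead of gin constants
  env = gym.make('{}-{}'.format(environment_name, version))
  env = env.env  # strip the TimeLimit wrapper
  space = env.observation_space
  settings = {
      'min_vals': space.low,                    # instead of CARTPOLE_MIN_VALS
      'max_vals': space.high,                   # instead of CARTPOLE_MAX_VALS
      'observation_shape': space.shape + (1,),  # instead of the gin constant
      'observation_dtype': tf.as_dtype(space.dtype),
  }
  return env, settings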

psc-g · May 03 '19 13:05

Good luck! I am always with dopamine and will try my best to contribute too someday.

iteachmachines · May 03 '19 13:05

Hi @psc-g, regarding points 1 and 2: OpenAI Gym exposes all of this. For example, the action space is easily accessible via env.env.action_space. To get the number of actions:

import gym
from gym import spaces
from gym.utils import seeding

action_dim = env.env.action_space.n

To get the state size:

state_dim = env.get_state_size()

Take a look at this for more examples. So it's beneficial, and simple, for Dopamine to grab all that good stuff directly from Gym.
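
Note that get_state_size() comes from that example's own wrapper rather than the core Gym API; with a plain Gym env, the usual equivalent is:

state_dim = env.observation_space.shape[0]  # e.g. 4 for CartPole-v0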

  3. Do share...
  4. Great! Synergies are a good thing!

jarlva · May 04 '19 10:05