
Compatibility with openai/baselines


Are these environments compatible with the OpenAI baselines implementations?

At first sight, it looks like the agents in openai/baselines don't support environments that expose their observation and action spaces as lists (one entry per agent), which is how MultiAgentEnv is structured.

For example, the code below produces the following exception:

~/tmp/baselines/baselines/deepq/deepq.py in learn(env, network, seed, lr, total_timesteps, buffer_size, exploration_fraction, exploration_final_eps, train_freq, batch_size, print_freq, checkpoint_freq, checkpoint_path, learning_starts, gamma, target_network_update_freq, prioritized_replay, prioritized_replay_alpha, prioritized_replay_beta0, prioritized_replay_beta_iters, prioritized_replay_eps, param_noise, callback, load_path, **network_kwargs)
    202         make_obs_ph=make_obs_ph,
    203         q_func=q_func,
--> 204         num_actions=env.action_space.n,
    205         optimizer=tf.train.AdamOptimizer(learning_rate=lr),
    206         gamma=gamma,

AttributeError: 'list' object has no attribute 'n'
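For contrast, baselines' deepq expects a standard single-agent Gym interface, where action_space is a single Discrete space with an .n attribute. A minimal sketch (CartPole-v1 is just an arbitrary Discrete-action example, unrelated to this repo):

import gym

env = gym.make("CartPole-v1")
print(env.action_space)    # Discrete(2): a single space, not a per-agent list
print(env.action_space.n)  # 2, the attribute deepq.learn reads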

Code that instantiates a baselines agent with a multi-agent environment:

from baselines.common.vec_env.subproc_vec_env import SubprocVecEnv
from baselines.run import get_learn_function

from multiagent.environment import MultiAgentEnv
import multiagent.scenarios as scenarios

common_kwargs = dict(total_timesteps=30000, network="mlp", gamma=1.0, seed=0)

learn_kwargs = {
    'a2c' : dict(nsteps=32, value_network='copy', lr=0.05),
    'acktr': dict(nsteps=32, value_network='copy'),
    'deepq': dict(total_timesteps=20000),
    'ppo2': dict(value_network='copy'),
    'trpo_mpi': {}
}
alg = "deepq"

kwargs = common_kwargs.copy()
kwargs.update(learn_kwargs[alg])
learn_fn = lambda e: get_learn_function(alg)(env=e, **kwargs)

def env_fn():
    # Build a MultiAgentEnv for the simple_tag scenario
    scenario = scenarios.load("simple_tag.py").Scenario()
    world = scenario.make_world()
    env = MultiAgentEnv(world, scenario.reset_world, scenario.reward,
                        scenario.observation, scenario.benchmark_data)
    return env

env = SubprocVecEnv([env_fn])
model = learn_fn(env)  # raises the AttributeError shown above
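To make the mismatch concrete, here is a minimal sketch that reuses env_fn from above and inspects what MultiAgentEnv actually exposes (the space sizes in the comments are scenario-dependent and only illustrative):

raw_env = env_fn()
print(type(raw_env.action_space))       # <class 'list'>: one space per agent
print(raw_env.action_space[0])          # each entry is a gym Space, e.g. Discrete(5)
print(type(raw_env.observation_space))  # also a list, one Box per agent
# deepq reads env.action_space.n, which fails on the list above,
# hence the AttributeError.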

mmmikael · Sep 26 '18 19:09

Hello @mmmikael. I too want to use baselines in order to have the latest algorithms available for my multi-agent machine learning project. Did you ever find a solution for this, or code a workaround?

Zamiell · Jan 29 '19 05:01

Hi @mmmikael, please share the solution if you were able to get multi-agent environments working with openai/baselines.

Usmaniatech · Jul 24 '19 10:07

@mmmikael @Zamiell @Usmaniatech did any of you manage to resolve the issue? If so, please share the solution.

indhra · Apr 22 '20 17:04