Multi-agent-reinforcement-learning issues

????????????????

2

maddpg.py

    def update(self, batch_size):
        (obs_batch, indiv_action_batch, indiv_reward_batch, next_obs_batch,
         global_state_batch, global_actions_batch, global_next_state_batch,
         done_batch) = self.replay_buffer.sample(batch_size)

        for i in range(self.num_agents):
            obs_batch_i = obs_batch[i]
            indiv_action_batch_i = indiv_action_batch[i]
            indiv_reward_batch_i = indiv_reward_batch[i]
            next_obs_batch_i = next_obs_batch[i]
            ...
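For readers trying to follow the excerpt, here is a minimal sketch of a replay buffer whose `sample` returns the eight values in the order `update` unpacks them. Only that order and the per-agent indexing (`obs_batch[i]`) come from the excerpt. The class name `MultiAgentReplayBuffer`, the `push` signature, the single shared `done` flag, and the choice to build the global state and action by concatenating every agent's observation and action are assumptions for illustration, not the repository's code.

```python
import random
from collections import deque


class MultiAgentReplayBuffer:
    """Hypothetical buffer; only the order of sample()'s return values follows the excerpt."""

    def __init__(self, num_agents, max_size=10000):
        self.num_agents = num_agents
        self.buffer = deque(maxlen=max_size)

    def push(self, obs, actions, rewards, next_obs, done):
        # obs, actions, rewards, next_obs each hold one entry per agent.
        self.buffer.append((obs, actions, rewards, next_obs, done))

    def sample(self, batch_size):
        batch = random.sample(list(self.buffer), batch_size)
        n = self.num_agents
        # Per-agent batches: outer index is the agent, inner index is the sample.
        obs_batch = [[] for _ in range(n)]
        indiv_action_batch = [[] for _ in range(n)]
        indiv_reward_batch = [[] for _ in range(n)]
        next_obs_batch = [[] for _ in range(n)]
        # Global batches: one flat entry per sample.
        global_state_batch, global_actions_batch = [], []
        global_next_state_batch, done_batch = [], []

        for obs, actions, rewards, next_obs, done in batch:
            for i in range(n):
                obs_batch[i].append(obs[i])
                indiv_action_batch[i].append(actions[i])
                indiv_reward_batch[i].append(rewards[i])
                next_obs_batch[i].append(next_obs[i])
            # Assumption: global state/action = concatenation over all agents.
            global_state_batch.append([x for o in obs for x in o])
            global_actions_batch.append([x for a in actions for x in a])
            global_next_state_batch.append([x for o in next_obs for x in o])
            done_batch.append(done)

        return (obs_batch, indiv_action_batch, indiv_reward_batch, next_obs_batch,
                global_state_batch, global_actions_batch, global_next_state_batch,
                done_batch)
```

With this layout, `obs_batch[i]` inside the loop is the list of sampled observations for agent `i`, while the `global_*` batches carry the joint information that a centralized critic would consume.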

yyds-xtt

why?????????

yyds-xtt

why?????????????

yyds-xtt

?????????????

yyds-xtt

??????????????

yyds-xtt

code error

yyds-xtt

Regarding policies and reward functions

1

Hi, may I ask how you define more than one policy and reward function concurrently in a multi-agent setting? Thank you.
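One common pattern, sketched here under stated assumptions rather than taken from this repository, is to keep one policy and one reward function per agent in dictionaries keyed by agent name. At every step all agents then act on the same pre-step state. The toy one-dimensional pursuit task and every name below (`POLICIES`, `REWARD_FNS`, `step`, the two agents) are hypothetical.

```python
def pursuer_policy(positions):
    # Move one unit toward the evader.
    return 1 if positions["evader"] > positions["pursuer"] else -1


def evader_policy(positions):
    # Move one unit away from the pursuer.
    return 1 if positions["evader"] >= positions["pursuer"] else -1


def pursuer_reward(positions):
    # The pursuer is rewarded for closing the gap.
    return -abs(positions["pursuer"] - positions["evader"])


def evader_reward(positions):
    # The evader is rewarded for widening the gap.
    return abs(positions["pursuer"] - positions["evader"])


# One policy and one reward function per agent, keyed by agent name.
POLICIES = {"pursuer": pursuer_policy, "evader": evader_policy}
REWARD_FNS = {"pursuer": pursuer_reward, "evader": evader_reward}


def step(positions):
    # All agents choose their actions from the same pre-step state.
    actions = {agent: policy(positions) for agent, policy in POLICIES.items()}
    next_positions = {agent: positions[agent] + actions[agent] for agent in positions}
    # Each agent is scored by its own reward function on the resulting state.
    rewards = {agent: fn(next_positions) for agent, fn in REWARD_FNS.items()}
    return next_positions, actions, rewards
```

In a learned setting the hand-written policy functions would be replaced by one trainable actor per agent, and the per-agent rewards would be stored separately, as the `indiv_reward_batch` name in the `update` excerpt suggests.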

zyzhang1130
