Questions about the RNN-related documentation
When using the multi-agent MAPPO algorithm, I tried replacing Basic_MLP with Basic_RNN and set use_rnn: True in the config file accordingly, after which I get the following error:
Traceback (most recent call last):
File "/Users/hawkq/Desktop/frigatebird_multi/new_run.py", line 22, in <module>
Agent.train(configs.running_steps // configs.parallels) # Train the model for numerous steps.
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/core/on_policy_marl.py", line 287, in train
self.run_episodes(None, n_episodes=self.n_envs, test_mode=False)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/core/on_policy_marl.py", line 384, in run_episodes
policy_out = self.action(obs_dict=obs_dict, state=state, avail_actions_dict=avail_actions,
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/mappo_agents.py", line 141, in action
rnn_hidden_critic_new, values_out = self.policy.get_values(observation=critic_input,
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/policies/gaussian_marl.py", line 176, in get_values
outputs = self.critic_representation[key](observation[key], *rnn_hidden[key])
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/representations/rnn.py", line 63, in forward
output, hn = self.rnn(mlp_output, h)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/rnn.py", line 1117, in forward
raise RuntimeError(
RuntimeError: For unbatched 2-D input, hx should also be 2-D but got 3-D tensor
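For context, this PyTorch error can be reproduced in isolation: an nn.GRU given unbatched 2-D input requires an unbatched 2-D hidden state, so the traceback suggests the critic is passing a batched (3-D) hidden state alongside an unbatched observation. A minimal sketch (the sizes 14 and 64 merely mirror the observation space and config in this thread):

```python
import torch
import torch.nn as nn

# Minimal reproduction of the shape mismatch, outside XuanCe.
gru = nn.GRU(input_size=14, hidden_size=64, num_layers=1)

x = torch.zeros(5, 14)         # unbatched input: (seq_len, input_size)
h_ok = torch.zeros(1, 64)      # unbatched hidden: (num_layers, hidden_size)
out, hn = gru(x, h_ok)         # runs fine

h_bad = torch.zeros(1, 1, 64)  # batched hidden: (num_layers, batch, hidden_size)
try:
    gru(x, h_bad)              # 2-D input + 3-D hidden -> RuntimeError
except RuntimeError as e:
    print(e)
```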
It looks like a data-dimension problem. I searched the documentation but found no section covering this, and I don't know what else in my environment code needs to change. Here are my action, state, and observation spaces:
self.state_space = Box(-np.inf, np.inf, shape=[7 * self.num_agents, ], dtype=np.float32)
self.observation_space = {agent: Box(-np.inf, np.inf, shape=[14, ], dtype=np.float32) for agent in self.agents}
self.action_space = {agent: Box(-1, 1, shape=[2, ], dtype=np.float32) for agent in self.agents}
What else needs to be adjusted? Thanks for your help!
Hi, to change the representation to an RNN, you need to set the following options:
use_rnn: True
rnn: "GRU"
recurrent_layer_N: 1
fc_hidden_sizes: [64, ]
recurrent_hidden_size: 64
N_recurrent_layers: 1
dropout: 0
Hi, after modifying the config as described, I get the following error:
Traceback (most recent call last):
File "/Users/hawkq/Desktop/frigatebird_multi/new_run.py", line 27, in <module>
Agent = MAPPO_Agents(config=configs, envs=envs) # Create a DDPG agent from XuanCe.
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/mappo_agents.py", line 24, in __init__
super(MAPPO_Agents, self).__init__(config, envs)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/ippo_agents.py", line 24, in __init__
self.policy = self._build_policy() # build policy
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/mappo_agents.py", line 38, in _build_policy
A_representation = self._build_representation(self.config.representation, self.observation_space, self.config)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/base/agents_marl.py", line 217, in _build_representation
representation[key] = REGISTRY_Representation[representation_key](**input_representations)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/representations/mlp.py", line 40, in __init__
self.output_shapes = {'state': (hidden_sizes[-1],)}
KeyError: -1
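For context, a `KeyError: -1` at `hidden_sizes[-1]` is the signature of list-style indexing applied to a dict. A plausible (unverified) reading of the traceback is that with `use_rnn: True` XuanCe packs the layer sizes into a dict, which `Basic_MLP.__init__` then indexes as if it were a list; the key names below are assumptions for illustration only:

```python
# What Basic_MLP expects: a list of layer widths.
hidden_sizes_list = [64]
assert hidden_sizes_list[-1] == 64      # negative index works on a list

# Hypothetical dict built for the RNN case (key names assumed):
hidden_sizes_dict = {"fc_hidden_sizes": [64], "recurrent_hidden_size": 64}
try:
    hidden_sizes_dict[-1]               # dicts have no positional index
except KeyError as e:
    print(e)                            # -1
```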
Could it be that my state data format needs to change?
You can compare your file against the parameter configuration here and check whether the format matches: https://github.com/agi-brain/xuance/blob/master/examples/mappo/mappo_mpe_configs/simple_spread_v3.yaml
My config file was adapted from simple_spread_v3.yaml, and using that yaml directly also raises:
RuntimeError: For unbatched 2-D input, hx should also be 2-D but got 3-D tensor
In fact, running the MPE test directly with simple_spread_v3.yaml also fails once the RNN-related settings are enabled:
Traceback (most recent call last):
File "/Users/hawkq/Desktop/frigatebird_multi/testrun.py", line 13, in <module>
runner.run()
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/runners/runner_marl.py", line 32, in run
self.agents.train(n_train_steps)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/core/on_policy_marl.py", line 287, in train
self.run_episodes(None, n_episodes=self.n_envs, test_mode=False)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/core/on_policy_marl.py", line 420, in run_episodes
_, value_next = self.values_next(i_env=i, obs_dict=obs_dict[i], state=state[i],
TypeError: 'NoneType' object is not subscriptable
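For context, `'NoneType' object is not subscriptable` here means `state` itself is `None` at the point where `run_episodes` evaluates `state[i]`, i.e. the vectorized environment returned no global state. A minimal sketch of the failure mode:

```python
# Minimal reproduction: indexing a None state, as in state[i] above.
state = None  # what the env wrapper apparently returned instead of a state array
try:
    state[0]
except TypeError as e:
    print(e)  # 'NoneType' object is not subscriptable
```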
Hi, have you tested algorithms such as VDN or MADDPG? Do they show the same problem? That would help me pinpoint which stage the issue comes from.
Hi, since readthedocs provides only a limited set of config files, I modified just the MADDPG config file and added the following:
agent: "MADDPG" # the learning algorithms_marl
env_name: "fb"
env_id: "fb_v0"
env_seed: 1
continuous_action: True
learner: "MADDPG_Learner"
policy: "MADDPG_Policy"
representation: "Basic_RNN"
vectorize: "DummyVecMultiAgentEnv"
runner: "MARL"
distributed_training: False
use_rnn: True
rnn: "GRU"
recurrent_layer_N: 1
fc_hidden_sizes: [64, ]
recurrent_hidden_size: 64
N_recurrent_layers: 1
dropout: 0
representation_hidden_size: [] # the units for each hidden layer
actor_hidden_size: [64, ]
critic_hidden_size: [64, ]
activation: 'leaky_relu'
activation_action: 'sigmoid'
use_parameter_sharing: True
use_actions_mask: False
MADDPG runs, but even with a single environment there is heavy disk I/O; I'm not sure whether that is inherent to RNN training.
For VDN, the key parts of the config file are:
agent: "VDN"
env_name: "fb"
env_id: "fb_v0"
env_seed: 1
continuous_action: True
learner: "VDN_Learner"
policy: "Mixing_Q_network"
representation: "Basic_MLP"
vectorize: "DummyVecMultiAgentEnv"
runner: "MARL"
distributed_training: False
use_rnn: True
rnn: "GRU"
recurrent_layer_N: 1
fc_hidden_sizes: [64, ]
recurrent_hidden_size: 64
N_recurrent_layers: 1
dropout: 0
representation_hidden_size: [64, ]
q_hidden_size: [64, ] # the units for each hidden layer
activation: "relu"
This raises the following error:
Traceback (most recent call last):
File "/Users/hawkq/Desktop/frigatebird_multi/new_run.py", line 29, in <module>
Agent = VDN_Agents(config=configs, envs=envs)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/vdn_agents.py", line 27, in __init__
self.policy = self._build_policy() # build policy
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/vdn_agents.py", line 44, in _build_policy
representation = self._build_representation(self.config.representation, self.observation_space, self.config)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/base/agents_marl.py", line 217, in _build_representation
representation[key] = REGISTRY_Representation[representation_key](**input_representations)
File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/representations/mlp.py", line 40, in __init__
self.output_shapes = {'state': (hidden_sizes[-1],)}
KeyError: -1
This is the same error as reported above.
Hi, please make sure the representation parameter is set to "Basic_RNN":
representation: "Basic_RNN"
After changing VDN's config to representation: "Basic_RNN", it errored because my actions are continuous; after a quick switch to discrete actions it ran.
VDN only supports discrete actions.
Hi, have you tested algorithms such as VDN or MADDPG? Do they show the same problem? That would help me pinpoint which stage the issue comes from.
After testing: VDN runs, MADDPG runs, but MAPPO errors out:
Traceback (most recent call last):
File "C:\Users\HawkQ\Desktop\frigatebird_multi\new_run.py", line 30, in <module>
Agent.train(configs.running_steps // configs.parallels) # Train the model for numerous steps.
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\agents\core\on_policy_marl.py", line 287, in train
self.run_episodes(None, n_episodes=self.n_envs, test_mode=False)
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\agents\core\on_policy_marl.py", line 384, in run_episodes
policy_out = self.action(obs_dict=obs_dict, state=state, avail_actions_dict=avail_actions,
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\agents\multi_agent_rl\mappo_agents.py", line 141, in action
rnn_hidden_critic_new, values_out = self.policy.get_values(observation=critic_input,
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\policies\gaussian_marl.py", line 176, in get_values
outputs = self.critic_representation[key](observation[key], *rnn_hidden[key])
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\representations\rnn.py", line 63, in forward
output, hn = self.rnn(mlp_output, h)
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\rnn.py", line 1117, in forward
raise RuntimeError(
RuntimeError: For unbatched 2-D input, hx should also be 2-D but got 3-D tensor
Hi, has your problem been solved?
Not yet, unfortunately; for now I'm simply training without the RNN.