on-policy issues

Results 25 on-policy issues


Model...

Why does the code save only the latest model instead of the best-performing one? If I want to save the best model, what should I add? Thanks!

bitbjt
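
One common way to keep the best-performing checkpoint is to track the highest evaluation score seen so far and save only when it improves. The sketch below is illustrative and is not taken from the on-policy repository: `BestModelTracker` and `save_fn` are hypothetical names, and in a real training loop `save_fn` would wrap whatever call writes the model to disk (for example `torch.save` on the policy's state dict).

```python
import math
from typing import Callable, List


class BestModelTracker:
    """Hypothetical helper: calls save_fn only when the score improves."""

    def __init__(self, save_fn: Callable[[], None]) -> None:
        self.save_fn = save_fn
        self.best_score = -math.inf  # nothing evaluated yet

    def update(self, score: float) -> bool:
        """Record an evaluation score; save and return True if it is a new best."""
        if score > self.best_score:
            self.best_score = score
            self.save_fn()
            return True
        return False


# Stand-in for a real save call: record the score at which a save happened.
saved_at: List[float] = []
tracker = BestModelTracker(save_fn=lambda: saved_at.append(tracker.best_score))

# Made-up evaluation scores, one per evaluation interval.
for score in [0.2, 0.5, 0.4, 0.9]:
    tracker.update(score)
```

With these made-up scores, a save is triggered at the first, second, and fourth evaluations; the third is skipped because it does not beat the previous best.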

Error when running ./train_mpe_spread.sh

When I tried to run ./train_mpe_spread.sh, I got the following error: ``` obs_space: [Box(18,), Box(18,), Box(18,)] share_obs_space: [Box(54,), Box(54,), Box(54,)] act_space: [Discrete(5), Discrete(5), Discrete(5)] Traceback (most recent call last): File...

ChuangZhang1999

When I tried to train the code for smacv2, I encountered this error: AssertionError: check recurrent policy!

The detailed errors are as follows: Traceback (most recent call last): File "../train/train_smac.py", line 260, in main(sys.argv[1:]) File "../train/train_smac.py", line 138, in main "check recurrent policy!") AssertionError: check recurrent policy!...

chenzihan1

missing "onpolicy.runner.separated.hanabi_runner_forward"

Hi there. Thanks for the great repository. One question: I see here https://github.com/marlbenchmark/on-policy/blob/4769caf56a9b2ccb90866ae56f1d9c804432e63b/onpolicy/scripts/train/train_hanabi_forward.py#L162 that there is supposed to be a separated `Runner` for Hanabi, but I can't find the file,...

AlbertoSinigaglia

question about replay buffer size in MAPPO

Thank you for your contribution to the RL community. I have some questions about the replay buffer in both the shared and separated buffer settings. When I am training, I...

Gloriabhsfer

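How do I save a replay?

As the title says: I found the save_replay() function in StarCraft2_Env.py, and it looks the same as the function in the SMAC source code, but how should I use it? The README in the official SMAC code is vague about this. Do you know how to do it? Many thanks!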

BUPT-zeld151

init() got multiple values for argument 'device'

I ran ./onpolicy/scripts/train_mpe_scripts/train_mpe_spread.sh after changing 'algo' to mappo and user_name to my wandb user name in train_mpe_spread.sh. My train_mpe_spread.sh is as follows: ```text #!/bin/sh env="MPE" scenario="simple_spread" num_landmarks=3 num_agents=3 algo="mappo" #"rmappo"...

colourfulspring
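
The excerpt above is cut off before the traceback, so the actual cause in the repository is not shown here. As a general illustration, Python raises this error when a parameter is filled once positionally and again by keyword. The class below is hypothetical and only mimics the shape of a constructor that takes a `device` argument.

```python
class Policy:
    """Hypothetical class; not the repository's policy implementation."""

    def __init__(self, args, obs_space, device="cpu"):
        self.args = args
        self.obs_space = obs_space
        self.device = device


message = ""
try:
    # The third positional argument already binds to `device`,
    # so passing device= again by keyword is a conflict.
    Policy("args", "obs_space", "extra_positional", device="cpu")
except TypeError as err:
    message = str(err)

# Passing each parameter exactly once works as expected.
ok = Policy("args", "obs_space", device="cpu")
```

When this error appears, a reasonable first check is whether the call site passes more positional arguments than the constructor signature expects before `device`.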

Questions on the episode length of 1000 on gfootball env instead of a maximum env limit of 400

the value for the available_actions

fixed bug in initializing ValueNorm
