
The program always executes behavior cloning when running CQLLearner.

Open jiangjiadi opened this issue 2 years ago • 0 comments

When I run the CQL algorithm, it only ever executes behavior cloning. I checked the config I used: the number of training steps is 100 and 'num_bc_iters' is set to 50, so the learner should leave the behavior cloning phase after 50 steps. Digging into the source code of CQLLearner, I found that the 'counts' in the function 'step' has two keys, "steps" and "walltime". However, the implementation of 'step' reads the key "learner_steps". Because that key does not exist, "cur_step" is always 0, so the algorithm only executes behavior cloning. When I change the key "learner_steps" to "steps", the problem is solved.
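To illustrate the failure mode, here is a minimal pure-Python sketch. It is not the actual Acme code: the function `count_phases`, the use of `dict.get` with a default of 0, and the loop structure are assumptions made for illustration. Only the key names and the config values (100 training steps, 'num_bc_iters' of 50) come from the report above.

```python
NUM_BC_ITERS = 50      # 'num_bc_iters' from the config in this report
TRAINING_STEPS = 100   # number of training steps from the config


def count_phases(step_key):
    """Count how many steps run in each phase (hypothetical helper).

    `step_key` is the key used to look up the current step in `counts`.
    """
    # The counter only tracks these two keys, as observed in the report.
    counts = {"steps": 0, "walltime": 0.0}
    bc_steps = 0
    cql_steps = 0
    for _ in range(TRAINING_STEPS):
        # Assumption: a missing key silently falls back to 0.
        cur_step = counts.get(step_key, 0)
        if cur_step < NUM_BC_ITERS:
            bc_steps += 1   # behavior cloning phase
        else:
            cql_steps += 1  # regular CQL phase
        counts["steps"] += 1
    return bc_steps, cql_steps


print(count_phases("learner_steps"))  # missing key: every step is behavior cloning
print(count_phases("steps"))          # existing key: phase switches after 50 steps
```

With the missing key, all 100 steps stay in the behavior cloning branch. With "steps", the first 50 steps are behavior cloning and the remaining 50 are not.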

jiangjiadi avatar Jul 06 '23 06:07 jiangjiadi