HiT-MAC
Why does the reward curve look random?
I followed the instructions in your project to train the executor:
python main.py --env Pose-v1 --model multi-att-shap --workers 6
Here are the TensorBoard results:
The reward curve during training is different from the results in the paper, shown below.
Did I do something wrong? How can I reproduce the reward curve in your paper?