Zhendong Wang
@dssrgu Did you get results similar to the values reported in the D4RL paper? I tried both the paper hyperparameters (policy_lr = 3e-5, lagrange_thresh = 10.0) and the ones recommended in this GitHub repo (policy_lr...
@cangcn Actually, with the GitHub code and the hyperparameters recommended in the README file, I cannot reproduce the results reported in the D4RL paper, even on Gym tasks. I tried both...
I am not sure which model selection method you are using. If you are using offline model selection, training should stop before the critic values go crazy. For the online setting,...
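For the offline case, a minimal sketch of the stopping check could look like the following (the window and threshold are illustrative assumptions, not values from the repository):

```python
# Hedged sketch: flag critic divergence so offline training can stop early.
import numpy as np

def critic_diverged(q_value_history, window=10, threshold=1e4):
    """Return True once the recent mean Q-value exceeds a sanity threshold.

    q_value_history: mean Q-values logged once per epoch.
    window/threshold are illustrative choices, not tuned defaults.
    """
    if len(q_value_history) < window:
        return False
    return float(np.mean(q_value_history[-window:])) > threshold
```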
To be honest, I am not very sure. Since we built our code upon StyleGAN2 (the NVIDIA code base), we added the copyright notice at the very top.
We follow the same procedure as the StyleGAN paper, and describe it here: https://github.com/Zhendong-Wang/Diffusion-GAN?tab=readme-ov-file#data-preparation.
The performance curves for the antmaze tasks exhibit fluctuations, which might be attributed to the aggressive Q-learning. We proposed an offline selection method based on the diffusion loss to let it stop...
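Concretely, the offline selection can be sketched as picking the saved checkpoint with the lowest diffusion (behavior-cloning) loss on the offline dataset; the `evaluate_bc_loss` helper and checkpoint layout below are assumptions for illustration, not the repository's exact code:

```python
# Hedged sketch: offline model selection by diffusion/BC loss.
import torch

def select_checkpoint(checkpoint_paths, evaluate_bc_loss):
    """Return the checkpoint whose diffusion/BC loss on the offline data is lowest."""
    best_path, best_loss = None, float("inf")
    for path in checkpoint_paths:
        state = torch.load(path, map_location="cpu")
        loss = evaluate_bc_loss(state)  # average diffusion loss over the offline dataset
        if loss < best_loss:
            best_path, best_loss = path, loss
    return best_path
```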
It is a parameter that controls how fast `diffusion.p` is adjusted. If `ada_kimg` is large, `diffusion.p` will increase and decrease slowly. `ada_kimg` can also be seen as how many...
We use 100, which is the default value in the code.
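As a rough sketch of how `ada_kimg` acts as a time constant (following the StyleGAN2-ADA-style heuristic that Diffusion-GAN builds on; the names and exact update below are illustrative, not the repository's code):

```python
# Hedged sketch: ada_kimg roughly sets how many thousands of images it takes
# for diffusion.p to move across its full range, so larger ada_kimg = slower changes.
import numpy as np

def update_p(p, r_d, batch_size, ada_interval=4, ada_target=0.6, ada_kimg=100):
    """Nudge diffusion.p so the discriminator signal r_d tracks ada_target."""
    step = batch_size * ada_interval / (ada_kimg * 1000)
    p = p + np.sign(r_d - ada_target) * step
    return float(np.clip(p, 0.0, 1.0))
```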
We provide detailed steps for the simple plug-in of Diffusion-GAN in https://github.com/Zhendong-Wang/Diffusion-GAN#simple-plug-in. Generally there are only three steps: 1. prepare diffusion.py; 2. use diffusion.py to augment the inputs to the discriminator;...
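To make step 2 concrete, here is a minimal, self-contained sketch of the idea; the class and call signature are illustrative and not the exact `diffusion.py` API:

```python
# Hedged sketch of the plug-in: forward-diffuse both real and generated images
# before the discriminator sees them, and condition D on the sampled timestep t.
import torch

class SimpleForwardDiffusion:
    """Minimal Gaussian forward diffusion q(y_t | y_0) with a linear beta schedule."""
    def __init__(self, t_max=500, beta_min=1e-4, beta_max=2e-2):
        betas = torch.linspace(beta_min, beta_max, t_max)
        self.alpha_bar = torch.cumprod(1.0 - betas, dim=0)

    def __call__(self, y0, t_cap):
        # Sample a timestep below the current cap (which diffusion.p controls in the repo).
        t = torch.randint(0, max(int(t_cap), 1), (y0.shape[0],))
        a = self.alpha_bar[t].view(-1, 1, 1, 1).to(y0.device)
        noise = torch.randn_like(y0)
        return a.sqrt() * y0 + (1.0 - a).sqrt() * noise, t

diffusion = SimpleForwardDiffusion()

def d_logits(D, images, t_cap):
    # Step 2: discriminate the diffused samples y_t instead of the clean images,
    # for both real and generated batches.
    y_t, t = diffusion(images, t_cap)
    return D(y_t, t)
```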
During my training it is usually stable, especially for MuJoCo tasks. Which environment are you testing? If it happens, maybe you can monitor the Q-value function loss. Usually, decreasing the...