Zhendong Wang
@dssrgu Did you get results similar to the values reported in the D4RL paper? I tried both the paper hyperparameters (policy_lr = 3e-5, lagrange_thresh = 10.0) and the ones recommended in this GitHub repo (policy_lr...
@cangcn Actually, with the GitHub code and the hyperparameters recommended in the README file, I cannot reproduce the results reported in the D4RL paper, even on Gym tasks. I tried both...
I am not sure which model selection method you are using. If you are using offline model selection, training should stop before the critic values go crazy. For the online setting,...
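For the offline case, a minimal sketch of the stopping check could look like the following (the window and threshold are illustrative assumptions, not values from the repository):

```python
# Hedged sketch: flag critic divergence so offline training can stop early.
import numpy as np

def critic_diverged(q_value_history, window=10, threshold=1e4):
    """Return True once the recent mean Q-value exceeds a sanity threshold.

    q_value_history: mean Q-values logged once per epoch.
    window/threshold are illustrative choices, not tuned defaults.
    """
    if len(q_value_history) < window:
        return False
    return float(np.mean(q_value_history[-window:])) > threshold
```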
To be honest, I am not very sure. Since we built our code upon StyleGAN2 (the NVIDIA code base), we added the copyright notice at the very top.
We follow the same procedure as the StyleGAN paper, and describe it here: https://github.com/Zhendong-Wang/Diffusion-GAN?tab=readme-ov-file#data-preparation.
The performance curves for the antmaze tasks exhibit fluctuations, which might be attributed to the aggressive Q-learning. We proposed an offline selection method based on the diffusion loss to let it stop...
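Concretely, the offline selection can be sketched as picking the saved checkpoint with the lowest diffusion (behavior-cloning) loss on the offline dataset; the `evaluate_bc_loss` helper and checkpoint layout below are assumptions for illustration, not the repository's exact code:

```python
# Hedged sketch: offline model selection by diffusion/BC loss.
import torch

def select_checkpoint(checkpoint_paths, evaluate_bc_loss):
    """Return the checkpoint whose diffusion/BC loss on the offline data is lowest."""
    best_path, best_loss = None, float("inf")
    for path in checkpoint_paths:
        state = torch.load(path, map_location="cpu")
        loss = evaluate_bc_loss(state)  # average diffusion loss over the offline dataset
        if loss < best_loss:
            best_path, best_loss = path, loss
    return best_path
```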
It is a parameter that controls how fast `diffusion.p` is adjusted. If `ada_kimg` is large, `diffusion.p` will increase and decrease slowly. `ada_kimg` can also be seen as how many...
We use 100, which is the default value in the code.
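As a rough sketch of how `ada_kimg` acts as a time constant (following the StyleGAN2-ADA-style heuristic that Diffusion-GAN builds on; the names and exact update below are illustrative, not the repository's code):

```python
# Hedged sketch: ada_kimg roughly sets how many thousands of images it takes
# for diffusion.p to move across its full range, so larger ada_kimg = slower changes.
import numpy as np

def update_p(p, r_d, batch_size, ada_interval=4, ada_target=0.6, ada_kimg=100):
    """Nudge diffusion.p so the discriminator signal r_d tracks ada_target."""
    step = batch_size * ada_interval / (ada_kimg * 1000)
    p = p + np.sign(r_d - ada_target) * step
    return float(np.clip(p, 0.0, 1.0))
```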
We provide detailed steps for the simple plug-in of Diffusion-GAN in https://github.com/Zhendong-Wang/Diffusion-GAN#simple-plug-in. Generally there are only three steps: 1. prepare diffusion.py; 2. use diffusion.py to augment the inputs to the discriminator;...
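To make step 2 concrete, here is a minimal, self-contained sketch of the idea; the class and call signature are illustrative and not the exact `diffusion.py` API:

```python
# Hedged sketch of the plug-in: forward-diffuse both real and generated images
# before the discriminator sees them, and condition D on the sampled timestep t.
import torch

class SimpleForwardDiffusion:
    """Minimal Gaussian forward diffusion q(y_t | y_0) with a linear beta schedule."""
    def __init__(self, t_max=500, beta_min=1e-4, beta_max=2e-2):
        betas = torch.linspace(beta_min, beta_max, t_max)
        self.alpha_bar = torch.cumprod(1.0 - betas, dim=0)

    def __call__(self, y0, t_cap):
        # Sample a timestep below the current cap (which diffusion.p controls in the repo).
        t = torch.randint(0, max(int(t_cap), 1), (y0.shape[0],))
        a = self.alpha_bar[t].view(-1, 1, 1, 1).to(y0.device)
        noise = torch.randn_like(y0)
        return a.sqrt() * y0 + (1.0 - a).sqrt() * noise, t

diffusion = SimpleForwardDiffusion()

def d_logits(D, images, t_cap):
    # Step 2: discriminate the diffused samples y_t instead of the clean images,
    # for both real and generated batches.
    y_t, t = diffusion(images, t_cap)
    return D(y_t, t)
```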
During my training it is usually stable, especially for MuJoCo tasks. Which environment are you testing? If it happens, maybe you can monitor the Q-value function loss. Usually, decreasing the...