QDax
Double check optimizer state re-initialisation during PGAME training
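I have a doubt about how the optimizer states are used in the PG mutation of PGAME. I suspect the optimizer state is not properly re-initialised from one PG mutation to the next, so optimizer statistics accumulated while mutating one individual could carry over into the mutation of another. I think it's worth double-checking this.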