rlkit
Copy vs Deepcopy in SAC
I'm using an earlier version of SAC (with separate Q and V value functions) and noticed something strange.
There is this line in twin_sac.py:
self.target_vf = vf.copy()
Changing it to a deepcopy results in very different training curves (see curves attached):
from copy import deepcopy
self.target_vf = deepcopy(vf)
vf.copy() is defined as follows:
def copy(self):
    copy = Serializable.clone(self)
    ptu.copy_model_params_from_to(self, copy)
    return copy
It looks like it should do essentially the same thing as deepcopy, so what's causing the difference?

Yeah, that's really odd. Can you check if deepcopy copies the weight values as well?
@richardrl Did you ever get around to seeing what's going on? Also, which curve is which?
I have not figured out why this is happening yet. The .copy() version is the orange curve, the one that actually trains. This is just on the pick-and-place task from the Fetch robotics environments.
Did you check if deepcopy copies the weights over (as reference or as value)?
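One way to run the check asked about above is to compare the copy's parameters against the original twice: once for equal values and once for shared storage. The sketch below is only an illustration under stated assumptions. `ToyNet` and `compare_params` are hypothetical names, and plain Python lists stand in for weight tensors so the snippet runs without rlkit or PyTorch.

```python
from copy import deepcopy


class ToyNet:
    """Hypothetical stand-in for a network: parameters are plain lists."""

    def __init__(self):
        self.params = {"weight": [0.1, -0.2, 0.3], "bias": [0.0]}


def compare_params(src, dst):
    """Return {name: (same_values, shares_storage)} for each parameter."""
    report = {}
    for name, p_src in src.params.items():
        p_dst = dst.params[name]
        # `==` compares contents; `is` checks whether both names point
        # at the very same object (i.e. the copy is only a reference).
        report[name] = (p_src == p_dst, p_src is p_dst)
    return report


vf = ToyNet()
by_value = deepcopy(vf)          # independent copy of every parameter
by_reference = ToyNet()
by_reference.params = vf.params  # aliases the same parameter objects

value_report = compare_params(vf, by_value)
reference_report = compare_params(vf, by_reference)

# Mutating the source in place tells the two cases apart.
vf.params["weight"][0] = 9.0
value_diverged = by_value.params["weight"][0] != 9.0
reference_followed = by_reference.params["weight"][0] == 9.0
```

For a real PyTorch module, the analogous comparison would presumably loop over `named_parameters()` on both networks, using `torch.equal(p_src, p_dst)` for the value check and `p_src.data_ptr() == p_dst.data_ptr()` for the shared-storage check. Running it on both `vf.copy()` and `deepcopy(vf)` right after construction would show whether the two targets start from the same weights.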