
Copy vs Deepcopy in SAC

Open richardrl opened this issue 6 years ago • 4 comments

I'm using a previous version of SAC (with separate Q and V value functions) but noticed something strange.

There is this line in twin_sac.py: self.target_vf = vf.copy()

Changing it to a deepcopy results in very different training curves (see curves attached):

        from copy import deepcopy
        self.target_vf = deepcopy(vf)

vf.copy() is defined as such:

    def copy(self):
        copy = Serializable.clone(self)
        ptu.copy_model_params_from_to(self, copy)
        return copy

It looks like it should do essentially the same thing as deepcopy, so what's causing the difference?

[Attached image: deepcopy (training curves)]

richardrl avatar Aug 09 '19 02:08 richardrl
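
For reference, here is a minimal sketch of the two ways of building a target network that the post compares, written in plain PyTorch. It assumes that copy() amounts to "rebuild the network, then copy parameter values into it"; that reading comes only from the snippet quoted above, not from the rlkit source. make_vf and clone_then_copy_params are hypothetical names introduced for illustration.

```python
import copy

import torch
import torch.nn as nn


def make_vf(hidden=8):
    # Hypothetical stand-in for the value network; not rlkit's class.
    return nn.Sequential(nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 1))


def clone_then_copy_params(source, factory):
    """Build a fresh network, then copy parameter values into it.

    Assumption: this mirrors the clone + copy_model_params_from_to pattern.
    """
    target = factory()
    with torch.no_grad():
        for tgt, src in zip(target.parameters(), source.parameters()):
            tgt.copy_(src)  # value copy; the tensors stay separate objects
    return target


vf = make_vf()
target_a = clone_then_copy_params(vf, make_vf)  # copy()-style target
target_b = copy.deepcopy(vf)                    # deepcopy-style target

# Both targets should reproduce the source network's outputs.
x = torch.randn(5, 4)
same_a = torch.allclose(vf(x), target_a(x))
same_b = torch.allclose(vf(x), target_b(x))
print(same_a, same_b)
```

On a toy network like this the two targets should behave identically. One known difference between the approaches is that parameters() does not include registered buffers, while deepcopy duplicates the whole module state; whether that matters for the vf in question is not established here.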

Yeah, that's really odd. Can you check if deepcopy copies the weight values as well?

vitchyr avatar Aug 10 '19 00:08 vitchyr

@richardrl Did you ever get around to seeing what's going on? Also, which curve is which?

vitchyr avatar Sep 06 '19 06:09 vitchyr

I have not figured out why this is happening yet. The .copy() is the orange curve, the one that actually trains. This is just on the Fetch robotics pick-and-place task.

richardrl avatar Oct 22 '19 21:10 richardrl

Did you check if deepcopy copies the weights over (as reference or as value)?

vitchyr avatar Oct 26 '19 12:10 vitchyr
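
One way to run the check asked about above (whether deepcopy copies the weights by value or by reference) is sketched below. It uses a plain nn.Linear as a stand-in for vf, so it only shows what to look for; compare_params is a hypothetical helper, and the same call would need to be run on the actual vf and target_vf from twin_sac.py.

```python
import copy

import torch
import torch.nn as nn


def compare_params(original, duplicate):
    """Return (values_equal, shares_storage) over all parameter pairs."""
    values_equal = True
    shares_storage = False
    for p, q in zip(original.parameters(), duplicate.parameters()):
        values_equal = values_equal and torch.equal(p, q)
        # Identical data pointers mean both tensors view the same memory.
        shares_storage = shares_storage or p.data_ptr() == q.data_ptr()
    return values_equal, shares_storage


vf = nn.Linear(4, 1)  # stand-in for the real value network (assumption)
target_vf = copy.deepcopy(vf)
values_equal, shares_storage = compare_params(vf, target_vf)

# Second check: modify the original in place and see if the copy follows.
with torch.no_grad():
    vf.weight.add_(1.0)
still_equal = torch.equal(vf.weight, target_vf.weight)

print(values_equal, shares_storage, still_equal)
```

If the weights were copied by value, the values match right after the copy, the storage is not shared, and the in-place update to vf does not show up in target_vf.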