Th Real Joker

Results: 1 comment by Th Real Joker

I found the root cause of the issue: when I connect `policy_forward` to the collector (`train_kwargs["policy"] = self.policy_forward`), the collector spawns another process on every iteration (step) and runs the...