Rui

Results: 5 comments by Rui

csxeba is right: A2C and A3C are on-policy methods. Old data were sampled by an old policy, so they clearly do not come from the same distribution as data from the current policy. We usually use a replay buffer...
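To make the distribution mismatch concrete, here is a toy sketch that is not from the original thread. The two-action setup, the reward values, and both policies are assumptions chosen only for illustration. It computes the expected reward exactly rather than by sampling, and shows that averaging rewards under the old policy does not give the value of the new policy unless each term is reweighted by the ratio of new to old action probabilities (importance weighting, a standard correction that the comment itself does not mention).

```python
# Toy two-action example; every number here is an illustrative assumption.
rewards = [1.0, 0.0]      # reward for action 0 and action 1
old_policy = [0.9, 0.1]   # action probabilities of the policy that collected the data
new_policy = [0.5, 0.5]   # action probabilities of the policy being updated


def expected(weights, values):
    """Exact expectation of `values` under the distribution `weights`."""
    return sum(w * v for w, v in zip(weights, values))


# What an on-policy method needs: expected reward under the new policy.
true_value = expected(new_policy, rewards)

# What old data gives without correction: expected reward under the old policy.
naive_estimate = expected(old_policy, rewards)

# Reweighting each action by new/old probability recovers the new-policy value.
ratios = [n / o for n, o in zip(new_policy, old_policy)]
corrected = expected(old_policy, [rho * r for rho, r in zip(ratios, rewards)])
```

Here `naive_estimate` differs from `true_value`, while `corrected` matches it up to floating-point rounding.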

I get the same problem because `taskpath` is not a dict. Use the following code to fix this:

```
if isinstance(taskpath, dict):
    return taskpath['dirname'].split('/')[-1].split('-')[0]
else:
    return taskpath.dirname.split('/')[-1].split('-')[0]
```

...
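The check in the comment above is a fragment from inside a function. A self-contained sketch of the same idea might look like the following. The function name `task_name` and the example paths are hypothetical, and `SimpleNamespace` only stands in for whatever object type the real project passes.

```python
from types import SimpleNamespace


def task_name(taskpath):
    """Hypothetical helper: return the part of the last path component before the first '-'.

    Accepts either a dict with a 'dirname' key or an object with a
    `dirname` attribute, mirroring the two branches in the comment above.
    """
    if isinstance(taskpath, dict):
        dirname = taskpath["dirname"]
    else:
        dirname = taskpath.dirname
    return dirname.split("/")[-1].split("-")[0]


# Made-up inputs covering both cases.
as_dict = {"dirname": "/tmp/tasks/demo-001"}
as_obj = SimpleNamespace(dirname="/tmp/tasks/demo-001")
```

Both `task_name(as_dict)` and `task_name(as_obj)` return `"demo"` for these inputs.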

> Got the video part going and it works fine (tested on SMB1), but still no way to save.
>
> I saw that this is archived, but still, is...

> Hello,
> It is given in setup.py that we can use OpenAI gym version >= 0.9.1 and v 0.9.1 works fine for me.
> Thanks

Thanks!

> Hi!
>
> I've solved by adding:
>
> ```
> with torch.autocast("cuda"):
>     trainer.train()
> ```

This solves my problem, thanks!