
feature: apply lambda=1 at timesteps that have no value output

Open YuriCat opened this issue 3 years ago • 1 comments

When training involves steps where no value output exists, lambda must be set to 1 locally at those steps.
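A minimal sketch of the idea, not HandyRL's actual code: a backward TD(lambda) recursion that switches lambda to 1 wherever the value to bootstrap from is missing, so the target falls through to the next return instead of using a nonexistent estimate. The function name `lambda_returns`, the use of `None` to mark a missing value, and the convention `G_t = r_t + gamma * ((1 - lambda) * V_{t+1} + lambda * G_{t+1})` are all assumptions made for illustration.

```python
from typing import List, Optional


def lambda_returns(
    rewards: List[float],
    values: List[Optional[float]],
    bootstrap: float = 0.0,
    gamma: float = 1.0,
    lam: float = 0.7,
) -> List[float]:
    """Compute TD(lambda) targets, using lambda=1 where a value is missing.

    rewards[t]  : reward received after step t
    values[t]   : value estimate at step t, or None if there was no value output
    bootstrap   : value after the last step (0.0 for a terminal state)
    """
    targets = [0.0] * len(rewards)
    next_return = bootstrap  # G_{t+1}
    next_value: Optional[float] = bootstrap  # V_{t+1}
    for t in reversed(range(len(rewards))):
        if next_value is None:
            # No value output at step t+1: set lambda to 1 locally,
            # so the target relies only on the return from later steps.
            lam_t, v = 1.0, 0.0
        else:
            lam_t, v = lam, next_value
        targets[t] = rewards[t] + gamma * ((1.0 - lam_t) * v + lam_t * next_return)
        next_return = targets[t]
        next_value = values[t]
    return targets


# Step 1 has no value output, so the target at step 0 skips over it.
example = lambda_returns([0.0, 0.0, 1.0], [0.5, None, 0.2], lam=0.5)
```

In this sketch the placeholder `0.0` used for a missing value never affects the result, because its weight `(1 - lam_t)` is exactly zero at that step.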

I am not sure whether the VTrace computation is correct in this case.

YuriCat avatar Jan 31 '22 00:01 YuriCat

This is ugly and complicated, I think...

YuriCat avatar Jan 31 '22 09:01 YuriCat

VTrace appears to work well in experiments with Geister.

YuriCat avatar Jan 20 '23 06:01 YuriCat