
Questions about infinite bootstrap

Open geekyutao opened this issue 3 years ago • 2 comments

Hi, thank you for your code. I'm a little confused about the infinite bootstrap in https://github.com/MishaLaskin/curl/blob/8416d6e3869e38ca0e46fcbc54a2f784dc09d7fc/train.py#L269 . Will it be wrong when sampling a transition at the end of an episode (where the next_obs is the start observation of the next episode)? It seems you simply ignore this case.

geekyutao avatar Sep 18 '21 05:09 geekyutao

It seems that in DMControl there is no true terminal state; episodes end only because of the step limit. So infinite bootstrapping is allowed.
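To make the effect of the flag concrete, here is a minimal sketch of a one-step TD target with a done mask. The function name `td_target`, the discount of 0.99, and the numbers are illustrative assumptions, not code taken from the repo.

```python
def td_target(reward, next_value, done_bool, discount=0.99):
    """One-step TD target; bootstraps from next_value unless done_bool is 1.

    Hypothetical helper for illustration, not the repo's implementation.
    """
    not_done = 1.0 - done_bool
    return reward + not_done * discount * next_value


# Time-limit cutoff treated as non-terminal: the target still bootstraps.
timeout_target = td_target(reward=1.0, next_value=10.0, done_bool=0.0)

# True terminal state: the bootstrap term is dropped.
terminal_target = td_target(reward=1.0, next_value=10.0, done_bool=1.0)
```

If done_bool were set to 1 at the step limit, the last transition of every episode would be treated like `terminal_target`, even though the task itself has not ended.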

yueyang130 avatar Apr 26 '22 14:04 yueyang130

For @geekyutao's question, the point is that next_ob will never be the start observation of the next episode. At the last timestep of an episode, next_ob is the terminal state and done is true (note that done_bool is always false, whereas done is true at the max step). Only after that transition has been stored is the env reset and ob set to the start observation.
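A minimal sketch of that ordering, assuming the reset happens at the top of the next loop iteration as described above. `ToyEnv` and `collect` are hypothetical stand-ins (not DMControl and not the repo's replay buffer); observations are `(episode, step)` tuples so that episode boundaries are easy to see.

```python
class ToyEnv:
    """Hypothetical time-limited env; observations are (episode, step) tuples."""

    def __init__(self, max_episode_steps=3):
        self._max_episode_steps = max_episode_steps
        self._episode = -1
        self._t = 0

    def reset(self):
        self._episode += 1
        self._t = 0
        return (self._episode, self._t)

    def step(self, action):
        self._t += 1
        done = self._t >= self._max_episode_steps
        return (self._episode, self._t), 0.0, done, {}


def collect(env, num_steps):
    """Store (obs, action, reward, next_obs, done_bool) tuples in a list."""
    buffer = []
    done = True  # forces a reset on the first iteration
    obs = None
    episode_step = 0
    for _ in range(num_steps):
        if done:
            # The reset happens only after the previous transition was stored.
            obs = env.reset()
            done = False
            episode_step = 0
        action = 0  # placeholder action
        next_obs, reward, done, _ = env.step(action)
        # done_bool stays 0 when the episode ends because of the step limit.
        done_bool = 0.0 if episode_step + 1 == env._max_episode_steps else float(done)
        buffer.append((obs, action, reward, next_obs, done_bool))
        obs = next_obs
        episode_step += 1
    return buffer


transitions = collect(ToyEnv(max_episode_steps=3), num_steps=7)
```

In this sketch every stored pair has obs and next_obs from the same episode, because next_obs is saved together with obs before the reset replaces obs with a new start observation.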

yueyang130 avatar Aug 06 '22 03:08 yueyang130