Meta-RL-TwoStep-Task
A question about the TwoStepTask.step() function
Thanks for providing such readable code.
If I understand correctly, you compress stage 1 and stage 2 into a single trial. In that case, I think the next_state returned for a trial should be S1, because every trial begins at stage 1. However, in my simple simulation, the non-ep environment never returns S1 [1, 0, 0]:
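
For comparison, here is a minimal sketch of the behaviour I expected from one compressed trial. It uses a hypothetical toy environment (`ToyTwoStep`, which is not your `TwoStepTask`), and the transition and reward probabilities are assumed values chosen only for illustration:

```python
import random

# One-hot state encodings, following the [1, 0, 0] notation used for S1.
S1, S2, S3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]


class ToyTwoStep:
    """Hypothetical stand-in for illustration, NOT the repository's TwoStepTask."""

    def __init__(self, common_prob=0.8, seed=0):
        self.rng = random.Random(seed)
        self.common_prob = common_prob        # assumed P(common transition)
        self.reward_prob = {1: 0.9, 2: 0.1}   # assumed P(reward) in S2 and S3

    def step(self, action):
        # Stage 1: action 0 commonly leads to S2 (index 1), action 1 to S3 (index 2).
        common = self.rng.random() < self.common_prob
        stage2 = 1 + action if common else 2 - action
        # Stage 2: the reward is drawn in the state that was reached.
        reward = int(self.rng.random() < self.reward_prob[stage2])
        # Both stages are folded into this call, so the next trial starts at S1.
        return S1, reward, stage2


def observed_next_states(env, n_trials=1000, seed=1):
    """Collect every distinct next_state returned under random actions."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(n_trials):
        next_state, _, _ = env.step(rng.randrange(2))
        seen.add(tuple(next_state))
    return seen


print(observed_next_states(ToyTwoStep()))  # prints {(1, 0, 0)}
```

Under this reading, S1 is the only next_state a compressed trial can return, which is the opposite of what I observe.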

Is this a deliberate design for the "incremental" case? I am not sure I understand it the same way you do. Could you add brief definitions of "incremental" and "episodic" to your README?
Thank you very much!!