OpenAIGym.jl

Identical values for state and next state while iterating episode

Open baggepinnen opened this issue 5 years ago • 3 comments

Commit 28d5953 introduced a bug. While iterating over an episode, identical values are obtained for s and s1, e.g.

for (s,a,r,s1) in ep 
    ...
end


debug> s
2-element PyArray{Float64,1}:
 -0.4688434880873431   
  0.0003026530939645928

debug> s1
2-element PyArray{Float64,1}:
 -0.4688434880873431   
  0.0003026530939645928

baggepinnen avatar Apr 29 '19 12:04 baggepinnen

Yep, I've run into this issue in a different form in my own code. This is one of the reasons I hadn't merged #12; another important one is that I haven't had time to work on it or check it with the latest PyCall.

I think I'll get a chance to clean this up in the next few days. The likely fix will be to roll back #12, which could be done now as a stopgap if you wanted, @iblis17? Sorry I wasn't around to comment there in the last few days. From memory, when I last checked, the speed increase was smaller than it was when I originally made the PR. I also think returning a new array matches the Python API better and will be less surprising for users.

If rolling back does turn out to be significantly slower, then another path is to modify the Episode iterator to make a copy of the state for Gym environments, but I'm not certain PyArray will be deepcopy-able without some modifications.
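To illustrate the aliasing behind the identical s and s1 values, here is a minimal sketch in plain Julia. It assumes the environment overwrites a single state buffer in place; `ToyEnv`, `step!`, `aliased_pair`, and `copied_pair` are hypothetical names for illustration, not part of OpenAIGym.jl, and an ordinary `Vector{Float64}` stands in for `PyArray`, whose copy behavior is not verified here.

```julia
# Hypothetical stand-in for an environment that reuses one state buffer.
struct ToyEnv
    state::Vector{Float64}
end

# Advance the environment by overwriting the existing buffer in place
# (no new array is allocated), then return that same buffer.
function step!(env::ToyEnv)
    env.state .+= 1.0
    return env.state
end

# Aliased: s and s1 are the same array object, so they always compare equal.
function aliased_pair(env::ToyEnv)
    s = env.state
    s1 = step!(env)
    return s, s1
end

# Copying the state before stepping preserves the previous values.
function copied_pair(env::ToyEnv)
    s = copy(env.state)
    s1 = step!(env)
    return s, s1
end

s, s1 = aliased_pair(ToyEnv([0.0, 0.0]))
println(s === s1)   # true: one buffer behind both names

s, s1 = copied_pair(ToyEnv([0.0, 0.0]))
println(s == s1)    # false: s holds the pre-step values
```

The copy costs one allocation per step, which is the speed trade-off mentioned above; whether that cost matters would need to be measured against the real Gym environments.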

Anyway, it's important to me for things to be fast, so I'll try to make sure I can keep them that way. I'll hopefully have another look at #13 too when I work on this.

JobJob avatar Apr 30 '19 15:04 JobJob

I think I'll get a chance to clean this up in the next few days. The likely fix will be to rollback #12 - which could be done now as a stopgap if you wanted?

Rolling back is cheap in git. Just go for it.

iblislin avatar Apr 30 '19 15:04 iblislin

Something like git revert HEAD

iblislin avatar Apr 30 '19 15:04 iblislin