minerl

delayed inventory update on MineRLObtainDiamond-v0

Open SaminYeasar opened this issue 3 years ago • 3 comments

Hi, I think there is a delay in updating the inventory in MineRLObtainDiamond-v0. I saved the experiences (reward, frame, inventory, etc.) during evaluation. At timestep l=429 the reward is 1 while the log count is still zero; the log count only becomes 1 at timestep l=431. Similarly, I get a reward of 2 and the plank appears at the next timestep. In the human data, by contrast, the inventory updates immediately. (Screenshot from 2021-04-02 04-17-52)

SaminYeasar • Apr 02 '21 09:04

I can confirm the mismatch between reward and inventory changes. I'm attaching a notebook that reproduces it with a fixed seed and hardcoded actions: test reward vs inventory change by frame.ipynb.zip

KarolisRam • Apr 02 '21 10:04
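For anyone who wants to measure the offset from saved per-step data rather than by eye, here is a minimal sketch. It assumes you have already logged one reward and one inventory dict per step; `inventory_lag` is a hypothetical helper written for this thread, not part of MineRL, and the toy data below is made up to mirror the two-step gap reported above.

```python
from typing import Dict, List, Optional, Tuple


def inventory_lag(
    rewards: List[float],
    inventories: List[Dict[str, int]],
    item: str,
) -> List[Tuple[int, Optional[int]]]:
    """Hypothetical helper: for every step with a positive reward, report how
    many steps later the count of `item` first rises above its value just
    before the reward step. The lag is None if no increase is ever observed."""
    results = []
    for t, reward in enumerate(rewards):
        if reward <= 0:
            continue
        # Count of the item on the step before the reward (0 at the first step).
        baseline = inventories[t - 1].get(item, 0) if t > 0 else 0
        seen_at = next(
            (u for u in range(t, len(inventories))
             if inventories[u].get(item, 0) > baseline),
            None,
        )
        results.append((t, None if seen_at is None else seen_at - t))
    return results


# Made-up log: the reward arrives at step 2, the log shows up at step 4.
toy_rewards = [0.0, 0.0, 1.0, 0.0, 0.0]
toy_inventories = [{"log": 0}, {"log": 0}, {"log": 0}, {"log": 0}, {"log": 1}]
print(inventory_lag(toy_rewards, toy_inventories, "log"))
```

A lag of 0 would mean the reward and the inventory change land on the same step. Note that the helper attributes every positive reward to the one item you pass in, so it is only meaningful on a stretch of the episode where that item is the one being rewarded.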

This is true for all environments -- in fact we explicitly account for this delay in BASALT tests now.

@brandonhoughton explained the reason for this delay to me at one point. Could you repeat it here in case someone wants to work on this in the future?

Also, it would probably be a good doc change to at least document or warn about this problem in the observation space docs: https://minerl.readthedocs.io/en/latest/environments/handlers.html#visual-observations-pov-third-person

shwang • Jul 12 '21 22:07

Some handlers work on information from the server, so for each step we do a client tick, then a server tick. If you need the server to tell you something, it won't be processed until the next tick. That being said, I'd expect that you can claim things in your own inventory on pickup right away. Likely this is a result of the reward handler being on the server: it sees the change before you do and sends it to the client via a custom message. Then, on the next step, the change is visible to the client.

brandonhoughton • Jul 13 '21 17:07
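To make the tick ordering described above concrete, here is a toy simulation. It is only a sketch of one possible reading of that explanation, not MineRL or Malmo code: `ToyServer`, `ToyClient`, and `run` are hypothetical names, and the model assumes that the reward comes back from the server tick of the current step while the inventory observation is read from client state, which only catches up when the client processes the server's message on its next tick.

```python
from collections import deque


class ToyServer:
    """Hypothetical stand-in for the server side: owns the authoritative
    inventory and computes the reward."""

    def __init__(self):
        self.inventory = {"log": 0}
        self.outbox = deque()  # messages waiting to be read by the client

    def tick(self, picked_up: bool) -> float:
        reward = 0.0
        if picked_up:
            self.inventory["log"] += 1
            reward = 1.0
            # Assumption: the change reaches the client via a queued message.
            self.outbox.append(dict(self.inventory))
        return reward


class ToyClient:
    """Hypothetical stand-in for the client side: builds the observation
    from its own copy of the inventory."""

    def __init__(self):
        self.inventory = {"log": 0}

    def tick(self, inbox: deque) -> dict:
        # Apply whatever the server sent during earlier ticks.
        while inbox:
            self.inventory = inbox.popleft()
        return dict(self.inventory)


def run(pickup_step: int, n_steps: int):
    server, client = ToyServer(), ToyClient()
    trace = []
    for t in range(n_steps):
        observed = client.tick(server.outbox)               # client tick first
        reward = server.tick(picked_up=(t == pickup_step))  # then server tick
        trace.append((t, reward, observed["log"]))
    return trace


for step, reward, logs in run(pickup_step=1, n_steps=4):
    print(f"step={step} reward={reward} observed_logs={logs}")
```

In this toy model the reward shows up on the pickup step and the observed log count rises one step later, which is the same ordering reported in the issue. The real pipeline may add further delay, so the exact number of steps here should not be read as a claim about MineRL itself.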