Steven H. Wang

Results 15 comments of Steven H. Wang

Would be resolved by https://github.com/HumanCompatibleAI/imitation/pull/221

We now have an SB3-like `.train(total_timesteps)` interface for GAIL, AIRL, DAgger, and BC. We also have Sacred scripts for each of these algorithms, though they are not unified into a...

I don't recall the precise meaning of "air" and "none" in the dataset and the action space. I think that "none" means to leave the mainhand unmodified, and "air" means...

Thanks for checking the refactor! `git bisect` could be helpful here for pinning down the commit that caused this bug (set https://github.com/minerllabs/minerl/commit/8cc204360fdae39d18808facb21873baaed98b1c "bad" and v3.4.2 "good") if someone has time...
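For anyone picking this up, the bisect described above could look roughly like the sketch below. The function name `bisect_minerl_regression` and the test command passed to it are hypothetical; only the "bad" commit hash and the "good" tag come from the comment. The commands are wrapped in a function so nothing runs until it is called from inside a clone of the repository.

```shell
# Sketch only: call this from inside a clone of minerllabs/minerl.
# The arguments are a hypothetical command that exits 0 when the bug is
# absent and with a code from 1 to 127 (other than 125) when it is present.
bisect_minerl_regression() {
    git bisect start || return 1
    # Commit reported as showing the bug.
    git bisect bad 8cc204360fdae39d18808facb21873baaed98b1c || return 1
    # Release reported as working.
    git bisect good v3.4.2 || return 1
    # Let git check out and test each candidate commit automatically.
    git bisect run "$@"
    # Remember the result, then restore the original checkout.
    status=$?
    git bisect reset
    return "$status"
}
```

Usage would be something like `bisect_minerl_regression ./repro.sh`, where `repro.sh` is a placeholder for whatever script reproduces the bug. Without such a script, the same three setup commands work interactively, marking each checked-out commit with `git bisect good` or `git bisect bad` by hand.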

@sjn4048: "none" is used as a no-op equip action, when the player in our dataset doesn't change the item currently in hand. "air" is used when the player changes the...

This is true for all environments -- in fact we [explicitly account for this delay in BASALT tests](https://github.com/minerllabs/minerl/blob/097b36b3db335ca6e2ea2ed44590d343ee27c732/tests/item_variants_integration_test.py#L159-L164) now. @brandonhoughton told me the reason why we have this delay...

Could you post the name of this demonstration? (Of the form `adjective-plant-animal-#-start_timestamp-end_timestamp`)

Thanks for pointing this out -- I'll keep an eye out for these problems when I process and upload the next version of the dataset (I expect that this will...

> It could be that the human placed a block and got reward a second time for mining a log

That makes sense, since what we are actually checking to...