Wrong info in Gym tutorial

Open njustesen opened this issue 2 years ago • 6 comments

I don't believe this is true in https://njustesen.github.io/botbowl/gym.html:

"The action space is discrete, the action is an int in the range 0 <= action_idx < len(action_mask)."

njustesen, Jun 10 '22 11:06

Can you elaborate? From what I can tell it's correct.

mrbermell, Jun 10 '22 11:06

If I can choose between blocking player A or player B, the sentence in the tutorial says that my action integer can be 0 or 1. But since blocking is a spatial action, the integer has to be at least the number of non-spatial actions to get past the check if action_idx < len(self.env_conf.simple_action_types): in _compute_action(self, action_idx: Optional[int], flip: Optional[bool] = None) -> List[Optional[Action]].

Instead, the integer is in the range [0, len(action_space)), which is implicit.

njustesen, Jun 10 '22 11:06

Unless len(action_mask)=len(action_space) but that's just confusing, right?

njustesen, Jun 10 '22 11:06

Thanks for the clarification, I see your point and agree. We should explain how the action mask works here. I'll see what I can do!

mrbermell, Jun 10 '22 11:06

How about something along these lines?

Action space

In botbowl's core engine all actions have a type, and some types also require a position. Read more about actions in the scripted bot tutorials. The gym environment unrolls the spatial dimension into a one-dimensional action space (see picture below). This makes it easy to use state-of-the-art algorithms, but it's worth keeping in mind that the action space is orders of magnitude larger than in many of the standard reinforcement learning benchmarks.
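To make the unrolling concrete, here is a minimal sketch of how a flat index could map back to an action type and a position. It is not botbowl's actual implementation: the type names, the board size, and the ordering (non-spatial types first, then one block of board squares per positional type) are assumptions chosen for illustration. Only the first check mirrors the action_idx < len(simple_action_types) test quoted earlier in this thread.

```python
from typing import Optional, Tuple

# Hypothetical values for illustration; the real ones come from the env config.
SIMPLE_ACTION_TYPES = ["START_GAME", "END_TURN", "USE_REROLL"]
POSITIONAL_ACTION_TYPES = ["MOVE", "BLOCK", "PASS"]
BOARD_WIDTH, BOARD_HEIGHT = 28, 17


def action_space_size() -> int:
    """Number of flat indices: simple types plus one slot per (type, square)."""
    squares = BOARD_WIDTH * BOARD_HEIGHT
    return len(SIMPLE_ACTION_TYPES) + len(POSITIONAL_ACTION_TYPES) * squares


def decode(action_idx: int) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Map a flat index to (action type, (x, y) position or None)."""
    if not 0 <= action_idx < action_space_size():
        raise ValueError(f"action_idx {action_idx} is outside the action space")
    # Non-spatial actions occupy the first indices (assumed layout).
    if action_idx < len(SIMPLE_ACTION_TYPES):
        return SIMPLE_ACTION_TYPES[action_idx], None
    # The remaining indices are split into one block of squares per positional type.
    spatial_idx = action_idx - len(SIMPLE_ACTION_TYPES)
    type_idx, square = divmod(spatial_idx, BOARD_WIDTH * BOARD_HEIGHT)
    y, x = divmod(square, BOARD_WIDTH)
    return POSITIONAL_ACTION_TYPES[type_idx], (x, y)
```

With these made-up sizes the space already holds 3 + 3 * 28 * 17 = 1431 indices, which gives a feel for how quickly the positional types inflate the action space.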

The action passed to the environment is an integer, let's say action_idx = 352. You call env.step(action_idx) to step the environment with your action. But not all actions are legal at all times; this is where the action mask comes in. The action_mask is a vector of booleans that marks the legal actions: to check whether action_idx is legal, simply check that action_mask[action_idx] is true.

image
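A short sketch of how the mask might be used in practice: list the legal indices, then pick one of them at random. The mask here is a hand-built toy list rather than one returned by the environment, and legal_actions and sample_legal_action are hypothetical helper names, not part of botbowl.

```python
import random
from typing import List, Sequence


def legal_actions(action_mask: Sequence[bool]) -> List[int]:
    """Return every index whose mask entry is true."""
    return [idx for idx, legal in enumerate(action_mask) if legal]


def sample_legal_action(action_mask: Sequence[bool], rng: random.Random) -> int:
    """Pick one legal action index uniformly at random."""
    legal = legal_actions(action_mask)
    if not legal:
        raise ValueError("action_mask contains no legal actions")
    return rng.choice(legal)


# Toy mask of arbitrary length with three legal actions (illustrative only).
action_mask = [False] * 1000
for idx in (1, 352, 900):
    action_mask[idx] = True

action_idx = sample_legal_action(action_mask, random.Random(0))
assert action_mask[action_idx]  # the sampled action is legal by construction
```

The sampled action_idx is what you would then pass to env.step(action_idx). Sampling only from the true entries of the mask avoids ever stepping the environment with an illegal action.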

mrbermell, Jun 13 '22 21:06

This is better!

If the scripted bot tutorials contain important info about the action space, I think it should be included here. What are the paragraphs you are thinking of?

njustesen, Jun 21 '22 11:06