DeepQLearning.jl
Action masking feature (legal actions)
POMDPs.jl supports state-dependent action spaces.
However, DeepQLearning.jl always uses the full action space.
That's because the solve function enumerates the actions once and hands them to the policy, which then uses that fixed list everywhere.
Can you think of a way to support action masking with the current implementation?
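
To make the request concrete, here is a minimal sketch of what masking could look like at action-selection time. It uses plain Julia only: `masked_greedy_action` is a hypothetical helper, not part of DeepQLearning.jl, and it assumes the policy has one Q-value per entry of the full action list. For an MDP, the `legal_actions` argument could presumably come from the state-dependent `actions(m, s)` in POMDPs.jl.

```julia
# Hypothetical helper (not part of DeepQLearning.jl): choose the action with
# the highest Q-value among the actions that are legal in the current state.
function masked_greedy_action(qvalues::AbstractVector{<:Real},
                              all_actions::AbstractVector,
                              legal_actions)
    length(qvalues) == length(all_actions) ||
        throw(ArgumentError("need exactly one Q-value per action"))
    best_action = nothing
    best_q = -Inf
    for (q, a) in zip(qvalues, all_actions)
        # Skip actions that are not legal in the current state.
        a in legal_actions || continue
        if best_action === nothing || q > best_q
            best_action, best_q = a, q
        end
    end
    best_action === nothing &&
        throw(ArgumentError("no legal action available"))
    return best_action
end

# Example: :right has the highest Q-value overall but is masked out.
all_actions = [:left, :right, :up, :down]
qvalues = [0.1, 0.9, 0.4, -0.2]
masked_greedy_action(qvalues, all_actions, [:left, :up])  # returns :up
```

The same mask would probably also be needed when taking the max over next-state Q-values in the training target, otherwise the network still bootstraps from illegal actions.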