POMDPs.jl
POMDPs.jl copied to clipboard
GenerativeBeliefMDP isterminal is only true when the support is entirely terminal
This isn't an error per say, but isterminal in GenerativeBeliefMDP is only true when the entire support is terminal, but when @gen is called, we sample single state from the belief, meaning isterminal(bmdp, b = false), but @gen(b, a) may fail because a terminal state is sampled. Is there a good reason to define "terminal belief" this way? I agree that checking if any terminal state is in the support might be overly reactive. This becomes a problem, for instance, when we want to generate a bunch of samples and early exit on a "terminal belief."
@FlyingWorkshop did you close this because it was resolved? If it can throw an error when it samples a terminal state, we should probably fix that. I think @gen should only sample non-terminal states.
@zsunberg, I closed it b/c I took a look at the source code again and realized that while isterminal(::GenerativeBeliefMDP, b) is true when all support states are terminal, the @gen block actually checks is the sampled state is terminal:isterminal(bmdp.pomdp, rand(rng, b)). There is an issue w/ sampling terminal states, but I think this is expected behavior b/c there's a overwritable handler for this.
I was testing this on TMaze and there would be an error when we fell back on the default handler b/c reward(::TMaze, ::TerminalState, ::Int) isn't defined. I ended up just implemented the reward function, so don't think this is still an issue.
@zsunberg was this resolved with https://github.com/JuliaPOMDP/POMDPs.jl/pull/559?
Well, "resolved" is an optimistic word, but yes, I think the behavior was changed in a positive way, so I am closing this :)