POMDPs.jl

Support Hooks?

Open NeroBlackstone opened this issue 2 years ago • 7 comments

Is there any counterpart of Hooks in POMDPs.jl? As far as I know, the only way to get information out of a POMDPs.jl solver is to copy and edit the solver source code (that's my current solution). It's hard to plot if we don't have hooks.

I don't know if the maintainer of POMDPs.jl also thinks this is a useful feature.

And I wonder how to implement it.

NeroBlackstone avatar Jun 12 '23 18:06 NeroBlackstone

So far, we have left the question of accessing additional information up to solver writers. The solve_info and action_info functions in POMDPTools sometimes output additional information.
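To picture the "info" convention mentioned above: a solve call can return its usual result together with a second value holding diagnostics. The following is a self-contained toy sketch, not POMDPTools code. `ToySolver`, `toy_solve`, and `toy_solve_info` are hypothetical names, and the contents of the info value are an assumption; each real solver decides what, if anything, it reports.

```julia
# Hypothetical toy solver; not part of POMDPs.jl or POMDPTools.
struct ToySolver
    n_iterations::Int
end

# Returns a "policy" (here just a number) plus a NamedTuple of diagnostics.
function toy_solve_info(solver::ToySolver, target::Float64)
    estimate = 0.0
    errors = Float64[]
    for _ in 1:solver.n_iterations
        estimate += 0.5 * (target - estimate)  # move halfway toward the target
        push!(errors, abs(target - estimate))  # record the remaining error
    end
    return estimate, (errors = errors, iterations = solver.n_iterations)
end

# The plain version simply discards the info.
toy_solve(solver::ToySolver, target::Float64) = first(toy_solve_info(solver, target))

policy, info = toy_solve_info(ToySolver(3), 8.0)
```

The limitation of this pattern is that `info` only becomes available once the call returns, so it cannot drive a live plot.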

Can you describe what information you are trying to get out of which solver? Perhaps we can think about generalizing from that point.

zsunberg avatar Jun 12 '23 18:06 zsunberg

Thanks for your reply. For example, the solvers in TabularTDLearning.jl evaluate the trained policy every eval_every episodes. We want to get the average reward of the trained policy's trajectories while the algorithm is running.

NeroBlackstone avatar Jun 12 '23 18:06 NeroBlackstone

The render function is a similar concept, but it applies to the problem rather than the solver.

NeroBlackstone avatar Jun 12 '23 19:06 NeroBlackstone

I think it's best to try adding some hooks to that particular package and then generalize from that if we can find a way.

In general, one challenge is that we have fairly different types of solvers in the POMDPs.jl ecosystem:

  1. Offline optimization solvers like SARSOP
  2. Online tree search solvers like POMCP and DESPOT
  3. Reinforcement learning solvers like tabular TD learning

The hooks for these different types might be very different.

zsunberg avatar Jun 16 '23 18:06 zsunberg

Yes, I agree with @zsunberg. Since different types of solvers exist, maybe we will never have a unified solution.

But when I was going to bed last night, I got some inspiration.

We could pass a callback function to the solve function, like this:

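function solve(f::Function, solver::QLearningSolver, mdp::MDP)
    # ... training code ...

    # report progress to the caller
    f(episode, average_reward)

    # ... more training code ...
end

# do-block syntax passes the block as the first argument `f`
solve(qsolver, mdp) do episode, average_reward
    # collect data!
end
# plot!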

Unfortunately, it's a breaking change. Maybe we could make the callback an optional argument.
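One way to read "optional" is a keyword argument whose default does nothing, so that existing calls keep working. Here is a minimal, self-contained sketch under that assumption. `ToyLearner` and `toy_train` are hypothetical names and not TabularTDLearning.jl API, and the "reward" is a stand-in number rather than a real policy evaluation.

```julia
# Hypothetical toy learner; names are illustrative, not TabularTDLearning.jl API.
struct ToyLearner
    n_episodes::Int
    eval_every::Int
end

# `callback` defaults to a no-op, so calls that omit it behave as before.
function toy_train(learner::ToyLearner; callback = (episode, average_reward) -> nothing)
    total = 0.0
    for episode in 1:learner.n_episodes
        total += Float64(episode)  # stand-in for the reward earned in this episode
        if episode % learner.eval_every == 0
            # report the running average every `eval_every` episodes
            callback(episode, total / episode)
        end
    end
    return total
end

history = Tuple{Int,Float64}[]
with_hook = toy_train(ToyLearner(6, 2); callback = (ep, r) -> push!(history, (ep, r)))
without_hook = toy_train(ToyLearner(6, 2))  # still valid without a callback
```

Because the callback runs inside the loop, `history` fills up while training is still in progress, which is what a live plot would need.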

But I still think at least we could propose a "hook convention".

NeroBlackstone avatar Aug 09 '23 15:08 NeroBlackstone

Maybe we could return the data directly from solve_info(), but unlike with a callback function, we could not get the data while the solver is running.

The callback function is useful for long-running algorithms, since we can update plots to visually check the algorithm's status.

I still don't know what the best practice is. Since no solver implements this yet, maybe we could implement one to show the right way.

Feel free to close this issue. :)

NeroBlackstone avatar Aug 09 '23 15:08 NeroBlackstone

> But when I was going to bed last night, I got some inspiration.

@NeroBlackstone, thanks for using your bedtime thoughts to try to improve this package! :)

In general, I like this proposal, but there is one hard question related to the diversity of solvers: What arguments should be passed to the callback?

I also think that a better first step would be to add callbacks to individual solvers as solver options. For instance, it could be used like this:

solver = NativeSARSOP.SARSOPSolver() do tree, alphas
    # print statistics from the tree or something
end
solve(solver, m)
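For the do-block form above to work on a constructor, the solver type needs a constructor method that takes the function as its first positional argument. The following is a self-contained sketch of that pattern with a toy solver. `CallbackSolver` and `solve_with_option` are hypothetical, and the `(i, value)` callback arguments are this toy's own choice, not what NativeSARSOP would pass.

```julia
# Hypothetical toy solver that stores its callback as an option.
struct CallbackSolver{F}
    callback::F
    n_iterations::Int
end

# Taking the function first lets callers use do-block syntax on the constructor.
CallbackSolver(f::Function; n_iterations::Int = 3) = CallbackSolver(f, n_iterations)

function solve_with_option(solver::CallbackSolver)
    value = 0
    for i in 1:solver.n_iterations
        value += i
        solver.callback(i, value)  # each solver picks its own argument list
    end
    return value
end

seen = Int[]
solver = CallbackSolver(n_iterations = 4) do i, value
    push!(seen, value)
end
result = solve_with_option(solver)
```

Keeping the callback inside the solver object leaves the solve signature untouched, so each solver can choose callback arguments that fit its own internals.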

One more note:

> Unfortunately, it's a breaking change.

I don't think this is actually a breaking change, because we could define solve(f, solver, m) = solve(solver, m) as a fallback.
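The fallback idea can be illustrated with multiple dispatch: a generic three-argument method ignores the callback, and a solver that supports callbacks adds a more specific method. This is a self-contained sketch with hypothetical types. `hooked_solve` stands in for `solve`, and nothing here is defined by POMDPs.jl.

```julia
# Hypothetical types and function; `hooked_solve` stands in for `solve`.
abstract type AbstractToySolver end
struct PlainSolver <: AbstractToySolver end      # knows nothing about callbacks
struct ReportingSolver <: AbstractToySolver end  # opts in to callbacks

# Existing two-argument methods.
hooked_solve(::PlainSolver, m) = sum(m)
hooked_solve(s::ReportingSolver, m) = hooked_solve((args...) -> nothing, s, m)

# Generic fallback: solvers without callback support ignore `f`.
hooked_solve(f::Function, solver::AbstractToySolver, m) = hooked_solve(solver, m)

# A solver that supports callbacks adds a more specific three-argument method.
function hooked_solve(f::Function, ::ReportingSolver, m)
    total = zero(eltype(m))
    for (i, x) in enumerate(m)
        total += x
        f(i, total)  # report the running total after each element
    end
    return total
end

records = Tuple{Int,Int}[]
a = hooked_solve(PlainSolver(), [1, 2, 3]) do i, total
    push!(records, (i, total))  # never called: the fallback drops the callback
end
n_after_plain = length(records)
b = hooked_solve(ReportingSolver(), [1, 2, 3]) do i, total
    push!(records, (i, total))
end
```

One trade-off of such a fallback is that a callback passed to a solver without support is silently ignored, so a caller may not notice that no data is being collected.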

zsunberg avatar Aug 10 '23 02:08 zsunberg

Closing for now, since this seems to be a solver-specific issue.

zsunberg avatar Jun 12 '24 22:06 zsunberg