POMDPs.jl Support Hooks ?

Is there any counterpart of Hooks in POMDPs.jl? As far as I know, the only way to get information out of a POMDPs.jl solver is to copy and edit the solver source code (that is my current solution). It's hard to plot anything if we don't have hooks.

I don't know if the maintainer of POMDPs.jl also thinks this is a useful feature.

And I wonder how to implement it.

Jun 12 '23 18:06 NeroBlackstone

So far, we have left the question of accessing additional information up to solver writers. The solve_info and action_info functions in POMDPTools sometimes output additional information.

Can you describe what information you are trying to get out of which solver? Perhaps we can think about generalizing from that point.

Jun 12 '23 18:06 zsunberg

Thanks for your reply. For example, the solvers in TabularTDLearning.jl evaluate the trained policy every eval_every episodes. We want to get the average reward of the trained policy's trajectories while the algorithm is running.

Jun 12 '23 18:06 NeroBlackstone

The render function is a similar concept, but it is for the problem rather than the solver.

Jun 12 '23 19:06 NeroBlackstone

I think it's best to try adding some hooks to that particular package and then generalize from that if we can find a way.

In general, one challenge is that we have fairly different types of solvers in the POMDPs.jl ecosystem:

Offline optimization solvers like SARSOP
Online tree search solvers like POMCP and DESPOT
Reinforcement learning solvers like tabular TD learning

The hooks for these different types might be very different.

Jun 16 '23 18:06 zsunberg

Yes, I agree with @zsunberg. Since different types of solvers exist, we may never have a unified solution.

But when I was going to bed last night, I got some inspiration.

We could pass a callback function to the solve function.

Like this:

function solve(f::Function, solver::QLearningSolver, mdp::MDP)
    # ... training code ...

    f(episode, average_reward)  # call the hook with the current statistics

    # ... more training code ...
end

solve(qsolver, mdp) do episode, average_reward
    # collect data!
end
# plot!

Unfortunately, it's a breaking change. Maybe we could define the callback as an optional argument.

But I still think we could at least propose a "hook convention".
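
To make the idea concrete, here is a minimal, self-contained sketch of the callback-first pattern. It does not use POMDPs.jl or TabularTDLearning.jl: ToySolver, toy_solve, and the placeholder reward are all hypothetical names invented for illustration, and a real solver would report its own evaluation result instead.

```julia
# Hypothetical stand-in for a learning solver; not a type from any POMDPs.jl package.
struct ToySolver
    n_episodes::Int
    eval_every::Int
end

# Taking the callback as the first argument enables the do-block syntax.
function toy_solve(f::Function, solver::ToySolver)
    total = 0.0
    for episode in 1:solver.n_episodes
        total += Float64(episode)          # placeholder for the reward of one episode
        if episode % solver.eval_every == 0
            f(episode, total / episode)    # report the running average to the hook
        end
    end
    return total
end

# Collect (episode, average_reward) pairs while the "solver" runs.
history = Tuple{Int,Float64}[]
toy_solve(ToySolver(10, 5)) do episode, average_reward
    push!(history, (episode, average_reward))
end
```

After the call, history holds one entry per evaluation point, which could then be handed to a plotting package.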

Aug 09 '23 15:08 NeroBlackstone

Maybe we could return the data directly from solve_info(), but unlike a callback function, that would not let us get data while the solver is running.

The callback function is useful for long-running algorithms, because we can update plots to check the algorithm's status visually.

I still don't know what the best practice is. Since no solver implements this yet, maybe we could implement one to show the right way.

Feel free to close this issue. :)

Aug 09 '23 15:08 NeroBlackstone

"But when I was going to bed last night, I got some inspiration."

@NeroBlackstone , thanks for using your bedtime thoughts to try to improve this package! :)

In general, I like this proposal, but there is one hard question related to the diversity of solvers: What arguments should be passed to the callback?

I also think that a better first step would be to add callbacks to individual solvers as solver options. For instance, it could be used like this:

solver = NativeSARSOP.SARSOPSolver() do tree, alphas
    # print statistics from the tree or something
end
solve(solver, m)
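
As a rough sketch of how a solver could accept a callback as an option, assuming nothing about how NativeSARSOP is actually implemented: ToyIterativeSolver, run_toy_solver, and the (iteration, value) arguments below are hypothetical. The do-block works because Julia passes the block as the first positional argument of the constructor.

```julia
# Hypothetical solver type that stores the callback as one of its options.
struct ToyIterativeSolver{F}
    callback::F
    max_iterations::Int
end

# Function-first constructor, so the solver can be built with a do-block.
ToyIterativeSolver(f::Function; max_iterations::Int=3) =
    ToyIterativeSolver(f, max_iterations)

# Without a callback, fall back to a function that does nothing.
ToyIterativeSolver(; max_iterations::Int=3) =
    ToyIterativeSolver((args...) -> nothing, max_iterations)

function run_toy_solver(solver::ToyIterativeSolver)
    value = 0.0
    for iteration in 1:solver.max_iterations
        value += 1.0                        # placeholder for one solver iteration
        solver.callback(iteration, value)   # solver-specific arguments
    end
    return value
end

seen = Float64[]
solver = ToyIterativeSolver(max_iterations=3) do iteration, value
    push!(seen, value)
end
result = run_toy_solver(solver)
```

With this design each solver decides which arguments its callback receives, which sidesteps the question of a common signature.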

One more note:

"Unfortunately, it's a breaking change."

I don't think this is actually a breaking change, because we could define solve(f, solver, m) = solve(solver, m) as a fallback.
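
A small sketch of that fallback idea, using made-up types rather than the real POMDPs.jl Solver hierarchy (AbstractToySolver, PlainSolver, HookedSolver, and run_solver are all hypothetical): solvers that know nothing about callbacks keep working, and solvers that opt in add a more specific method.

```julia
abstract type AbstractToySolver end

struct PlainSolver <: AbstractToySolver end    # knows nothing about callbacks
struct HookedSolver <: AbstractToySolver end   # opts in to callbacks

run_solver(::PlainSolver) = "plain policy"

# Generic fallback: ignore the callback and call the existing method.
run_solver(f::Function, solver::AbstractToySolver) = run_solver(solver)

# A solver that supports hooks defines a more specific method.
function run_solver(f::Function, ::HookedSolver)
    f(1, 0.5)                                  # placeholder statistics
    return "hooked policy"
end

# Calling it without a callback still works by passing a no-op.
run_solver(solver::HookedSolver) = run_solver((args...) -> nothing, solver)

calls = Tuple{Int,Float64}[]
p1 = run_solver(PlainSolver()) do episode, reward
    push!(calls, (episode, reward))            # never reached for PlainSolver
end
p2 = run_solver(HookedSolver()) do episode, reward
    push!(calls, (episode, reward))
end
```

One consequence of the fallback is that a callback passed to a solver without hook support is silently ignored, so a convention might also want a way to warn about that.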

Aug 10 '23 02:08 zsunberg

Closing for now, since this seems to be a solver-specific issue.

Jun 12 '24 22:06 zsunberg
