Zachary Sunberg comments

Results: 213 comments by Zachary Sunberg

Product of distributions

If anyone wants to take a stab at making the distributions in POMDPTools compatible with Distributions.jl, it would be a nice contribution.

I tried implementing this in #495 but it did not go very well. Distributions.jl is very focused on numerical distributions (see #490), so `Product([SparseCat([:a,:b,:c], [0.5, 0.2, 0.3]), BoolDistribution(1.0)])` has no...
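To make the idea in this comment concrete, here is a minimal sketch of what a product of independent distributions over non-numerical values could look like when written by hand. It uses neither Distributions.jl nor POMDPTools: `ToyCategorical`, `IndependentProduct`, `sample`, and `prob` are hypothetical names introduced only for illustration, and the second component merely stands in for something like `BoolDistribution(1.0)`.

```julia
using Random

# Hypothetical categorical distribution over arbitrary (possibly non-numerical) values.
struct ToyCategorical{T}
    vals::Vector{T}
    probs::Vector{Float64}
end

# Draw one value by walking the cumulative probabilities.
function sample(rng::AbstractRNG, d::ToyCategorical)
    u = rand(rng)
    c = 0.0
    for (v, p) in zip(d.vals, d.probs)
        c += p
        u < c && return v
    end
    return last(d.vals)  # guard against floating-point round-off
end

# Probability mass assigned to the value x (0.0 if x is not in the support).
prob(d::ToyCategorical, x) = sum((p for (v, p) in zip(d.vals, d.probs) if v == x); init=0.0)

# Hypothetical product of independent components; a sample is a tuple.
struct IndependentProduct{T<:Tuple}
    components::T
end

sample(rng::AbstractRNG, d::IndependentProduct) = map(c -> sample(rng, c), d.components)

# Independence means the joint mass is the product of the component masses.
prob(d::IndependentProduct, x) = prod(prob(c, xi) for (c, xi) in zip(d.components, x))

rng = MersenneTwister(1)
d = IndependentProduct((
    ToyCategorical([:a, :b, :c], [0.5, 0.2, 0.3]),
    ToyCategorical([true, false], [1.0, 0.0]),  # stand-in for a Boolean distribution with p = 1
))
x = sample(rng, d)          # a (Symbol, Bool) tuple
p = prob(d, (:a, true))     # 0.5 * 1.0
```

Nothing here requires the values to be numbers, which is the property the comment says is hard to get from Distributions.jl.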

ExplorationPolicies don't work with stepthrough

Yeah, the exploration policy interface was designed for reinforcement learning solvers where the exploration should be decayed, but it is not really a `Policy`. I would not object to a...

`DictPolicy` and special Q-learning based on key-value storage

@NeroBlackstone sorry that we never responded to this! This is actually something that people often want to do. If you're still interested in contributing it, I think we can integrate...

Support Hooks ?

So far, we have left the question of accessing additional information up to solver writers. The `solve_info` and `action_info` functions in POMDPTools sometimes output additional information. Can you describe what...

Support Hooks ?

I think it's best to try adding some hooks to that particular package and then generalize from that if we can find a way. In general one challenge is that...

Support Hooks ?

> But when I was going to bed last night, I got some inspirations.

@NeroBlackstone, thanks for using your bedtime thoughts to try to improve this package! :) In...

Support Hooks ?

closing for now since this seems to be a solver-specific issue.

Make all POMDPTools distributions into Distributions.jl distributions

I started implementing this, but it is quite unsatisfying. Distributions.jl is very focused on numerical distributions; for example, `rand` actually falls back to `quantile` in many cases and has a...

`action` interface of exploration policies

I don't remember the details, but they are designed to change as the total number of calls (k) increases, i.e., to decay. I think they are used in things like...
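As a rough illustration of the decay described in this comment, the sketch below shows an epsilon-greedy choice whose exploration probability shrinks as the call count `k` grows. It is a standalone example, not the POMDPTools implementation: `LinearDecay` and `explore` are hypothetical names, and the linear schedule is just one assumed form of decay.

```julia
using Random

# Hypothetical linear schedule: the value moves from `start` to `stop` over `steps` calls.
struct LinearDecay
    start::Float64
    stop::Float64
    steps::Int
end

# Value of the schedule after k calls; clamped so it stays at `stop` once k >= steps.
function (d::LinearDecay)(k::Integer)
    frac = clamp(k / d.steps, 0.0, 1.0)
    return d.start + frac * (d.stop - d.start)
end

# Epsilon-greedy choice: with probability schedule(k) pick a random action,
# otherwise return the greedy action supplied by the caller.
function explore(rng::AbstractRNG, schedule::LinearDecay, actions, greedy_action, k::Integer)
    if rand(rng) < schedule(k)
        return rand(rng, actions)
    else
        return greedy_action
    end
end

rng = MersenneTwister(1)
schedule = LinearDecay(1.0, 0.1, 100)
actions = [:left, :right, :stay]

a_early = explore(rng, schedule, actions, :stay, 0)     # schedule(0) is 1.0: always random
a_late  = explore(rng, schedule, actions, :stay, 1000)  # schedule(1000) is about 0.1: mostly greedy
```

Because the result depends on `k` as well as on the state, a function like this needs the call count passed in on every call, which a plain state-to-action policy does not provide.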