Oscar Dowson
> to add a deterministic upper bound? I have no plans to do so. Note that the formulation SDDP.jl uses is non-standard in the literature. We can handle integer variables,...
A few points on the computational effort:

* You're right that we could apply the upper bound in a limited subset of cases. But in my experience that leads to...
@zidanessf: I just spoke to some folks who are hoping to have a postdoc work on SDDP.jl. They mentioned the same upper bound :) I now understand how it works...
Interesting. Can you email me code to reproduce? If you have 140 scenarios, using multi-cut is likely to slow things down, so I would recommend just using single-cut.
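For reference, a minimal sketch of switching cut types via the `cut_type` keyword of `SDDP.train`. The toy model here is illustrative only, not the 140-scenario model in question:

```Julia
using SDDP, GLPK

# A trivial two-stage model, just to have something to train on.
model = SDDP.LinearPolicyGraph(
    stages = 2,
    lower_bound = 0.0,
    optimizer = GLPK.Optimizer,
) do sp, t
    @variable(sp, x >= 0, SDDP.State, initial_value = 1.0)
    @stageobjective(sp, x.out)
end

# Single-cut aggregates the scenario subproblems into one cut per iteration;
# multi-cut adds one cut per scenario, which can be slow with many scenarios.
SDDP.train(model; cut_type = SDDP.SINGLE_CUT, iteration_limit = 5)
```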
Thanks for sending the code. Here is a minimal reproducer.

```Julia
using SDDP
using GLPK

model = SDDP.MarkovianPolicyGraph(
    transition_matrices = [[0.5 0.5], [1.0 0.0; 0.4 0.6]],
    lower_bound = 0.0,
    optimizer...
```
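(The reproducer is cut off after `optimizer` above. A hypothetical completion, just to make the snippet runnable; the stage problem here is a placeholder, not the original one:)

```Julia
using SDDP, GLPK

model = SDDP.MarkovianPolicyGraph(
    transition_matrices = [[0.5 0.5], [1.0 0.0; 0.4 0.6]],
    lower_bound = 0.0,
    optimizer = GLPK.Optimizer,
) do sp, node
    t, markov_state = node  # nodes are (stage, markov_state) tuples
    @variable(sp, x >= 0, SDDP.State, initial_value = 0.0)
    @stageobjective(sp, markov_state * x.out)
end
```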
@haoxiangyang89 #265 had some of the ideas we discussed the other day.
@guyichen09 (cc @mortondp) a use-case for Bayesian Optimization/multi-armed bandit stuff: each iteration of SDDP.jl, we have three choices for our duality handler (selection sketch after this list):

* ContinuousConicDuality (a.k.a. continuous Benders)
* StrengthenedConicDuality (a.k.a....
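(The list above is truncated. For reference, the handlers named so far are chosen per training run like this; a sketch assuming `model` is an existing policy graph with integer variables:)

```Julia
using SDDP

# Cuts from the continuous relaxation (a.k.a. continuous Benders):
SDDP.train(model; duality_handler = SDDP.ContinuousConicDuality())

# Strengthened Benders cuts:
# SDDP.train(model; duality_handler = SDDP.StrengthenedConicDuality())
```

The open question is how to pick between the handlers adaptively at each iteration, rather than fixing one up front.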
Yeah my suggestion for the reward would be:

```Julia
function reward(log::Vector{Log})
    d_bound = abs(log[end].bound - log[end-1].bound)
    dt = log[end].time - log[end-1].time
    return d_bound / dt
end
```

And the `log`...
We could use this Bayesian learning in a few other contexts:

### Trajectory depth in cyclic graphs

The default sampling scheme for the forward pass has some tunable parameters:

https://github.com/odow/SDDP.jl/blob/493b48e256a88cb1cc898497d01a4d3245d3ffd0/src/plugins/sampling_schemes.jl#L38-L43

...
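As a concrete sketch of those knobs (values illustrative, not recommendations; `model` assumed to be a cyclic policy graph):

```Julia
using SDDP

# Tunable parameters of the default forward-pass sampling scheme.
sampling_scheme = SDDP.InSampleMonteCarlo(
    max_depth = 30,                  # truncate trajectories in cyclic graphs
    terminate_on_cycle = false,      # don't stop when a node is revisited
    terminate_on_dummy_leaf = false, # don't stop w.p. 1 - discount factor
)
SDDP.train(model; sampling_scheme = sampling_scheme)
```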
@guyichen09 here's a potentially open question: is there any work on learning when the reward is non-stationary? In our case, we know that the reward for each arm will tend...
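One standard trick from the non-stationary bandit literature (discounted estimates, as in discounted UCB) is to down-weight old observations so that an arm whose reward decays over time loses its advantage. A minimal sketch, not SDDP.jl API:

```Julia
# Exponentially discounted reward estimate for one bandit arm.
mutable struct DiscountedArm
    value::Float64   # discounted sum of observed rewards
    weight::Float64  # discounted count of observations
end

DiscountedArm() = DiscountedArm(0.0, 0.0)

# `gamma` is the discount factor; gamma = 1.0 recovers the plain average.
function update!(arm::DiscountedArm, reward::Float64; gamma::Float64 = 0.9)
    arm.value = gamma * arm.value + reward
    arm.weight = gamma * arm.weight + 1.0
    return arm
end

# Return +Inf for unexplored arms so they get tried at least once.
estimate(arm::DiscountedArm) = arm.weight > 0 ? arm.value / arm.weight : Inf
```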