blackdrops
How to run BlackDROPS with GP-MI
Have you implemented the BlackDROPS with GP-MI algorithm that was proposed in your ICRA 2018 paper in this repo? I am very interested in that idea and wondering how to replicate your experimental results.
First of all, thank you for your interest.
Have you implemented the BlackDROPS with GP-MI algorithm that was proposed in your ICRA 2018 paper in this repo?
Yes and no. Yes because we have already implemented the GP-MI optimization procedure (see here), but no because we haven't included an example usage.
Let me create an example in the cartpole scenario (which is easy and fast to do), and I will ping you. Give me until the 15th of June, as I have a few urgent things to finish before then.
That would be great! I'll check the optimization procedure before you release an example. Thank you very much!
@urnotmeeto sorry for being almost a month late, but lots of things came up.
I have created a branch with an example of using GP-MI with the cartpole: gp_mi_example. Compile everything and then you can run the example with:
./deps/limbo/build/exp/blackdrops/src/classic_control/cartpole_mi_simu -m -1 -r 1 -n 10 -b 5 -e 1 -u -s
Replace simu with graphic to visualize what's going on. I am still debugging it for possible errors/mistakes (there is something fishy going on in the initial optimization), but it should be a good enough starting point for you and I do not want you to wait more.
The process starts by optimizing the mean model first (with an initial guess for the optimization variables of the mean that is different from the actual system), and then proceeds with the normal loop of optimizing the model and then the policy given the model. Beware that both the model optimization and the policy optimization will take much longer than usual, because we call the mean function every time we query the model.
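To make the order of operations concrete, here is a rough toy sketch of that control flow in plain Python. It is not the blackdrops or limbo API: every name in it (real_system, mean_model, optimize_model, optimize_policy, run) is hypothetical, the "system" is a made-up 1D point mass, the GP residual is left out entirely so the model is only the parametric mean, and grid search stands in for the real optimizers. It only illustrates "fit the mean first, then alternate model and policy optimization".

```python
DT = 0.1  # integration step of the toy system (assumption for this sketch)

def real_system(x, u):
    # The "actual" system. Its mass (2.0) is unknown to the learner.
    return x + DT * u / 2.0

def mean_model(x, u, mass):
    # Parametric mean function (think: a simulator) with one tunable parameter.
    return x + DT * u / mass

def rollout(step, gain, horizon=20):
    # Run the linear policy u = -gain * x; return the transitions and the cost.
    x, data, cost = 1.0, [], 0.0
    for _ in range(horizon):
        u = -gain * x
        x_next = step(x, u)
        data.append((x, u, x_next))
        cost += x_next ** 2
        x = x_next
    return data, cost

def optimize_model(data, candidates):
    # Pick the mean parameter with the smallest one-step prediction error.
    # (The real procedure also handles the GP part, which is omitted here.)
    def error(mass):
        return sum((mean_model(x, u, mass) - xn) ** 2 for x, u, xn in data)
    return min(candidates, key=error)

def optimize_policy(mass, candidates):
    # Every model query calls the mean function, hence the extra run time.
    def model(x, u):
        return mean_model(x, u, mass)
    return min(candidates, key=lambda gain: rollout(model, gain)[1])

def run(episodes=3):
    masses = [0.5 + 0.1 * i for i in range(36)]   # candidate mean parameters
    gains = [float(g) for g in range(31)]         # candidate policy parameters
    gain = 1.0                                    # arbitrary initial policy
    data, cost = rollout(real_system, gain)       # initial episode on the system
    mass = optimize_model(data, masses)           # 1) optimize the mean model first
    for _ in range(episodes):
        mass = optimize_model(data, masses)       # 2) optimize the model
        gain = optimize_policy(mass, gains)       # 3) optimize the policy given the model
        new_data, cost = rollout(real_system, gain)  # 4) try the policy on the system
        data += new_data
    return mass, gain, cost

if __name__ == "__main__":
    print(run())
```

In this toy the policy search is cheap, but it already shows the cost pattern mentioned above: each candidate policy needs a full model rollout, and each step of that rollout evaluates the mean function.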
@costashatz Great! I'll check it. Thank you!