Exploration access to environment for forward simulation
Hi,
I stumbled upon the following potential improvement. I am hacking around it right now, but it would be great to have a proper solution. MCTS and other forward-simulation techniques need access to clones of the environment in order to execute rollouts. Currently there is no way to pass the instantiated environment to the Exploration Policies so that they can perform the forward search.
For the purpose of illustration, this is the hack:
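# Make sure the graph (and with it the environments) has been instantiated
graph_manager.verify_graph_was_created()
# Grab the first instantiated environment
env = graph_manager.environments[0]
# Hand it to the exploration policy (set_environment is part of the hack)
graph_manager.top_level_manager.agents['agent'].exploration_policy.set_environment(env)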
Being able to pass the instantiated environment, as suggested in https://github.com/NervanaSystems/coach/issues/212, would be a potential workaround, although not a full solution.
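To make the use case concrete, here is a minimal, self-contained sketch of what an exploration policy could do once it holds a reference to the environment. Nothing in it is Coach API: `ToyEnv`, `RolloutExplorationPolicy`, and `get_action` are hypothetical names for illustration, and `set_environment` only mirrors the hacked method above. It also assumes the environment can be cloned with `copy.deepcopy`, which may not hold for environments that wrap an emulator or other external process.

```python
import copy
import random


class ToyEnv:
    """Hypothetical stand-in for an instantiated environment (not a Coach class)."""

    def __init__(self, target=3, horizon=5):
        self.position = 0
        self.steps = 0
        self.target = target
        self.horizon = horizon

    def step(self, action):
        # Action 1 moves right, any other action moves left.
        self.position += 1 if action == 1 else -1
        self.steps += 1
        done = self.position == self.target or self.steps >= self.horizon
        reward = 1.0 if self.position == self.target else 0.0
        return reward, done


class RolloutExplorationPolicy:
    """Hypothetical policy that scores actions by random rollouts on clones."""

    def __init__(self, actions, n_rollouts=20, seed=0):
        self.actions = actions
        self.n_rollouts = n_rollouts
        self.rng = random.Random(seed)
        self.env = None

    def set_environment(self, env):
        # Mirrors the hack: keep a reference to the live environment.
        self.env = env

    def _rollout(self, first_action):
        # Simulate on a clone so the real environment is never stepped.
        sim = copy.deepcopy(self.env)
        reward, done = sim.step(first_action)
        total = reward
        while not done:
            reward, done = sim.step(self.rng.choice(self.actions))
            total += reward
        return total

    def get_action(self):
        if self.env is None:
            raise RuntimeError("set_environment() must be called first")
        # Average return of n_rollouts random rollouts per candidate action.
        scores = {
            a: sum(self._rollout(a) for _ in range(self.n_rollouts)) / self.n_rollouts
            for a in self.actions
        }
        return max(scores, key=scores.get)


env = ToyEnv()
policy = RolloutExplorationPolicy(actions=[0, 1])
policy.set_environment(env)
action = policy.get_action()
```

The point of the sketch is that the policy only needs a handle to the live environment plus some way to clone it; the search itself (flat random rollouts here, MCTS in the real use case) never touches the original instance.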