Exploration access to environment for forward simulation

Open redknightlois opened this issue 5 years ago • 4 comments

Hi,

I stumbled upon a potential improvement. I am hacking around it right now, but it would be great to have a proper solution. MCTS and other forward-simulation techniques need access to clones of the environment to execute rollouts, yet there is no way to pass the instantiated environment to the exploration policies so that they can perform the forward search.

For the purpose of illustration, this is the hack:

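# The graph must exist before its environments can be accessed.
graph_manager.verify_graph_was_created()
# Grab the first instantiated environment.
env = graph_manager.environments[0]
# set_environment() is a custom method added to the exploration policy for this hack.
graph_manager.top_level_manager.agents['agent'].exploration_policy.set_environment(env)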

Being able to pass the instantiated environment, as suggested in https://github.com/NervanaSystems/coach/issues/212, would be a potential workaround, although not a solution.
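
To make the request concrete, here is a minimal, self-contained sketch of what the policy side of the hack above might look like. Everything in it is an assumption for illustration: `ToyChainEnv` and `RolloutExplorationPolicy` are hypothetical classes that do not come from Coach, `set_environment` only mirrors the hook name used in the hack, and `get_action()` does not follow Coach's exploration policy signature. It uses plain random rollouts on `copy.deepcopy` clones rather than a full MCTS, which assumes the environment can be deep-copied at all.

```python
import copy
import random


class ToyChainEnv:
    """Hypothetical stand-in for an instantiated environment (not a Coach class)."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def step(self, action):
        # Action 1 moves right, any other action moves left (bounded at 0).
        self.position = max(0, self.position + (1 if action == 1 else -1))
        done = self.position >= self.length
        reward = 1.0 if done else 0.0
        return self.position, reward, done


class RolloutExplorationPolicy:
    """Hypothetical policy that needs the live environment for forward search."""

    def __init__(self, num_actions, rollouts_per_action=20, horizon=10, seed=0):
        self.num_actions = num_actions
        self.rollouts_per_action = rollouts_per_action
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.env = None

    def set_environment(self, env):
        # Same hook name as in the hack above; assumed, not part of Coach's API.
        self.env = env

    def _rollout(self, first_action):
        # Clone first so the real environment's state is never modified.
        sim = copy.deepcopy(self.env)
        _, total, done = sim.step(first_action)
        for _ in range(self.horizon - 1):
            if done:
                break
            _, reward, done = sim.step(self.rng.randrange(self.num_actions))
            total += reward
        return total

    def get_action(self):
        if self.env is None:
            raise RuntimeError("set_environment() must be called first")
        # Score each candidate first action by its summed rollout returns.
        scores = [
            sum(self._rollout(a) for _ in range(self.rollouts_per_action))
            for a in range(self.num_actions)
        ]
        return max(range(self.num_actions), key=scores.__getitem__)
```

The point of the sketch is that the policy only ever steps clones, so the training environment stays untouched. A proper solution would presumably hand the policy a way to obtain such clones instead of reaching into `graph_manager` after the fact.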

redknightlois · Mar 04 '19 02:03