gflownet

The proxy should not be copied into each environment instance

Open • alexhernandezgarcia opened this issue 1 year ago • 1 comment

Currently, the proxy is set as an attribute of the environments, and the base environment implements the methods proxy2reward() and reward2proxy(), which determine the conversion between proxy outputs and rewards. The environment also implements the methods reward() and reward_batch(), which call the proxy and the conversion methods. This is probably not ideal for various reasons, one being that the proxy ends up copied into each environment instance.

I no longer see a good reason to keep the proxy and these methods within the environment. It seems both possible and a good idea to completely detach the environment from the proxy. Some proxies need information from the environment, which is currently passed via the call to Env.setup_proxy(), which in turn calls the proxy's setup() method. But this could just be done elsewhere.
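To make the idea concrete, here is a minimal sketch of what "done elsewhere" could look like. It is not code from the repository: ToyEnv, ToyProxy, build() and the length attribute are hypothetical names made up for illustration, and the only thing taken from the description above is that the proxy has a setup() method that may need information from the environment.

```python
class ToyEnv:
    """Hypothetical environment: it holds no reference to the proxy."""

    def __init__(self, length):
        self.length = length


class ToyProxy:
    """Hypothetical proxy that needs one piece of information from the env."""

    def __init__(self):
        self.length = None

    def setup(self, env=None):
        # Read what the proxy needs from the environment, once.
        if env is not None:
            self.length = env.length

    def __call__(self, states):
        # Toy score: mean of each state (a list of numbers).
        return [sum(state) / self.length for state in states]


def build(n_envs):
    """Hypothetical top-level setup code, outside both env and proxy."""
    proxy = ToyProxy()
    envs = [ToyEnv(length=4) for _ in range(n_envs)]
    # The proxy is configured once from a single environment,
    # instead of each environment calling setup_proxy() on its own copy.
    proxy.setup(envs[0])
    return proxy, envs


proxy, envs = build(3)
```

With this layout there is a single proxy object regardless of how many environments exist, and the environments never need to know about it.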

Now, in terms of alternatives, I am not completely settled on what the best option would be. In particular, where should the methods that convert between proxy and reward go?

  • In the (base) proxy?
  • In the GFlowNet agent?
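For reference, the first option could look roughly like the sketch below. This is only an illustration under assumptions: the exponential conversion with a beta parameter, the rewards() helper and the SumProxy subclass are hypothetical and chosen just to have something invertible; only the method names proxy2reward() and reward2proxy() come from the current code.

```python
import math


class BaseProxy:
    """Hypothetical base proxy that owns the proxy <-> reward conversion."""

    def __init__(self, beta=1.0):
        # beta is an assumed parameter of the illustrative conversion.
        self.beta = beta

    def __call__(self, states):
        raise NotImplementedError

    def proxy2reward(self, proxy_values):
        # Illustrative conversion: reward = exp(beta * proxy).
        return [math.exp(self.beta * value) for value in proxy_values]

    def reward2proxy(self, rewards):
        # Inverse of proxy2reward for the conversion above.
        return [math.log(reward) / self.beta for reward in rewards]

    def rewards(self, states):
        # Call the proxy, then convert its outputs to rewards.
        return self.proxy2reward(self(states))


class SumProxy(BaseProxy):
    """Toy proxy: the score of a state is the sum of its entries."""

    def __call__(self, states):
        return [float(sum(state)) for state in states]


proxy = SumProxy(beta=0.5)
rewards = proxy.rewards([[0, 0], [1, 1]])
recovered = proxy.reward2proxy(rewards)
```

Under the second option, the same two conversion methods would sit on the agent instead and the proxy would only return raw scores. The trade-off is mostly about who owns the conversion parameters.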

alexhernandezgarcia, Feb 19 '24 04:02