
Consistent resolving of interpolations to random resolver values

Open odelalleau opened this issue 3 years ago • 3 comments

Is your feature request related to a problem? Please describe.

I have a use case where I want to sample many different configs from a "template" config where some values are obtained by custom resolvers generating random values. Something like:

import copy
import random
from omegaconf import OmegaConf

# Custom resolver returning a uniform random sample from [a, b)
OmegaConf.register_new_resolver("uniform", lambda a, b: random.uniform(a, b))

base_cfg = OmegaConf.create({"foo": "${uniform:0,1}", "bar": "${uniform:0,1}"})
for i in range(10):
    sampled_cfg = copy.deepcopy(base_cfg)
    OmegaConf.resolve(sampled_cfg)  # replace interpolations with concrete values
    do_something_with(sampled_cfg)

This works well, except that a problem arises if I instead want, for instance, to set "bar": "${foo}", i.e., I would like bar to be equal to foo in the sampled config. This won't actually be the case, because foo and bar are resolved independently: when resolving bar, a new value is sampled for foo.
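For concreteness, a minimal sketch of the problematic case (re-using the uniform resolver registered above; do_something_with is just a placeholder):

base_cfg = OmegaConf.create({"foo": "${uniform:0,1}", "bar": "${foo}"})
sampled_cfg = copy.deepcopy(base_cfg)
OmegaConf.resolve(sampled_cfg)
# Desired: sampled_cfg.bar == sampled_cfg.foo.
# Reported behavior: bar's interpolation to foo triggers a fresh call to the
# uniform resolver, so the two values generally differ.
print(sampled_cfg.foo, sampled_cfg.bar)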

Describe the solution you'd like

I think multiple solutions are possible, including:

  • resolve() (and similar operations that resolve the config, like to_object()) could ensure that once a variable x has been resolved to some value at any step during resolution, every interpolation to x re-uses that same value (instead of re-computing it)
  • Or they could expose a new flag to opt into this behavior, if we don't want to change the current one (a purely hypothetical sketch of such flags follows this list)
  • We could introduce a new cache mechanism through a flag which, when set, makes a config "remember" the values of its keys and not resolve them again on access. This could be more generally useful to speed up access to interpolated fields (we could skip the parsing step entirely)
  • The resolver cache mechanism could be extended to allow a "node-level" cache instead of the current "config-level" one (see below for details)
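Purely for illustration, the flag-based variants might surface roughly as follows; none of these flags or values exist in OmegaConf today, and the names are hypothetical:

# Purely hypothetical sketches (not part of the current OmegaConf API):
#
# 1. A resolve() flag that re-uses a value once it has been computed:
#        OmegaConf.resolve(sampled_cfg, reuse_resolved_values=True)
#
# 2. A node-level variant of the existing resolver cache:
#        OmegaConf.register_new_resolver("uniform", sample_uniform, use_cache="node")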

Describe alternatives you've considered

My current workaround is to hardcode this logic in code rather than in the config (i.e., set sampled_cfg.bar = sampled_cfg.foo).
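In code, this workaround amounts to the following (continuing the loop from the first snippet):

sampled_cfg = copy.deepcopy(base_cfg)
OmegaConf.resolve(sampled_cfg)
sampled_cfg.bar = sampled_cfg.foo  # enforce the equality in code, not in the config
do_something_with(sampled_cfg)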

Another solution would be to enable the resolver cache for uniform. I think this would be a viable approach if there were an easy way to make the cache node-based. With the current cache mechanics, sampling many independent random values is a bit cumbersome: to get different values for foo and bar, I would need to add a dummy argument to my resolver, e.g. {"foo": "${uniform:0,1,foo}", "bar": "${uniform:0,1,bar}"}.
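For reference, a sketch of that cached-resolver alternative using the existing use_cache option of register_new_resolver (a separate resolver name is used here only to avoid clashing with the one registered above; the dummy last argument makes the two calls distinct so the config-level cache returns independent samples):

OmegaConf.register_new_resolver(
    "uniform_cached",
    lambda a, b, _key=None: random.uniform(a, b),
    use_cache=True,  # cache is per config, keyed on the resolver's arguments
)
cfg = OmegaConf.create(
    {"foo": "${uniform_cached:0,1,foo}", "bar": "${uniform_cached:0,1,bar}"}
)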

odelalleau avatar May 20 '21 22:05 odelalleau

The best solution is probably to enable a node-level cache somehow; however, I think it's premature to open up the resolver caching API. I would like any change in that direction to come with a design document that attempts to provide a complete solution to the problem space, not something that patches a specific use case. A few things to think about:

  • Are there any other caching strategies besides config level caching and node level caching we should think about?
  • Do we want the caching strategy to be controlled on a per-resolver basis or a per-node basis?

By the way: at this point I am not taking any new large feature requests for 2.1; we already have plenty of high-priority things there, and I have not yet evaluated the existing backlog.

omry avatar May 20 '21 23:05 omry

> At this point I am not taking any new large feature requests for 2.1

Oh, no worries, I didn't intend to push for it in 2.1. What you said makes sense.

odelalleau avatar May 21 '21 02:05 odelalleau

> > At this point I am not taking any new large feature requests for 2.1
>
> Oh, no worries, I didn't intend to push for it in 2.1. What you said makes sense.

Sorry, I meant 2.2. We haven't even started planning it yet, and there is already a big backlog. Some of the things I am planning are better support for containers and union support, both pretty big.

omry avatar May 21 '21 02:05 omry