kedro icon indicating copy to clipboard operation
kedro copied to clipboard

Enable dotted keys in params passed to `kedro run`

Open deepyaman opened this issue 1 year ago • 6 comments

Description

Development notes

Checklist

  • [x] Read the contributing guidelines
  • [x] Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • [ ] Updated the documentation to reflect the code changes
  • [ ] Added a description of this change in the RELEASE.md file
  • [ ] Added tests to cover my changes

deepyaman avatar Aug 08 '22 00:08 deepyaman

Hmmm, interesting idea. Given that we're planning to switch to OmegaConf, I would like to understand how this works for them? See https://omegaconf.readthedocs.io/en/2.2_branch/usage.html#from-a-dot-list and https://hydra.cc/docs/advanced/override_grammar/basic/ (won't be possible immediately, but might help understand possible future capabilities better)

If it's something that OmegaConf doesn't support then we should be very hesitant about adding this feature ourselves. Ideally we just want to delegate all the params overriding logic to them and not modify it ourselves.

Let me just note this down here since I always need to remind myself of the discussion and hunt down the links - https://github.com/quantumblacklabs/private-kedro/issues/965 (beginning of Alternatives & Tradeoffs section) and https://github.com/kedro-org/kedro/issues/399#issuecomment-638087120 for original discussion of . being used for namespacing params.

antonymilne avatar Aug 08 '22 10:08 antonymilne

After a quick search, it looks like this is not possible in OmegaConf: https://stackoverflow.com/questions/64864518/how-to-escape-the-when-using-dot-notation-commandline-arguments

Closely related, they had a proposal for "compact key support" in https://github.com/omry/omegaconf/issues/152, which was then aborted in https://github.com/omry/omegaconf/issues/332. We should try to align the way we handle parameters with this as closely as possible I think. Possibly this would mean that we no longer encourage a parameter with name a.b to be accessed through params:a.b at all? Not sure.

antonymilne avatar Aug 08 '22 10:08 antonymilne

Closely related, they had a proposal for "compact key support" in omry/omegaconf#152, which was then aborted in omry/omegaconf#332. We should try to align the way we handle parameters with this as closely as possible I think. Possibly this would mean that we no longer encourage a parameter with name a.b to be accessed through params:a.b at all? Not sure.

Compact keys are a bit different; a compact key has the same semantic meaning as the fully-expanded, nested version, whereas here we're talking about enabling a key name to have dots in it.

https://github.com/omry/omegaconf/issues/152#issuecomment-654332429 is more relevant to the use case that inspired this.

deepyaman avatar Aug 08 '22 12:08 deepyaman

Yeah, understood. I think the problem is that in Kedro we currently sort of enable compact keys.

# nested dictionary
a:
  b: value

# compact key
c.d: value2

Here it doesn't actually form the nested dictionary in the compact key case:

In [19]: catalog.load("parameters")
Out[19]: {'a': {'b': 'value'}, 'c.d': 'value2'}

The problem (or benefit, depending on your view) is that both of these will work the same way as a node input, i.e. both params:a.b and params:c.d are possible. No need for any quote marks. Since this is the main way that parameters are consumed, I think to the user it's not obvious that we're not fully enabling compact keys here (i.e. we don't make the nested c dictionary).

In fact, as per https://github.com/quantumblacklabs/private-kedro/issues/965, there's a possible clash here since who knows which takes priority, i.e. if I change c.d to a.b above and do catalog.load("params:a.b"), would you end up with value or value2? It's value2 but no particularly good reason for this.

Hence, as things stand in Kedro it's not clear to me a priori what kedro run --params:a.b:value3 should actually do. As you have noticed, it sets things as a nested dictionary, and if we allow your syntax then it does indeed make things less ambiguous so that you can also set the value of a dotted key. But should we then allow params:"a.b" as well as params:a.b to resolve the above ambiguity?

Overall, it seems like kind of an edge case which we haven't solved for the params:... loading syntax. Given we're planning to move to OmegaConf, I would tend to just let them take this lead on this and not do anything custom on the Kedro side. Possibly this would mean that params:c.d no longer works in future because it would be looking for the c dictionary.

antonymilne avatar Aug 08 '22 14:08 antonymilne

I agree with @AntonyMilneQB that we should probably hold off on this change and follow the OmegaConf way of doing this.

merelcht avatar Aug 08 '22 15:08 merelcht

@AntonyMilneQB @MerelTheisenQB Sorry for not responding sooner, but I've been thinking about it--isn't this more aligned with OmegaConf (as in, there's a way to avoid fully unrolling parameters, rather than making compact keys, for which there is no OmegaConf equivalent, the default)? I feel like this is a minor change that makes things less ambiguous (as pointed out by Antony) until we move to OmegaConf.

The need basically arises from a use case where was passing Kubernetes metadata as parameters.

deepyaman avatar Sep 16 '22 10:09 deepyaman

@merelcht Just to confirm, this is no longer relevant because of moving to OmegaConf, right? If so, I'll close it.

deepyaman avatar Feb 06 '23 13:02 deepyaman

Sorry for the late reply @deepyaman ! Yes I think we should just leverage OmegaConf for this kind of stuff and follow their way of doing things. Additionally, I think this problem fits well with the parameters research https://github.com/kedro-org/kedro/issues/2240 Maybe you can add some comments there on what you use parameters for and how?

merelcht avatar Feb 20 '23 12:02 merelcht