kedro
kedro copied to clipboard
Enable dotted keys in params passed to `kedro run`
Description
Development notes
Checklist
- [x] Read the contributing guidelines
- [x] Opened this PR as a 'Draft Pull Request' if it is work-in-progress
- [ ] Updated the documentation to reflect the code changes
- [ ] Added a description of this change in the
RELEASE.md
file - [ ] Added tests to cover my changes
Hmmm, interesting idea. Given that we're planning to switch to OmegaConf
, I would like to understand how this works for them? See https://omegaconf.readthedocs.io/en/2.2_branch/usage.html#from-a-dot-list and https://hydra.cc/docs/advanced/override_grammar/basic/ (won't be possible immediately, but might help understand possible future capabilities better)
If it's something that OmegaConf
doesn't support then we should be very hesitant about adding this feature ourselves. Ideally we just want to delegate all the params overriding logic to them and not modify it ourselves.
Let me just note this down here since I always need to remind myself of the discussion and hunt down the links - https://github.com/quantumblacklabs/private-kedro/issues/965 (beginning of Alternatives & Tradeoffs section) and https://github.com/kedro-org/kedro/issues/399#issuecomment-638087120 for original discussion of .
being used for namespacing params.
After a quick search, it looks like this is not possible in OmegaConf: https://stackoverflow.com/questions/64864518/how-to-escape-the-when-using-dot-notation-commandline-arguments
Closely related, they had a proposal for "compact key support" in https://github.com/omry/omegaconf/issues/152, which was then aborted in https://github.com/omry/omegaconf/issues/332. We should try to align the way we handle parameters with this as closely as possible I think. Possibly this would mean that we no longer encourage a parameter with name a.b
to be accessed through params:a.b
at all? Not sure.
Closely related, they had a proposal for "compact key support" in omry/omegaconf#152, which was then aborted in omry/omegaconf#332. We should try to align the way we handle parameters with this as closely as possible I think. Possibly this would mean that we no longer encourage a parameter with name
a.b
to be accessed throughparams:a.b
at all? Not sure.
Compact keys are a bit different; a compact key has the same semantic meaning as the fully-expanded, nested version, whereas here we're talking about enabling a key name to have dots in it.
https://github.com/omry/omegaconf/issues/152#issuecomment-654332429 is more relevant to the use case that inspired this.
Yeah, understood. I think the problem is that in Kedro we currently sort of enable compact keys.
# nested dictionary
a:
b: value
# compact key
c.d: value2
Here it doesn't actually form the nested dictionary in the compact key case:
In [19]: catalog.load("parameters")
Out[19]: {'a': {'b': 'value'}, 'c.d': 'value2'}
The problem (or benefit, depending on your view) is that both of these will work the same way as a node input, i.e. both params:a.b
and params:c.d
are possible. No need for any quote marks. Since this is the main way that parameters are consumed, I think to the user it's not obvious that we're not fully enabling compact keys here (i.e. we don't make the nested c
dictionary).
In fact, as per https://github.com/quantumblacklabs/private-kedro/issues/965, there's a possible clash here since who knows which takes priority, i.e. if I change c.d
to a.b
above and do catalog.load("params:a.b")
, would you end up with value
or value2
? It's value2
but no particularly good reason for this.
Hence, as things stand in Kedro it's not clear to me a priori what kedro run --params:a.b:value3
should actually do. As you have noticed, it sets things as a nested dictionary, and if we allow your syntax then it does indeed make things less ambiguous so that you can also set the value of a dotted key. But should we then allow params:"a.b"
as well as params:a.b
to resolve the above ambiguity?
Overall, it seems like kind of an edge case which we haven't solved for the params:...
loading syntax. Given we're planning to move to OmegaConf, I would tend to just let them take this lead on this and not do anything custom on the Kedro side. Possibly this would mean that params:c.d
no longer works in future because it would be looking for the c
dictionary.
I agree with @AntonyMilneQB that we should probably hold off on this change and follow the OmegaConf
way of doing this.
@AntonyMilneQB @MerelTheisenQB Sorry for not responding sooner, but I've been thinking about it--isn't this more aligned with OmegaConf (as in, there's a way to avoid fully unrolling parameters, rather than making compact keys, for which there is no OmegaConf equivalent, the default)? I feel like this is a minor change that makes things less ambiguous (as pointed out by Antony) until we move to OmegaConf.
The need basically arises from a use case where was passing Kubernetes metadata as parameters.
@merelcht Just to confirm, this is no longer relevant because of moving to OmegaConf, right? If so, I'll close it.
Sorry for the late reply @deepyaman ! Yes I think we should just leverage OmegaConf
for this kind of stuff and follow their way of doing things. Additionally, I think this problem fits well with the parameters research https://github.com/kedro-org/kedro/issues/2240 Maybe you can add some comments there on what you use parameters for and how?