nwb-conversion-tools icon indicating copy to clipboard operation
nwb-conversion-tools copied to clipboard

Refactor `dict_deep_update` function.

Open h-mayorquin opened this issue 2 years ago • 1 comments

While working on PR #481 me and @CodyCBakerPhD were discussing about the poor variable name in this very important function. Right now, we have:

https://github.com/catalystneuro/nwb-conversion-tools/blob/5e66722723420856de70e80b8e465f3bb20ebfbd/src/nwb_conversion_tools/utils/dict.py#L121-L124

Which go against the design principle that variable names should be as meaningful as possible so they work as documentation themselves.

Right now PR #481 has modified the internal names of the function but we are leaving the keywords arguments untouched as that might break some script that rely on this. I am writing this issue as a record of intent for modifying this in the future, propose some break down of the tasks to do, write some other concerns that I have and general discussion.

Proposal for to-do:

  • [ ] Do a PR changing the keyword arguments but also adding logic so the previous keyword arguments still work but throw a deprecation warning.
  • [ ] The auxiliary function append_replace_dict_in_list suffers from the sames issues and should be refactored as well.
  • [ ] Moreover, the append_replace_dict_in_list seems to be actually doing more than one thing, so it should be break down for the different actions that it actually performs.
  • [ ] Modify the unit_tests so they reflect some use cases with metadata or source_schemas so they work better as functional documentation.

I am also in general concerned about all the functionality that we are adding to the function. We have many parameters that change the behavior of the function:

  1. list_dict_deep_update (A boolean) for example, controls whether lists are appended/extended or just overwritten.
  2. compare_key (a string that equals "name") controls where properties in our metadata have to match so they are updated.

I checked in our library and as far as I could find we don't really use the function without the defaults. My impression is that we are keeping this for backwards compatibility. Given that the inclusion of this parameters account for a large chunk of the complexity of this function and makes a central part of our code base more difficult to understand and modify I think we should also start thinking on dropping them (with the usual warning period of course).

Am I missing something?

h-mayorquin avatar Apr 26 '22 17:04 h-mayorquin