datatree icon indicating copy to clipboard operation
datatree copied to clipboard

Setting node name keeps tree linkage

Open etienneschalk opened this issue 1 year ago • 3 comments

Small bugfix. I added a test reproducing the example in #309, and tests are not broken

Note: In _post_attach, I set the private _name instead of setting the name. Indeed, it can lead to infinite recursions when a setter is used inside of a class. I assume the _post_attach method is like a "runtime assertion"

  • [x] Closes #309
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [ ] ~~New functions/methods are listed in api.rst~~
  • [x] Changes are summarized in docs/source/whats-new.rst

etienneschalk avatar Feb 08 '24 21:02 etienneschalk

I remained conservative and forbid the renaming of a node to None if it has a parent

Edit: according to a comment in the datatree design document mentioned in https://github.com/pydata/xarray/issues/8747, section 4) Are nodes named? in practice, only the root node would remain anonmyous. Hence it makes sense to only authorize a node without a parent (root) to be renamed to None?

See my comment https://github.com/xarray-contrib/datatree/issues/309#issuecomment-1949940406

Edit: build is failing because of the new change in printing sizes of DataArrays: https://github.com/pydata/xarray/pull/8702/

Whether build or dev-build pass, they are mutually exclusive. What is the one that should be favoured? Current version of xarray of next version?

etienneschalk avatar Feb 17 '24 11:02 etienneschalk

I see I got to this pretty late, so it may be solved already (if so, please ignore this). But I couldn't quite tell from this conversation if everything was working or not. If not, I have a solution which passes all pytests (except for a format error that was not passing prior to my changes) at https://github.com/marcel-goldschen-ohm/datatree/tree/main. Changes are in datatree.py in name.setter and _attach methods, and I added a test_namechange method to test_datatree.py. Again, if this is already solved, please ignore.

marcel-goldschen-ohm avatar Mar 01 '24 19:03 marcel-goldschen-ohm

Any news on this? It would be great if this could finally be incorporated into the PyPI distribution of datatree.

marcel-goldschen-ohm avatar Jun 18 '24 21:06 marcel-goldschen-ohm

I know everyone is busy, so I hate to harp on this. But I'm surprised that even after 7 months this very fundamental issue with what seems like should be a simple solution is still not resolved (as far as I can tell). I'm not sure what the status is or what the issue is regarding unsuccessful checks for @etienneschalk's solution. I also have a solution that works for me at https://github.com/marcel-goldschen-ohm/datatree/tree/main as previously mentioned. How can I help get this resolved?

marcel-goldschen-ohm avatar Sep 07 '24 18:09 marcel-goldschen-ohm

Sorry @marcel-goldschen-ohm for letting this slip! I understand it's frustrating that we didn't just merge this and release.

For context we've been concentrating on trying to get datatree merged upstream into xarray (see https://github.com/pydata/xarray/issues/8572#issuecomment-2218020742), which would allow us to completely deprecate and archive this repository. We were hoping to have that done in July. 😅 This process is necessary, but it's an awkward time because it's tricky to keep track of both repos at once, especially as now some bugs in one are fixed in the other...

We've made various small changes during that migration, including to whether or not copies of trees are created during operations, with the aim of making everything more consistent with the rest of xarray (see for example https://github.com/pydata/xarray/pull/9297). We've also been trying to close issues here in favour of raising them upstream in xarray (because annoyingly we can't just auto-transfer all these issues to xarray upstream as it's in a different github org).


However I've been looking at this specific issue again and I don't think it's trivial. After @etienneschalk 's comment in https://github.com/xarray-contrib/datatree/issues/309#issuecomment-1949940406 it is now genuinely not clear to me what the desired behaviour should be. I have raised an upstream issue on xarray to track that question of what the behaviour should be going forward (https://github.com/pydata/xarray/issues/9447).

I am happy to merge this PR here and release another version if that would really help you, but my main concern has been ensuring that the upstream (non-prototype) version of datatree makes the right decisions going forward, not making sure all bugs are fixed in this version that's about to be deprecated.

TomNicholas avatar Sep 07 '24 22:09 TomNicholas

Regarding @etienneschalk 's comment in https://github.com/xarray-contrib/datatree/issues/309#issuecomment-1949940406 , my personal opinion is that xds['a'].name = 'b' should behave the same as xda = xds['a']; xda.name = 'b'. Anything else is highly nonintuitive from an end user experience if Xarray is going to be widely adopted (my opinion). At the very least, it needs to throw an error unless the idea is to give everyone a headache, and provide a method for renaming in place. Of course, this is an Xarray issue. If the goal of DataTree is to match this behavior, then I would strongly suggest against it (I would also suggest it should be changed in Xarray). At the very least, there must be a method that allows renaming a DataTree instance without having to create a new instance. Without such a method, developing UIX apps to work with DataTrees can become much more painful as references to DataTree nodes tend to be shared and stored in the UIX, and thus a pain to have to recreate for a simple renaming operation.

All of that said, I hear you that you have a lot on your plate with the move to Xarray. Also, although the desired behavior is not unclear to me, I hear you that Xarray itself apparently has some pretty silly behavior in this regard, and I get wanting to be Xarray compatible although I disagree with it in this case. Now that I know what the issue is, I can find a way to work around it, so don't worry about this for my benefit. Hopefully in the future both Xarray and DataTree will be more intuitive in this regard. Best wishes on the merge with Xarray! And also a big thank you for DataTree in the first place, I couldn't agree more that a tree structure of Datasets is highly useful :)

marcel-goldschen-ohm avatar Sep 08 '24 16:09 marcel-goldschen-ohm

Thanks @marcel-goldschen-ohm for your (very useful) perspective.

Now that I know what the issue is, I can find a way to work around it, so don't worry about this for my benefit.

In that case I'm closing this in favour of the discussion upstream in https://github.com/pydata/xarray/issues/9447.

TomNicholas avatar Sep 09 '24 18:09 TomNicholas