Creating a DataTree should not modify parent and children in-place
What is your issue?
This violates the well-established Python convention that functions either have a return value or modify arguments in-place. They never do both.
Examples:
- Modifying a parent:
from xarray.core.datatree import DataTree
root = DataTree()
child = DataTree(name='child', parent=root)
print(root)
# <xarray.DataTree>
# Group: /
# └── Group: /child
- Modifying children:
from xarray.core.datatree import DataTree
child = DataTree()
root = DataTree(children={'child': child})
print(child)
# <xarray.DataTree 'child'>
# Group: /child
This particularly surprising if a DataTree argument is reused, e.g.,
from xarray.core.datatree import DataTree
child = DataTree()
root = DataTree(children={'child': child})
root2 = DataTree(children={'child2': child})
print(child) # attached to root2
# <xarray.DataTree 'child2'>
# Group: /child2
print(root) # now empty!
# <xarray.DataTree>
# Group: /
Here's my suggestion:
- We should make the
DataTreeconstructor make shallow copies of itsDataTreearguments. - We should consider getting rid of the
parentconstructor argument. It is redundant withchildren, and unlike children, parent nodes don't show up in theDataTreerepr, so it's more surprising to see them copied.
CC @TomNicholas
Good point, this wasn't really an intentional breaking of that convention on my part.
Happy with either / both of your suggestions.
I think I originally put parent in the constructor just so that all attributes were present in the signature, but there are multiple other ways to set the parent once the subtree has been created.
To be clear the aim here is to do both of @shoyer 's suggestions, probably in 2 separate PRs
I'm just finding this issue now. I Don't know why I didn't see it. I assume, this is going to make a bunch of changes required in the docs. Can you summarize what I'll need to look for? e.g I assume this isn't true any longer.
homer = xr.DataTree(name="Homer", children={"Bart": bart, "Lisa": lisa})
The nodes representing Bart and Lisa are now connected - we can confirm their sibling rivalry by examining the [siblings](https://github.com/pydata/xarray/generated/xarray.DataTree.siblings.html#xarray.DataTree.siblings) property:
list(bart.siblings)
Out[5]: ['Lisa']
That assignment to homer will no longer make bart's sibling lisa? What am I missing?
e.g I assume this isn't true any longer.
homer = xr.DataTree(name="Homer", children={"Bart": bart, "Lisa": lisa}) The nodes representing Bart and Lisa are now connected - we can confirm their sibling rivalry by examining the [siblings](https://github.com/pydata/xarray/generated/xarray.DataTree.siblings.html#xarray.DataTree.siblings) property: list(bart.siblings) Out[5]: ['Lisa']That assignment to homer will no longer make bart's sibling lisa? What am I missing?
Yes, you have this correct.
homer.children['Bart'] will have the sibling Lisa, but not the original bart object.
This should be closed by #9297.