dimarray Applying .copy() on a Dataset leads to counterintuitive effects

After copying a Dataset with .copy(), modifying an array in the original also modifies the same array in the copy:

In [70]: dataset1 = da.Dataset({'test':da.DimArray(np.zeros([4,3]))})
In [71]: dataset2 = dataset1.copy()
In [72]: dataset1['test'][2,2] = -99
In [73]: dataset2['test'][2,2]
Out[73]: -99.0

This isn't the case when a DimArray (or a numpy array) is copied:

In [65]: data1 = da.DimArray(np.zeros([4,3]))
In [67]: data2 = data1.copy()
In [68]: data1[2,2] = 9
In [69]: data2[2,2]
Out[69]: 0.0

It would be great if .copy() on a Dataset behaved the same way .copy() usually does. Otherwise, I would recommend raising an error or a warning. Thanks a lot!

Feb 14 '19 15:02 peterpeterp

Hi @peterpeterp, thanks for your comment. For now, the Dataset.copy method follows dictionary semantics (a shallow copy: the arrays themselves are not duplicated), while the DimArray.copy method behaves like an array's copy and duplicates the data. This matches the usual convention for Python containers such as dict or list (see for instance https://docs.python.org/3.7/library/copy.html), but I agree it can be confusing, because the Dataset object is a bit of both (it does support certain operations in an array-like fashion). In any case, for now, please consider using Python's copy.deepcopy function to copy the arrays as well.
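To make the difference concrete, here is a minimal sketch using only the standard library. A plain dict of nested lists stands in for a Dataset of arrays; that substitution is an assumption made for illustration, so treat it as a picture of the shallow-versus-deep distinction rather than as dimarray's actual implementation.

```python
import copy

# Stand-in for a Dataset: a plain dict holding one mutable "array"
# (assumption: Dataset.copy follows dict.copy semantics, as described above).
dataset1 = {'test': [[0.0] * 3 for _ in range(4)]}

shallow = dataset1.copy()        # new dict, but the same inner object
deep = copy.deepcopy(dataset1)   # new dict and new inner objects

dataset1['test'][2][2] = -99.0

shallow_value = shallow['test'][2][2]  # sees the change (shared reference)
deep_value = deep['test'][2][2]        # unaffected (independent copy)
print(shallow_value, deep_value)
```

Applied to the example in the report, the equivalent call would be dataset2 = copy.deepcopy(dataset1).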

Apr 09 '19 17:04 perrette

By the way, it's funny you picked this example; in the unit tests I had:

ds2 = ds.copy()
ds2['aa']['b',22.] = -99
assert np.all(ds['aa'] == ds2['aa']) # shallow copy ==> elements are just references

Apr 09 '19 17:04 perrette