opendrift icon indicating copy to clipboard operation
opendrift copied to clipboard

Unstructured netcdf reader node variables not selected appropriately

Open Boorhin opened this issue 3 years ago • 5 comments
trafficstars

https://github.com/OpenDrift/opendrift/blob/0711551c867b5b816fe81da0a61d7a75b1a6b57f/opendrift/readers/reader_netCDF_CF_unstructured.py#L304

variables[var] = dvar[slice(nodes.min(), nodes.max() + 1)]

Should be

variables[var] = dvar[nodes]

Boorhin avatar Dec 14 '21 14:12 Boorhin

I guess there was a reason why slice was used instead of simply nodes (and likewise for fcsand sigma_ind). Probably since many particles may have the same nearest node - and then it is a waste to read/return the same value many times, instead of just reading the minimum encompassing block, and afterwards interpolating.

Perhaps @gauteh can have a look tomorrow.

knutfrode avatar Dec 14 '21 15:12 knutfrode

if you seed 2 points far away from each other, you will slice all the index in between. repetition of index is not an issue in numpy and even less in xarray (that's how I spotted this while developing a reader)

Boorhin avatar Dec 14 '21 16:12 Boorhin

if you seed 2 points far away from each other, you will slice all the index in between. repetition of index is not an issue in numpy and even less in xarray (that's how I spotted this while developing a reader)

The problem is that netcdf will generate multiple requests when indexing indices, rather than slices, so it becomes very inefficient. If you look through the history of that file you will see that we experimented with both. It does depend on the situation though, and what you describe is one situation where it is a worse choice. But maybe that is not a very common situation?

gauteh avatar Dec 14 '21 17:12 gauteh

There must a mechanism I don't understand. I use that at the moment on a quite large mesh with xarray and I seem to have good performances, in the former selafin reader it was okish also (although the reader was very basic). We use multipoint generation in coastal areas and I don't see the indexing being constant in a regional way especially that we have to locally improve the quality of the mesh to manage river inputs and tidal flats. Maybe in the open ocean it is ok. Just wanted to point it out as it confused me. The only acceleration that I am conceiving is to take concomitant bytes and thus sort the node array and recover the inverse of the sorting to throw the variables back in the model. But that's in the xarray lib I think

Boorhin avatar Dec 14 '21 17:12 Boorhin

Just to show how one of our mesh is built. This is the connectivity information of the mesh. I am reducing the dataset to a netcdf Ugrid compatible with only 100 nodes, so I limit the number of triangles that are fully populated by nodes of index 0-99. You can see the kind of issues you can run into by considering the indexing linear with comparison the coordinate system.

ds['face_nodes'].where(ds['face_nodes']<100).dropna(dim='face',how='any').values
Out[43]: 
array([[ 0.,  1., 33.],
       [ 0., 33., 36.],
       [ 0., 36.,  2.],
       [ 1.,  5., 34.],
       [ 1., 34., 33.],
       [ 2., 36., 39.],
       [ 2., 39.,  3.],
       [ 3., 39., 35.],
       [ 3., 35.,  4.],
       [ 4., 35., 38.],
       [ 4., 38.,  6.],
       [ 5., 10., 41.],
       [ 5., 41., 34.],
       [ 6., 38., 43.],
       [ 6., 43.,  7.],
       [ 7., 43., 47.],
       [ 7., 47.,  8.],
       [ 8., 47., 44.],
       [ 8., 44.,  9.],
       [ 9., 44., 45.],
       [ 9., 45., 11.],
       [10., 16., 53.],
       [10., 53., 41.],
       [11., 45., 48.],
       [11., 48., 12.],
       [12., 48., 50.],
       [12., 50., 13.],
       [13., 50., 54.],
       [13., 54., 14.],
       [14., 54., 57.],
       [14., 57., 15.],
       [15., 57., 62.],
       [15., 62., 17.],
       [16., 20., 58.],
       [16., 58., 53.],
       [17., 62., 56.],
       [17., 56., 18.],
       [18., 56., 60.],
       [18., 60., 19.],
       [19., 60., 61.],
       [19., 61., 21.],
       [20., 25., 63.],
       [20., 63., 58.],
       [21., 61., 65.],
       [21., 65., 22.],
       [22., 65., 67.],
       [22., 67., 23.],
       [23., 67., 70.],
       [23., 70., 24.],
       [24., 70., 68.],
       [24., 68., 26.],
       [25., 31., 69.],
       [25., 69., 63.],
       [26., 68., 72.],
       [26., 72., 27.],
       [27., 72., 73.],
       [27., 73., 28.],
       [28., 73., 74.],
       [28., 74., 29.],
       [29., 74., 78.],
       [29., 78., 30.],
       [30., 78., 77.],
       [30., 77., 32.],
       [31., 52., 87.],
       [31., 87., 69.],
       [32., 77., 86.],
       [32., 86., 37.],
       [33., 34., 79.],
       [33., 79., 80.],
       [33., 80., 36.],
       [34., 41., 84.],
       [34., 84., 79.],
       [35., 39., 82.],
       [35., 82., 81.],
       [35., 81., 38.],
       [36., 80., 83.],
       [36., 83., 39.],
       [37., 86., 90.],
       [37., 90., 40.],
       [38., 81., 88.],
       [38., 88., 43.],
       [39., 83., 82.],
       [40., 90., 95.],
       [40., 95., 42.],
       [41., 53., 84.],
       [42., 95., 89.],
       [42., 89., 46.],
       [43., 88., 94.],
       [43., 94., 47.],
       [44., 47., 91.],
       [44., 91., 92.],
       [44., 92., 45.],
       [45., 92., 93.],
       [45., 93., 48.],
       [46., 89., 96.],
       [46., 96., 49.],
       [47., 94., 91.],
       [48., 93., 97.],
       [48., 97., 50.],
       [49., 96., 99.],
       [49., 99., 51.]])

Boorhin avatar Dec 15 '21 09:12 Boorhin