Jeff Whitaker

Results 532 comments of Jeff Whitaker

If this slice is small, the overhead of the Python interface (creating the start, count, stride arrays to pass to the C library) will be large relative to the cost of actually...
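To make the bookkeeping mentioned above concrete, here is a minimal sketch of turning Python slices into per-dimension start, count, and stride lists. The helper name `slices_to_start_count_stride` is hypothetical and this is not the code used inside netCDF4-python; it assumes one slice per dimension and positive steps only.

```python
def slices_to_start_count_stride(slices, shape):
    """Hypothetical helper: one slice per dimension -> start/count/stride lists.

    Only slices with a positive step are handled; this is an illustration,
    not the netCDF4-python implementation.
    """
    start, count, stride = [], [], []
    for slc, dim_len in zip(slices, shape):
        # slice.indices() clips the slice to the length of the dimension
        begin, stop, step = slc.indices(dim_len)
        if step < 1:
            raise ValueError("this sketch only supports positive steps")
        start.append(begin)
        # number of elements selected along this dimension
        count.append(len(range(begin, stop, step)))
        stride.append(step)
    return start, count, stride


# Equivalent of var[:, 0:10:2] on a variable of shape (3, 10)
example = slices_to_start_count_stride(
    (slice(None), slice(0, 10, 2)), shape=(3, 10)
)
```

Even in this stripped-down form, the work is done in a Python loop for every indexing operation, which is why it can dominate when the slice itself is tiny.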

Could you post your actual benchmark code (including the data file)?

The netcdf4 results are most likely so slow because of chunking: http://www.unidata.ucar.edu/blogs/developer/entry/chunking_data_why_it_matters

If all the strides are 1, `nc_get_vara` is called; if not, `nc_get_vars` is called. In your example, there is only one call to either - it's just that the `nc_get_vars`...
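The selection rule described in this comment can be sketched in a few lines. The function `choose_read_call` is a hypothetical illustration that only returns the name of the C routine; the real interface makes the call itself, and its actual code is not shown here.

```python
def choose_read_call(stride):
    """Return the name of the C routine the rule above would select.

    Hypothetical illustration: `stride` is a sequence with one entry
    per dimension of the variable.
    """
    if all(s == 1 for s in stride):
        # every dimension is read contiguously
        return "nc_get_vara"
    # at least one dimension skips elements
    return "nc_get_vars"


unit_strides = choose_read_call([1, 1, 1, 1])
mixed_strides = choose_read_call([1, 2, 1, 1])
```

Either way a single call is issued per read, so any timing difference comes from what happens inside the C library rather than from the number of calls.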

Some more information about the C lib calls: in the case when all the strides are 1 (`nc_get_vara` called) the start, count, stride arrays are [0 0 0 0] [ 3 10...

This may be related: http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2013/msg00311.html

Are you saying the python interface should not try to use `nc_get_vars` for strided reads?

I'd be inclined to leave it to client code to do something like `data=nf['var'][:][::2]` instead of `data=nf['var'][::2]`. Perhaps we could issue a warning when `nc_get_vars` is called with NETCDF4...

nc_get_vara can only be used when the strides are all 1. If the strides are not 1, then we would have to add extra code to subset the returned data....
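As a rough idea of what that extra code might involve, the sketch below widens a strided request into a contiguous one and returns the slices needed to subset the block afterwards. The helper `contiguous_request` is hypothetical, assumes positive strides, and uses a plain Python list as a 1-D stand-in for the variable on disk; it is not taken from netCDF4-python.

```python
def contiguous_request(start, count, stride):
    """Hypothetical helper: widen a strided request into a contiguous one.

    Returns (start, contiguous_count, subset_slices). The slices are meant
    to be applied to the contiguous block after it has been read.
    """
    # a strided read of c elements with step s spans (c - 1) * s + 1 elements
    contiguous_count = [
        (c - 1) * s + 1 if c > 0 else 0 for c, s in zip(count, stride)
    ]
    subset = tuple(slice(None, None, s) for s in stride)
    return list(start), contiguous_count, subset


# 1-D stand-in for a variable on disk
data = list(range(20))

# strided request: start=2, count=4, stride=3
start, ccount, subset = contiguous_request([2], [4], [3])

# this slice stands in for the contiguous nc_get_vara read
block = data[start[0]:start[0] + ccount[0]]

# the "extra code": subset the returned block in memory
picked = block[subset[0]]
```

The trade-off is that the contiguous block can be much larger than the strided result (here 10 elements are read to keep 4), so memory use grows with the stride.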

I think the reason your suggested change speeds things up is that it is (incorrectly) calling nc_get_vara when it should be calling nc_get_vars. You'll see that when you run the...