pyglider
pyglider copied to clipboard
dimensions in gridded data output
A colleague pointed out that the gridded NetCDF file produced by make_gridfiles has depth as the 'x coordinate', and time as the 'y coordinate'.
https://github.com/c-proof/pyglider/blob/c1218acb321f7b2ba9533de8cd6505bddfaeca7e/pyglider/ncprocess.py#L273-L276
It feels more intuitive to have these switched? For instance, in the function itself time is referred to as the xdimname.
Also, this (time as the x dim, depth as the y dim) is how the old socib toolbox writes its gridded data files.
I believe the only change would be changing line 275 to coords={'profile': (xdimname, profiles), 'depth': ('depth', depths)}
I'd not be in favour of this change. This is just a convention and in my mind Depth are rows, and profiles are columns because if you look at it as a spreadsheet, data going down a column is increasing in depth.
I had mostly thought about it from a plotting perspective, where (afaik) time or profile is always on the x-axis, and depth is on the y-axis. But fair enough, and thanks for considering.
One other question/note about gridding and dimensions. Please feel free to move this, or tell me if this should be its own issue.
Currently make_gridfile grids everything except variables identified by the block shown below. This includes variables such as profile_index, profile_direction, waypoint_longitude, and waypoint_latitude, which thus have coordinates of "(depth, time)" rather than only time.
https://github.com/c-proof/pyglider/blob/c1218acb321f7b2ba9533de8cd6505bddfaeca7e/pyglider/ncprocess.py#L322-L324
Trying to catch or classify all variables (eg, waypoint_longitude and waypoint_latitude) that a user could make part of the timeseries dataset seems like too much. However, it does feel like make_gridfiles should handle profile_index and profile_direction, especially since profile_index is required by the function. For instance, there could be a if k in ['profile_index', 'profile_direction']: ... right after line 324. Or, they could be worked into the profile_lookup section beginning in line 305.
If this makes sense to you, happy to submit another pr
I had mostly thought about it from a plotting perspective, where (afaik) time or profile is always on the x-axis, and depth is on the y-axis. But fair enough, and thanks for considering.
Yeah, except I think you have this backwards. At least Matlab and Matplotlib will plot the rows as y-axis and the columns along the xaxis. Eg plt.pcolormesh(ds.temperature.values) looks correct (except it is upside down).
Trying to catch or classify all variables (eg, waypoint_longitude and waypoint_latitude) that a user could make part of the timeseries dataset seems like too much. However, it does feel like make_gridfiles should handle profile_index and profile_direction, especially since profile_index is required by the function. For instance, there could be a if k in ['profile_index', 'profile_direction']: ... right after line 324. Or, they could be worked into the profile_lookup section beginning in line 305.
I think its makes sense that the profile_x variables should be part of the default exclusion.
I think we could easily also make the exclusion list a kwarg or info that gets specified in the yaml somehow. I'm not sure of the details, but I think we could easily accommodate this.
Yeah, except I think you have this backwards. At least Matlab and Matplotlib will plot the rows as y-axis and the columns along the xaxis. Eg
plt.pcolormesh(ds.temperature.values)looks correct (except it is upside down).
I hadn't appreciated that, thanks for explaining.
I think its makes sense that the
profile_xvariables should be part of the default exclusion.I think we could easily also make the exclusion list a kwarg or info that gets specified in the yaml somehow. I'm not sure of the details, but I think we could easily accommodate this.
Sounds good. Unless you have another preference, I'll take a stab at a pr sometime next week