xbatcher
Dimension name change with `concat_input_dims` is a side effect
What is your issue?
Title. The problem is that changing dimension names makes it difficult for the user to index into batched arrays in a batch loop. This is particularly annoying because changing the value of `concat_input_dims` changes this behavior, sometimes appending `_input` and sometimes not, which makes debugging and experimentation difficult. I view this as an unwelcome side effect, and I'd prefer the non-batched dimensions keep their original names.
Hey @maxrjones, does this serve any purpose? It's incredibly annoying to get through batch generation only to crash because I forgot to rename the dimensions I'm subsetting.
Partial example:
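    import xbatcher as xb

    # ds, nlons, nlats, and halo_size are defined elsewhere
    bgen = xb.BatchGenerator(
        ds,
        {'nlon': nlons, 'nlat': nlats},
        concat_input_dims=True,
    )
    # Interior points only, dropping a halo of halo_size cells on each edge
    sub = {'nlon': range(halo_size, nlons - halo_size),
           'nlat': range(halo_size, nlats - halo_size)}
    for batch in bgen:
        batch_input = [batch[x][sub] for x in ['SSH', 'SST']]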
This will crash because the dimensions of the arrays in `batch` are now named `nlon_input` and `nlat_input`, so the keys in `sub` no longer match. With `concat_input_dims=False`, the dimension names stay the same.
Just learned that xarray rolling adds "_input" (or something similar) also, and it's used to distinguish between the original dimensions (which may still exist) and the new stencil dims.
I'm thinking that this looks superfluous in xbatcher because (at least in my case) the original dimensions are always stacked. Maybe "_input" makes sense if they aren't stacked?