Merging datasets with overlapping but non-identical coordinates

Open • Recalculate opened this issue 3 years ago • 4 comments

I have noticed several models where important variables (e.g. surface heat flux and ocean temperature) for various runs don't have identical time coordinates. These overlap significantly, typically missing just a few years at one end or the other, and it would be ideal if merge_variables could output a merged dataset covering only the range where all coordinates overlap. At present it just produces an empty dataset, and I cannot see an obvious way of turning on this behaviour short of manually trimming the datasets before merging. Does this functionality exist and, if not, could it be implemented at some point in the future?

Recalculate, Jun 08 '22 15:06

You can pass keyword arguments to xarray.merge via the merge_kwargs={} argument. We currently use the default of join='exact', which you could try to relax with merge_variables(..., merge_kwargs={'join': 'inner'}). Let me know if that helps.
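
To illustrate the difference between the two join modes, here is a minimal sketch using plain xarray.merge rather than merge_variables itself. The two datasets are hypothetical stand-ins with made-up yearly time axes and random values (not real CMIP6 output), and the variable names thetao and hfds are borrowed only for readability.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-ins: two variables whose time axes only partly overlap.
years_thetao = np.arange(2000, 2010)  # 2000..2009
years_hfds = np.arange(2003, 2013)    # 2003..2012

ds_thetao = xr.Dataset(
    {"thetao": ("time", np.random.rand(years_thetao.size))},
    coords={"time": years_thetao},
)
ds_hfds = xr.Dataset(
    {"hfds": ("time", np.random.rand(years_hfds.size))},
    coords={"time": years_hfds},
)

# join="exact" refuses to merge when the time indexes are not identical.
try:
    xr.merge([ds_thetao, ds_hfds], join="exact")
    exact_failed = False
except ValueError:
    exact_failed = True

# join="inner" keeps only the time values present in both datasets.
merged = xr.merge([ds_thetao, ds_hfds], join="inner")
print(exact_failed)
print(merged.time.values)
```

With merge_variables, the same join option would be forwarded through merge_kwargs as described above, so the merged result should be trimmed to the shared time range in the same way.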

jbusecke, Jun 08 '22 20:06

Brilliant, that seems to have done the trick. Might be worth updating some of the text in the guide or API to reflect this, as it is probably a pretty common problem. Though perhaps it is also a common solution and I'm just a bit ignorant!

Recalculate, Jun 09 '22 09:06

No, I think that would be a very good addition. Are you able to provide the 'instance_id' (found in ds.attrs for each dataset) for two example datasets, by any chance? That would make it easy to add an example to the docs.

jbusecke, Jun 09 '22 13:06

'instance_id' doesn't seem to be in their attributes, but one example would be trying to merge the piControl thetao (ocean temp) and hfds (surf total heat flux) for the UKESM1-0-LL. That is: 'CMIP.MOHC.UKESM1-0-LL.piControl.r1i1p1f2.Omon.thetao.gn.gs://cmip6/CMIP6/CMIP/MOHC/UKESM1-0-LL/piControl/r1i1p1f2/Omon/thetao/gn/v20190827/.nan.20190827' and 'CMIP.MOHC.UKESM1-0-LL.piControl.r1i1p1f2.Omon.hfds.gn.gs://cmip6/CMIP6/CMIP/MOHC/UKESM1-0-LL/piControl/r1i1p1f2/Omon/hfds/gn/v20200828/.nan.20200828'

Recalculate, Jun 09 '22 14:06