iris
iris copied to clipboard
Trial `iris.common.resolve` in `CubeList.merge()`
resolve is intended to develop into a single-source-of-truth for handling Cube equality, difference and combination. It is currently (2023-07-27) only used for Cube maths.
A big part of CubeList.merge() is analysing/resolving differences between the Cubes to be merged. This is a prime case for instead using resolve.
Benefits
- Bring Iris closer to the single-source-of-truth - easier to discuss, explain, plan future development.
- Enables the spread of Leniency (#4446) to other parts of the code-base in a consistent way - current application in Maths is via
resolve.
Steps to complete
- [ ] Experimentally replace the current resolution code of
merge()with call(s) toiris.common.resolve. - [ ] Run the Iris tests, record failures in this issue
- [ ] Run the Iris benchmarks, record regressions vs
mainbranch in this issue - [ ] Write an on-demand benchmark, modeled on this existing benchmark but with more extreme scaling (100
Cubes?). Run and record that, too! - [ ] Discuss with other core developers (
2023-07-27: we are trialling Dragon Taming meetings).- Does
resolveprevent/enable merges that were previously enabled/prevented? Can we defend these changes? Do we need to adaptresolveto get closer to the existingmerge()behaviour? - Is performance acceptable versus the existing behaviour? Do we need to adapt
resolveto better scale with >2Cubes? - After these concerns, what is left to make resolve-in-merge a reality? Testing changes? Documentation?
- Does
- [ ] Discuss the changes to the API to control lenient and strict behaviour
- [ ] Decide on the default behaviour, with breaking change implications to be considered
- [ ] Write up the agreed next steps as further issues in the tasklist here: #4446. Small, achievable tasks: each task should be complete-able (write and review) in a maximum of 2 weeks.
Most relevant tests
https://github.com/SciTools/iris/blob/3c1b3e6993ba1283b5630ea24ee450d2bd7f246f/lib/iris/tests/test_merge.py#L6-L9
https://github.com/SciTools/iris/blob/3c1b3e6993ba1283b5630ea24ee450d2bd7f246f/lib/iris/tests/integration/merge/test_merge.py#L6-L9
https://github.com/SciTools/iris/blob/3c1b3e6993ba1283b5630ea24ee450d2bd7f246f/lib/iris/tests/unit/cube/test_CubeList.py#L215
Possibly some loading tests too, since merge is used during the loading process.
Most relevant benchmarks
https://github.com/SciTools/iris/blob/68eaa535debcbcc0462416cd23e98e82548711a5/benchmarks/benchmarks/cube.py#L206
Notes from team conversation
- Resolve includes compromises to please specific power-users who wanted it as a tool, to combine with post-fixes. So it often just 'makes things work'. This is inappropriate for merge, because it is used during load, and will magically change things so the cube no longer represents what is in the file
- Example of a compromise: willingly combining cubes with different standard names, which effectively means a different identity
- Resolve is working hard to find commonalities, rather than spotting differences, which may explain the poor scalability
- So resolve is not generalisable, but there is some appetite for finding the generalisable bits of resolve and abstracting them out
- Doing this will be a big project
- Getting it right will need meaningful use cases for both cube arithmetic and merge (and load?)
- We will need a full understanding of the decisions that resolve makes, and the decisions that merge makes, and the flow that they go through
- Need to feed this information back into future dragon taming sessions