iris icon indicating copy to clipboard operation
iris copied to clipboard

[ENH] Concatenate usability improvements

Open DPeterK opened this issue 5 years ago • 5 comments

Improve the usability of concatenate by:

  • allowing concatenate to operate on cubes with differing data / coordinate points / coordinate bounds dtypes
  • producing more descriptive error messages on failing to concatenate to a single cube.

Here's an example of some of the improved error messages:

cubes.concatenate_cube()
Traceback (most recent call last):
  ...
iris.exceptions.ConcatenateError: failed to concatenate into a single cube.
  Dimension coordinate metadata differs: name (latitude, longitude, time != longitude)
  Auxiliary coordinate metadata differs: units (Unit('1') != Unit('m'))
  Data types differ: float32 != <U32
  Scalar coordinate metadata differs: name (height != None)
  Data dimensions differ: 3 != 1
  Scalar coordinate metadata differs: long_name (None != 'alice')

DPeterK avatar Dec 13 '18 16:12 DPeterK

@DPeterK Did you fancy rebasing? We recently done a merge back of 2.2.x to master, which may resolve your current travis-ci issues...

bjlittle avatar Jun 05 '19 10:06 bjlittle

@bjlittle done!

DPeterK avatar Jun 24 '19 12:06 DPeterK

RE: "I've checked the before + afters on the error messages for the testcases in TestMessages, and it's not all an improvement" ...

I hacked the test code to report all the actual error messages and ran new code against old. The results, with my ratings ...

 FAIL: test_aux_coords_difference_message (__main__.TestMessages)
-  Auxiliary coordinates differ: foo, wibble != wibble"
+  Auxiliary coordinate metadata differs: name (foo, wibble != wibble)"

Verdict :-1: "metadata" seems redundant and unhelpful : I think we should only use the word when it is the metadata of the coords which differs (e.g. attributes or units) , rather than the set of coords themselves. Also, "name" seems positively unhelpful : What is actually different is the list of aux coords (not really their names) .. see below for an idea about this ...


 FAIL: test_aux_coords_metadata_difference_message (__main__.TestMessages)
-  Auxiliary coordinates metadata differ: foo != foo"
+  Auxiliary coordinate metadata differs: 'foo' units (Unit('1') != Unit('m'))"

Verdict : :+1: definitely better :bouquet: !!


 FAIL: test_dimensions_difference_message (__main__.TestMessages)
-  Dimension coordinates differ: latitude, longitude, time != longitude, time"
+  Dimension coordinate metadata differs: name (latitude, longitude, time != longitude, time)"

Verdict : :-1: as for "test_aux_coords_difference_message" above, dislike both "metadata" and "name"


 FAIL: test_dimensions_metadata_difference_message (__main__.TestMessages)
-  Dimension coordinates metadata differ: latitude != latitude"
+  Dimension coordinate metadata differs: 'latitude' long_name (None != 'bob')"

Verdict : :+1:


 FAIL: test_dims_covered_difference_message (__main__.TestMessages)
-  Auxiliary coordinates metadata differ: wibble != wibble"
+  Auxiliary coordinate metadata differs: 'wibble' dims ((0, 1, 2) != (0, 1))"

Verdict : :+1:


 FAIL: test_ndim_difference_message (__main__.TestMessages)
-  Dimension coordinates differ: latitude, longitude, time != longitude
-  Auxiliary coordinates differ: foo, wibble != < None >
-  Scalar coordinates differ: height != < None >
+  Dimension coordinate metadata differs: name (latitude, longitude, time != longitude)
+  Auxiliary coordinate metadata differs: name (foo, wibble != None)
+  Scalar coordinate metadata differs: name (height != None)
   Data dimensions differ: 3 != 1"

Verdict : :-1: Same stuff again re 'metadata' and 'name'.


 FAIL: test_scalar_coords_difference_message (__main__.TestMessages)
-  Scalar coordinates differ: height != < None >"
+  Scalar coordinate metadata differs: name (height != None)"

Verdict : :-1:


 FAIL: test_scalar_coords_metadata_difference_message (__main__.TestMessages)
-  Scalar coordinates metadata differ: height != height"
+  Scalar coordinate metadata differs: 'height' long_name (None != 'alice')"

Verdict : :+1:



A Thought: The "name" term appears to come from a single common line in the code I think in that case, it could be taken from the 'attr' keyword to replace 'name' with a more specific useful term, e.g. with the mapping

type_context = {
    'dim_metadata' : 'list of dimension coordinates',
    'aux_metadata':  'list of auxiliary coordinates',
    'scale_metadata':  'list of scalar coordinates'}[attr]

( Well, maybe something similar, anyway )

pp-mo avatar Aug 28 '19 14:08 pp-mo

@DPeterK What do you think ? Hope this is not discouraging, there is a lot of good waiting to come out of this one !

pp-mo avatar Aug 28 '19 14:08 pp-mo

@pp-mo thanks for the thorough review! Not discouraging; this is a hard problem to solve and it would be good to get it right, or at least better than currently.

it's a shame we've combined it with the more "cosmetic" changes

Yes. I think I started one thing (the dtype support, if memory serves) and it flowed naturally into making the cosmetic changes and I was lazy and kept going in the same PR rather than starting afresh...

The "name" term appears to come from a single common line in the code

It's a long time ago that I implemented this now, but I think that the reason I adjusted the error message in the way I did is that I could use just one template (good on account of consistency and simplicity) that, I felt, improved the error message in the majority of use-cases. That said, having a mapping that expands name as per the suggestion may help solve this problem.

DPeterK avatar Sep 02 '19 10:09 DPeterK

@SciTools/peloton unfortunately it does look like this is effectively dead now, not least because the whole way we do metadata (and what that word means) has since changed. Closing for now -- please reopen @DPeterK if you still think we can move this forward.

pp-mo avatar Jun 28 '23 09:06 pp-mo