iris unpin numpy legacy printing

Did we want to unpin the legacy=1.13 printing of numpy arrays for the 2.1 release of iris, since we've worked hard to have numpy>=1.14?

Or do this in the following point release?

I just didn't want us to squirrel this one away and forget about it before it hurts us (again)...

Jun 02 '18 07:06 bjlittle

No need to do this for v2.1, but I'm definitely keen to do it asap.

Jun 02 '18 08:06 pelson

Without meaning to be blasé, but...

~~No need to do this for v2.1, but I'm definitely keen to do it asap.~~

No need to do this for v2.2, but I'm definitely keen to do it asap.

Oct 03 '18 09:10 DPeterK

Without meaning to be blasé, but...

~~No need to do this for v2.1, but I'm definitely keen to do it asap.~~

~~No need to do this for v2.2, but I'm definitely keen to do it asap.~~

~~No need to do this for v2.3, but I'm definitely keen to do it asap.~~

~~No need to do this for v2.4, but I'm definitely keen to do it asap.~~

~~No need to do this for v3.0, but I'm definitely keen to do it asap.~~

~~No need to do this for v3.1, but I'm definitely keen to do it asap.~~

~~No need to do this for v3.2, but I'm definitely keen to do it asap.~~

~~No need to do this for v3.3, but I'm definitely keen to do it asap.~~

~~No need to do this for v3.4, but I'm definitely keen to do it asap.~~

No need to do this for v3.5, but I'm definitely keen to do it asap.

:rofl:

Jan 10 '22 14:01 bjlittle

I just didn't want us to squirrel this one away and forget about it before it hurts us (again)...

👀

Jan 10 '22 17:01 rcomer

@rcomer I know :laughing:

Given the level of angst on #4486, it would be ideal to take the hit square on the chin and deprecate/rewrite iris.util.format_array in iris 3.2, in favour of using something along the lines of numpy.array2string(..., legacy="1.13") afterwards.

I suspect adopting such a change will cause wide sweeping changes in iris, which I'm totally willing to wade through rather than rinse and repeat the current experience that we have at the moment.

However, the silver lining here is that this is a really lovely example of why it's not clever to use private functionality; one day it will burn you.

Shame we keep putting our hand into the fire :hand: :fire: :cry:

Anyways, I'm off to the print shop to get my Always stick to the public API T-shirt... want one?

They might do bulk discounts. Win :+1:

Jan 11 '22 09:01 bjlittle

Discussed just now by: @pp-mo, @bjlittle, @trexfeathers

We think we would like to change this soon, if not necessarily to something more stable, then at least to a public routine, most likely numpy.array2string. Practically that seems quite doable, but it will:

- break all the xml/cdl tests (I count 409)
- change all their reference-result files, tests/results/.../*.{cdl|xml} (about 120).
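For illustration only, here is a minimal sketch of what going through the public routine could look like. The helper name `format_array_public` is hypothetical and this is not the actual iris.util.format_array code; it only assumes numpy>=1.14, where `numpy.array2string` accepts a `legacy` argument.

```python
import numpy as np


def format_array_public(arr, legacy=None):
    """Hypothetical helper: format an array using only public numpy API."""
    # legacy=None (numpy's default) leaves the current print settings in force;
    # legacy="1.13" asks for an approximation of the numpy 1.13 output.
    return np.array2string(np.asarray(arr), separator=", ", legacy=legacy)


values = np.array([0.0, 1.5, 2.25])
modern_text = format_array_public(values)
legacy_text = format_array_public(values, legacy="1.13")

# Printing both makes any difference in padding easy to see.
print("modern:", modern_text)
print("legacy:", legacy_text)
```

Any difference between the two strings is the kind of change that would ripple through the reference-result files once the legacy pin is dropped.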

It would also be possible to adopt a more statistics-based or array "fingerprinting" approach (*) so as to shrink the XML. This would need a controlled sensitivity to numerical changes, which is not a trivial problem.

It clearly needs some thought, so let's just not rush it.

Jan 11 '22 10:01 pp-mo

It would also be possible to adopt a more statistics-based or array "fingerprinting" approach so as to shrink the XML, -- this would need a controlled sensitivity to numerical changes, which is not a trivial problem.

Re: "fingerprinting": by which I mean (ideally) some kind of summary of array values, much smaller than the whole data, sensitive to any individual value changing, but toleranced for floating-point. N.B. this is not really a "hash" concept, as that usually focusses on detecting even the smallest changes (though despite that, it is an idea very much like 'imagehash'!)
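To make the idea concrete, a rough sketch of one possible toleranced fingerprint follows. Everything in it is an assumption for illustration (the names `array_fingerprint` and `fingerprints_match`, the choice of statistics, the tolerances), and it is not an established method. It assumes a non-empty, unmasked, real-valued array.

```python
import numpy as np


def array_fingerprint(arr):
    """Hypothetical summary: shape plus a few statistics of the values."""
    a = np.asarray(arr, dtype=float)
    flat = a.ravel()
    # Position-dependent weights, so that swapping two values also changes
    # the summary (plain min/max/mean/std would not notice a reordering).
    weights = np.arange(1, flat.size + 1) / flat.size
    stats = np.array(
        [flat.min(), flat.max(), flat.mean(), flat.std(), np.dot(flat, weights)]
    )
    return {"shape": a.shape, "stats": stats}


def fingerprints_match(fp1, fp2, rtol=1e-6, atol=1e-8):
    """Compare two fingerprints with a floating-point tolerance."""
    return fp1["shape"] == fp2["shape"] and bool(
        np.allclose(fp1["stats"], fp2["stats"], rtol=rtol, atol=atol)
    )


data = np.linspace(0.0, 1.0, 100)
reference_fp = array_fingerprint(data)

# Round-off sized noise should still match the reference.
noisy = data * (1.0 + 1e-12)
# A genuine change to a single value should not.
changed = data.copy()
changed[37] += 0.5

print(fingerprints_match(reference_fp, array_fingerprint(noisy)))
print(fingerprints_match(reference_fp, array_fingerprint(changed)))
```

The hard part remains the one noted above: picking statistics and tolerances so that every meaningful single-value change is caught while platform-level round-off is ignored. This sketch does not solve that in general.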

Effectively, what we currently do is use the numpy array2string representation to choose a suitable common format (i.e. an output precision) and output all the numbers to that precision: that string is our data summary.
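As a small illustration of how a fixed print precision acts as a crude tolerance, consider the sketch below. The helper `precision_summary` and the chosen precision of 6 are assumptions for the example, not what iris actually uses.

```python
import numpy as np


def precision_summary(arr, precision=6):
    """Hypothetical data summary: the array printed to a fixed precision."""
    return np.array2string(
        np.asarray(arr, dtype=float), precision=precision, separator=", "
    )


reference = np.array([1.0, 2.5, 3.25])
tiny_noise = reference + 1e-12   # well below the printed precision
real_change = reference + 1e-3   # visible at the printed precision

print(precision_summary(reference))
print(precision_summary(tiny_noise))
print(precision_summary(real_change))
```

Changes below the printed precision vanish from the string, while larger ones show up. The weakness is that a value sitting close to a rounding boundary can flip its last printed digit under tiny perturbations, and the exact text also depends on numpy's formatting rules.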

I haven't managed to find any widely accepted existing approach for this, though "fingerprint" does seem to be a recognised term for the general concept: see https://en.wikipedia.org/wiki/Fingerprint_(computing).

Jan 12 '22 09:01 pp-mo

What does the XML approach offer that we couldn't get from saving to NetCDF? Is reliance on the netCDF4 package the main problem there?

Because it seems from these problems that even the XML solution encounters dependency issues - possibly more difficult ones.

Jan 12 '22 11:01 trexfeathers
