APEx reference check needs better representation (ascii art diff, diff image, statistics)
Statistics are most important (see the sketch after this list):
- percentage of pixels that are different (relative to the tolerance), plus median, mean, min, max
- perhaps write the diff to a PNG -- issue with dimensionality (per timestamp, per band), so cap it at e.g. max 10 files
Location of differences:
- which timestamp(s) failed
- which band(s) failed
- bbox of the differences as GeoJSON (so you can crop)
- it happens that changes are very concentrated
Extras:
- Numpy has something similar. We have unit tests that already print more informative messages.
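A minimal sketch of how such per-band statistics, a diff PNG and a pixel bbox could be produced, assuming the actual and expected rasters are already loaded as numpy arrays of shape (bands, y, x); all function and variable names here are illustrative, not part of the client's API:

import numpy as np
from PIL import Image


def band_diff_stats(actual: np.ndarray, expected: np.ndarray, rtol=1e-6, atol=1e-6):
    """Per-band statistics of the differences that exceed the tolerance."""
    diff = np.abs(actual - expected)
    exceeds = diff > (atol + rtol * np.abs(expected))
    stats = []
    for band in range(actual.shape[0]):
        bad = exceeds[band]
        values = diff[band][bad]
        stats.append({
            "band": band + 1,
            "differing_pixels": int(bad.sum()),
            "total_pixels": int(bad.size),
            "percentage": 100.0 * bad.sum() / bad.size,
            "min": float(values.min()) if values.size else None,
            "max": float(values.max()) if values.size else None,
            "mean": float(values.mean()) if values.size else None,
            "median": float(np.median(values)) if values.size else None,
        })
    return stats


def write_diff_png(actual_band: np.ndarray, expected_band: np.ndarray, path):
    """Write the absolute difference of one band as an 8-bit grayscale PNG."""
    diff = np.abs(actual_band - expected_band)
    scale = diff.max() if diff.max() > 0 else 1.0
    Image.fromarray((255 * diff / scale).astype("uint8")).save(path)


def diff_bbox(exceeds_band: np.ndarray):
    """Pixel-coordinate bbox (col_min, row_min, col_max, row_max) of differing pixels;
    combine with the raster's geotransform to turn this into a GeoJSON bbox for cropping."""
    rows, cols = np.nonzero(exceeds_band)
    if rows.size == 0:
        return None
    return (int(cols.min()), int(rows.min()), int(cols.max()), int(rows.max()))

The "max 10 files" cap for the PNGs (over all timestamp/band combinations) would then be applied by whatever loops over the bands and timestamps.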
The pipeline for cdse staging might not be configured yet for writing to the S3 bucket. Ask @JeroenVerstraelen.
FYI:
Initial comparison/assert utilities are defined in the Python client: https://github.com/Open-EO/openeo-python-client/blob/master/openeo/testing/results.py
Test coverage of that (to give an idea of the behavior): https://github.com/Open-EO/openeo-python-client/blob/master/tests/testing/test_results.py
Usage in the APEx benchmark test suite is basically just this: https://github.com/ESA-APEx/apex_algorithms/blob/91eb3e62cc9e6d179b655baa97132771b5b0c6e8/qa/benchmarks/tests/test_benchmarks.py#L74-L80
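For context, that usage boils down to a single assert on the downloaded batch job results against stored reference results. A rough, from-memory sketch (the exact arguments are in the linked lines; directory names and tolerance values here are illustrative and may differ slightly):

from openeo.testing.results import assert_job_results_allclose

# Compare the downloaded job results (actual_dir) against the stored reference
# results (reference_dir); tolerance values are illustrative.
assert_job_results_allclose(
    actual=actual_dir,
    expected=reference_dir,
    tmp_path=tmp_path,
    rtol=0,
    atol=0.01,
)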
Definition of done: integrate into the error logs of the automatically generated GitHub issues, e.g. https://github.com/ESA-APEx/apex_algorithms/issues/164
I just ran an APEx benchmark locally with this new ascii art feature, and it seems to work, but its size (100x100, I guess) is too large, I think. It will not render properly in various contexts.
I would keep it within a reasonable size, e.g. 40x40 or something like that (see the sketch after the pasted output below).
> raise AssertionError("\n".join(issues))
E AssertionError: Issues for metadata file 'job-results.json':
E Differing 'derived_from' links (22 common, 2 only in actual, 2 only in expected):
E only in actual: {'/eodata/Sentinel-2/MSI/L2A_N0500/2023/09/29/S2A_MSIL2A_20230929T103821_N0510_R008_T31UFS_20241106T062041.SAFE', '/eodata/Sentinel-2/MSI/L2A_N0500/2023/09/27/S2B_MSIL2A_20230927T104719_N0510_R051_T31UFS_20241027T094215.SAFE'}
E only in expected: {'/eodata/Sentinel-2/MSI/L2A/2023/09/29/S2A_MSIL2A_20230929T103821_N0509_R008_T31UFS_20230929T154658.SAFE', '/eodata/Sentinel-2/MSI/L2A/2023/09/27/S2B_MSIL2A_20230927T104719_N0509_R051_T31UFS_20230927T135345.SAFE'}.
E Issues for file 'openEO.tif':
E band 1: value difference exceeds tolerance (rtol 1e-06, atol 1e-06), min:0.0013195276260375977, max: 0.012642286717891693, mean: 0.01, var: 0.0
E band 1: differing pixels: 65/33369 (0.2%), spread over 0.0% of the area
../../../../openeo/openeo-python-client/openeo/testing/results.py:544: AssertionError
--------------------------------------------------------------------------------------------- Captured stdout call ----------------------------------------------------------------------------------------------
0:00:00 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': send 'start'
0:00:14 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': created (progress 0%)
0:00:19 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': created (progress 0%)
0:00:25 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': created (progress 0%)
0:00:34 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': created (progress 0%)
0:00:44 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': running (progress N/A)
0:00:56 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': running (progress N/A)
0:01:12 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': running (progress N/A)
0:01:31 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': running (progress N/A)
0:01:55 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': running (progress N/A)
0:02:26 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': running (progress N/A)
0:03:03 Job 'cdse-j-25050713153049cf8fe4fb9dddef12b5': finished (progress 100%)
Difference ascii art for band 1
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ } │
│ │
│ p&*h >l │
│ zj _ │
│ wa c │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ }n │
│ k │
│ * │
│ Z │
│ dp │
│ Xp │
│ pw │
│ ph │
│ d}u │
│ | │
│ p │
│ k$ │
│ Wq : │
│ :a │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ h │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ b| │
│ !: │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ U │
│ │
│ │
│ } │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ J │
│ UJ │
│ 1Y nt │
│ L UC │
│ l\ r │
│ L w? │
│ ]X │
│ wM │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
=============================================================================================== warnings summary ================================================================================================
../../../../openeo/openeo-python-client/openeo/testing/results.py:99
/home/lippenss/src/openeo/openeo-python-client/openeo/testing/results.py:99: SyntaxWarning: invalid escape sequence '\|'
grayscale_characters = "$@B%8&WM#*oahkbdpqwmZO0QLCJUYXzcvunxrjft/\|()1{}[]?-_+~<>i!lI;:,\"^`'. "
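Side note on that SyntaxWarning: it comes from the bare backslash before the pipe character in the character ramp at results.py:99. A straightforward fix (escaping the backslash, same string content) would be:

# Escape the backslash so "\|" no longer triggers an invalid-escape SyntaxWarning:
grayscale_characters = "$@B%8&WM#*oahkbdpqwmZO0QLCJUYXzcvunxrjft/\\|()1{}[]?-_+~<>i!lI;:,\"^`'. "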
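On the size issue raised above (the suggestion to keep it within e.g. 40x40): a minimal sketch of how the rendering could be capped, assuming the per-pixel "differs" mask is available as a 2D numpy boolean array; the function and parameter names are illustrative, not the client's implementation:

import numpy as np


def downscale_diff_mask(mask: np.ndarray, max_rows: int = 40, max_cols: int = 40) -> np.ndarray:
    """Block-average a 2D boolean diff mask so it fits within max_rows x max_cols cells,
    returning per-cell fractions of differing pixels (0.0..1.0) to map onto the character ramp."""
    rows = min(max_rows, mask.shape[0])
    cols = min(max_cols, mask.shape[1])
    row_edges = np.linspace(0, mask.shape[0], rows + 1).astype(int)
    col_edges = np.linspace(0, mask.shape[1], cols + 1).astype(int)
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            block = mask[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
            out[i, j] = block.mean() if block.size else 0.0
    return out

Using block.max() instead of the mean would keep isolated differing pixels visible after downscaling, which matters given how concentrated the differences in the example above are.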
FYI: I missed part of the discussion this morning, but to test this feature on APEx benchmarks, it's not necessary to go through the cumbersome process of merging a WIP PR, doing a release, waiting for the benchmarks to run automatically, etc.
It's not that hard to just test this locally.
Roughly (mutatis mutandis):
# Set up local context for benchmarks
git clone https://github.com/ESA-APEx/apex_algorithms.git
cd apex_algorithms
# Create isolated venv for benchmarks project
python -m venv --prompt . venv
. venv/bin/activate
python -m pip install qa/tools
python -m pip install -r qa/benchmarks/requirements.txt
# Override the openeo package with a local checkout of openeo-python-client that contains the necessary changes from the feature branch
python -m pip install ../path/to/local/openeo-python-client
# Optionally: check out a commit with known benchmark errors at this point
# Run a benchmark
cd qa/benchmarks
pytest -vv -k 'test_run_benchmark[max_ndvi]'
Note that the first time you do this, you have to watch out for authentication instructions in the output.
> Definition of done: integrate into the error logs of the automatically generated GitHub issues, e.g. https://github.com/ESA-APEx/apex_algorithms/issues/164
FYI: there should be no extra effort needed to integrate that; it should work out of the box, I think.
It's even possible to use the experimental version from #761 directly in the benchmark runs, without having to wait for that PR to be merged and a release to be created. I just merged this into main of the apex benchmarks: https://github.com/ESA-APEx/apex_algorithms/commit/5554b6a5f8e5551ad7ecfe8a125ebb87e5e77eb2
I think it makes sense to just run the benchmarks like this for now.
I manually triggered a run (the "bap_composite" benchmark): https://github.com/ESA-APEx/apex_algorithms/actions/runs/14906019298/job/41868461282
which shows the ASCII art.
The height of the ASCII art is still too much, I think.