
Add option to pass callable assertion failure message generator

Open naught101 opened this issue 2 years ago • 10 comments

It is sometimes nice to be able to write custom assertion error messages on failure. This PR allows that for the array comparison assertions, by allowing a fail_func(a, b) callable to be passed to each assertion function.

Not tested yet, but I'm happy to add tests if this is something that would be appreciated.

  • [ ] Tests added
  • [ ] Passes pre-commit run --all-files
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
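A minimal sketch of the shape this proposes, assuming a keyword-only hook as suggested later in the thread (the wrapper below is illustrative, not the actual patch):

```python
# Illustrative sketch of the proposed hook (not the actual diff): each of
# xarray's comparison assertions would accept a keyword-only `fail_func`
# that builds the failure message lazily from the two objects compared.

def assert_equal(a, b, *, fail_func=None):
    """Toy stand-in for xarray.testing.assert_equal with the proposed hook."""
    if a == b:  # the real function calls a.equals(b)
        return
    # The message generator runs only when the assertion actually fails.
    raise AssertionError(fail_func(a, b) if fail_func else f"{a!r} != {b!r}")
```

The point of the callable (rather than a string) is that the message is only built on failure.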

naught101 avatar Jul 15 '21 10:07 naught101

Thanks for this suggestion @naught101 - can you give an example use case for this within the xarray codebase? Or are you just making a general suggestion of something we might find useful?

TomNicholas avatar Jul 15 '21 15:07 TomNicholas

assert_equal and assert_identical are currently calling .equals / .identical and the only thing they really add on top of that is the diff formatting, which is what you want to override. So for those at least it might be better to write your own assert_* functions.

assert_allclose, however, implements functionality that is otherwise unavailable (I think?) so we might want to think about exposing that separately and calling it in assert_allclose.

keewis avatar Jul 15 '21 17:07 keewis

(if we do this we should make it kwarg-only)

max-sixty avatar Jul 15 '21 17:07 max-sixty

@TomNicholas My particular use case is that I have datasets that are large enough that I can't see the full diff, so might miss major changes. I want to pass in something like lambda a, b: f"Largest difference in data is {abs(a-b).max().item()}", so I can quickly see if the changes are meaningful. Obviously a more complex function might also be useful, like a summary/describe table output of the differences.

I know I could set the tolerances higher, but the changes are not numerical errors, and I want to see them before updating the test data that they are comparing against.

Entirely possible that there are better ways to do this, of course :)
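Sketched with plain Python lists standing in for Datasets (everything here is illustrative, not xarray API), the message generator above looks like:

```python
# Hypothetical illustration of the summary message generator. With real
# Datasets it would be:
#   lambda a, b: f"Largest difference in data is {abs(a - b).max().item()}"
# Here plain lists of floats stand in for the data.

def largest_diff_message(a, b):
    return f"Largest difference in data is {max(abs(x - y) for x, y in zip(a, b))}"

print(largest_diff_message([1.0, 2.0, 3.0], [1.0, 2.5, 3.0]))
# Largest difference in data is 0.5
```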

naught101 avatar Jul 16 '21 04:07 naught101

> My particular use case is that I have datasets that are large enough that I can't see the full diff, so might miss major changes. I want to pass in something like lambda a, b: f"Largest difference in data is {abs(a-b).max().item()}", so I can quickly see if the changes are meaningful. Obviously a more complex function might also be useful, like a summary/describe table output of the differences.

Whether or not we merge this, I would be a +1 on improving the output of the canonical functions, such that we can see differences easily. I would also imagine we can iterate on those without worrying about backward-compat, given this is generating output for people.

As an example, printing the result of the exact statement above for large arrays, maybe also with dimension values, would be great!

max-sixty avatar Jul 16 '21 08:07 max-sixty

I think the more standard way to handle this is to add an argument for supplying an auxiliary err_msg rather than a callback. But this would definitely be welcome functionality!

shoyer avatar Jul 16 '21 16:07 shoyer

@shoyer That would either not work, or be needlessly expensive, I think. The message generation might be expensive (e.g. if I want a sum or mean of the differences). With a callback it only happens if it is needed. With a pre-computed message it would be computed every time. Correct me if I'm wrong.
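The cost argument can be made concrete with a call counter: the callable runs only on failure, whereas a precomputed message string is built unconditionally (toy sketch; all names here are hypothetical):

```python
# Toy sketch: count how often the expensive summary is computed.
calls = {"n": 0}

def expensive_summary(a, b):
    calls["n"] += 1  # stands in for e.g. abs(a - b).mean() on a large Dataset
    return f"mean diff {sum(abs(x - y) for x, y in zip(a, b)) / len(a)}"

def assert_close(a, b, *, fail_func=None):
    """Toy assertion with the proposed callable hook."""
    if a != b:
        raise AssertionError(fail_func(a, b) if fail_func else "not equal")

# Passing assertion: the callable is never evaluated.
assert_close([1.0, 2.0], [1.0, 2.0], fail_func=expensive_summary)
assert calls["n"] == 0

# A precomputed err_msg string would run the summary even on success:
msg = expensive_summary([1.0, 2.0], [1.0, 2.0])
assert calls["n"] == 1
```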

naught101 avatar Jul 17 '21 08:07 naught101

Unit Test Results

  6 files    6 suites    53m 43s :stopwatch:
  16 172 tests    14 421 :heavy_check_mark:    1 731 :zzz:    20 :x:
  90 228 runs    81 957 :heavy_check_mark:    8 151 :zzz:    120 :x:

For more details on these failures, see this check.

Results for commit fe39e36e.

github-actions[bot] avatar Jul 17 '21 18:07 github-actions[bot]

@naught101 to what extent do you think your wants are idiosyncratic vs. better overall? To the extent this is for things that would make the function better for all, then one option is to implement them; e.g. add f"Largest difference in data is {abs(a-b).max().item()}".

Possibly that reduces the opportunity to experiment though? People could probably monkey-patch the function in their own environments if they want. And I think we can also be fairly liberal about merging changes given the output is read by people.

max-sixty avatar Jul 17 '21 22:07 max-sixty

> @shoyer That would either not work, or be needlessly expensive, I think. The message generation might be expensive (e.g. if I want a sum or mean of the differences). With a callback it only happens if it is needed. With a pre-computed message it would be computed every time. Correct me if I'm wrong.

I'm mostly thinking of the precedent from the standard library's unittest library and numpy.testing.

It does indeed add a bit of expense to add a custom message, but I guess it's generally acceptable for testing? If not, I would kind of expect to add support for callables in those libraries first.
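For reference, the precedent mentioned here: unittest takes an eagerly built msg string, and numpy.testing.assert_allclose similarly takes an err_msg string rather than a callable. A stdlib-only sketch:

```python
# Stdlib-only sketch of the precedent: unittest's assertions take an eager
# `msg` string. numpy.testing.assert_allclose likewise takes `err_msg`
# (a string, not a callable), so the message cost is paid up front.
import unittest

tc = unittest.TestCase()
try:
    tc.assertEqual(1, 2, msg="largest difference is 1")
except AssertionError as exc:
    failure = str(exc)

# unittest appends the custom string to its default diff output
assert "largest difference is 1" in failure
```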

shoyer avatar Jul 17 '21 22:07 shoyer