ENH: add crude assert_allclose; use value testing in vecdot
This should fail with array_api_strict until https://github.com/data-apis/array-api-strict/pull/98 lands.
(verified locally that it passes with numpy and fails with array-api-strict; both with --max-examples 10000)
The CI here passes, probably because it doesn't run enough examples by default to catch the error.
It would be a good idea to run this with a lot of examples. My worry is that we will see the same sorts of flaky failures we saw with https://github.com/data-apis/array-api-tests/issues/168 due to loss of significance, unless we filter out ill-conditioned inputs (as in https://github.com/data-apis/array-api-tests/pull/290).
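To make the loss-of-significance concern concrete, here is a minimal pure-Python sketch of filtering ill-conditioned dot-product inputs. The helper names (`naive_dot`, `dot_condition_number`, `is_well_conditioned`) and the `max_cond` threshold are hypothetical and chosen for illustration; this is not the filter used in the linked PR.

```python
import math


def naive_dot(x, y):
    """Left-to-right accumulation, the way a simple vecdot might compute it."""
    total = 0.0
    for a, b in zip(x, y):
        total += a * b
    return total


def dot_condition_number(x, y):
    """Hypothetical helper: sum(|x_i * y_i|) / |sum(x_i * y_i)|.

    Large values mean heavy cancellation, so a naive sum may lose
    most of its significant digits.
    """
    products = [a * b for a, b in zip(x, y)]
    exact = math.fsum(products)  # accurately rounded sum of the products
    magnitude = math.fsum(abs(p) for p in products)
    if exact == 0:
        return math.inf if magnitude > 0 else 1.0
    return magnitude / abs(exact)


def is_well_conditioned(x, y, max_cond=1e6):
    """Assumed threshold; a real test suite would need to tune this."""
    return dot_condition_number(x, y) <= max_cond


# Cancellation example: the exact answer is 1.0, but naive
# accumulation absorbs the 1.0 into 1e16 and returns 0.0.
x = [1e16, 1.0, -1e16]
y = [1.0, 1.0, 1.0]
print(naive_dot(x, y), is_well_conditioned(x, y))
```

In a hypothesis test, a predicate like this could be passed to `assume(...)` so that such inputs are discarded rather than reported as failures.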
Of course, we can always just merge this and revert it or fix it if issues crop up. The CI here doesn't run many examples, relatively speaking. The CI for array-api-strict and array-api-compat is where you usually find the errors, because they have such a large matrix that the rarer errors are very likely to show up (assuming the test in question isn't already being XFAILed for some reason).
An alternative strategy, in view of https://github.com/data-apis/array-api-tests/pull/314#discussion_r1859369692, could be to only test "exact" things in hypothesis (rename assert_equal to assert_equal_if_int or some such) and add floating-point value testing only for specific functions with fixed inputs, as regular regression tests without hypothesis.
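A rough sketch of what such a fixed-input regression test could look like. Everything here is an assumption for illustration: `assert_allclose` is a scalar stand-in rather than the helper added in this PR, `check_vecdot` and `VECDOT_CASES` are hypothetical names, and `naive_vecdot` only stands in for a real library's `vecdot` so the example runs without dependencies.

```python
import math


def assert_allclose(actual, expected, rtol=1e-7, atol=0.0):
    """Crude scalar closeness check (hypothetical stand-in)."""
    assert math.isclose(actual, expected, rel_tol=rtol, abs_tol=atol), (
        actual,
        expected,
    )


# Fixed, well-conditioned inputs with hand-computed expected values.
VECDOT_CASES = [
    ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], 32.0),
    ([0.5, 0.25], [2.0, 4.0], 2.0),
    ([0.0, 0.0], [1.0, -1.0], 0.0),
]


def check_vecdot(vecdot):
    """Run every fixed case against the vecdot implementation under test."""
    for x1, x2, expected in VECDOT_CASES:
        assert_allclose(vecdot(x1, x2), expected, atol=1e-12)


def naive_vecdot(x1, x2):
    """Pure-Python stand-in for a library vecdot on 1-D inputs."""
    total = 0.0
    for a, b in zip(x1, x2):
        total += a * b
    return total


check_vecdot(naive_vecdot)
```

Because the inputs are fixed and chosen to avoid cancellation, a failure here would point at the implementation rather than at an unlucky draw from the input strategy.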
Yes, it's a possibility. It's not something we've really done yet, but maybe we should think about it (see also https://github.com/data-apis/array-api-tests/issues/284). Non-hypothesis tests have their own drawback, which is that they only test specific things. The hypothesis tests have been pretty good at checking a wide range of cases that would be missed with traditional testing. I would strongly encourage keeping the majority of the test suite on hypothesis. Even something simple like a function with n keyword arguments has something like $2^n$ possible combinations of inputs, and in many cases these combinations do materially affect each other.
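For a sense of scale, the $2^n$ count can be sketched by enumerating which keyword arguments are passed and which are left at their defaults. The keyword names and values below are hypothetical and do not correspond to any particular array API function.

```python
from itertools import product


def keyword_combinations(options):
    """Yield every subset of keyword arguments: each one is passed or omitted."""
    names = list(options)
    for choice in product([False, True], repeat=len(names)):
        yield {name: options[name] for name, used in zip(names, choice) if used}


# Hypothetical keyword arguments, one candidate value each.
options = {"axis": 0, "keepdims": True, "dtype": "float64"}
combos = list(keyword_combinations(options))
print(len(combos) == 2 ** len(options))
```

With several candidate values per keyword the count grows even faster, which is why hand-written cases tend to cover only a small slice of it.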
And I do agree we should rename functions to be more descriptive of what they are actually doing.