[FEA] Test estimators with hypothesis
Many of cuml's tests compare results between different implementations, e.g. the GPU and CPU implementations of the same estimator, or against third-party implementations, notably scikit-learn. We expect estimators to behave very similarly overall and their results to be identical up to numerical precision.
Furthermore, estimators are usually tested against only a specific combination of inputs and example datasets, an approach that likely misses rare edge cases and cannot provide confidence in equivalence across a wide range of inputs and datasets. Using hypothesis to test estimators and compare results therefore has two positive effects:
- The API surface is tested significantly more often against edge cases and extreme values.
- Any validation through the comparison of results is done on a more diverse set of input datasets.
A potential downside is an increase in test implementation complexity and test runtime. The former can be mitigated through a well-designed abstraction of hypothesis strategies, which may in fact reduce overall complexity; the latter can be mitigated by limiting the number of hypothesis examples and potentially running hypothesis tests only as part of the stress tests.
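To make this concrete, here is a minimal sketch of what such a test could look like. The `standard_datasets` strategy and the two trivial "implementations" are assumptions for illustration, standing in for a shared cuml strategy module and for a GPU estimator compared against a reference implementation; the `@settings` decorator shows how the example count can be bounded to control runtime.

```python
import numpy as np
from hypothesis import given, settings, strategies as st
from hypothesis.extra import numpy as st_np


# Hypothetical reusable strategy (not an existing cuml helper): small
# float64 matrices with finite values, the kind of abstraction that
# could be shared across estimator tests.
def standard_datasets(min_rows=1, max_rows=20, min_cols=1, max_cols=5):
    shapes = st.tuples(
        st.integers(min_rows, max_rows), st.integers(min_cols, max_cols)
    )
    return shapes.flatmap(
        lambda shape: st_np.arrays(
            dtype=np.float64,
            shape=shape,
            elements=st.floats(-1e6, 1e6, allow_nan=False, width=64),
        )
    )


# Two implementations of a column-wise mean, standing in for a GPU
# estimator and a reference (e.g. scikit-learn) implementation.
def mean_impl_a(X):
    return X.mean(axis=0)


def mean_impl_b(X):
    return X.sum(axis=0) / X.shape[0]


@settings(max_examples=50)  # bound runtime by limiting generated examples
@given(X=standard_datasets())
def test_implementations_agree(X):
    # Results should match up to numerical precision, not exactly.
    np.testing.assert_allclose(mean_impl_a(X), mean_impl_b(X), rtol=1e-7)
```

Calling `test_implementations_agree()` runs the comparison against 50 generated datasets, including edge cases such as single-row and single-column inputs that a fixed example dataset would not cover.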
I suggest the following breakdown for implementation:
https://github.com/rapidsai/cuml/issues/4960#tasklist-block-f61b99f0-0662-466c-88bc-cc530ece2119