distributed
distributed copied to clipboard
Should `utils_test.py` be considered public API?
We currently expose a number of functions and fixtures in distributed.utils_test. While I would consider these implementation details of our tests, and not part of any stable user-facing API, other projects, including dask/dask rely on them for their test suites (https://github.com/dask/dask/blob/ac74939697c328bd6456aa9fad7d9520328db6e5/dask/tests/test_distributed.py#L11-L14, #5300, #6775, https://github.com/dask/distributed/pull/6802#pullrequestreview-1052813812). This creates friction between the stability requirements of downstream users and our ability to adjust the contents of utils_test to fit our internal testing needs. For example, we cannot easily move functionality around without breaking things for downstream users (https://github.com/dask/distributed/pull/6802#pullrequestreview-1052813812).
This leads me to several questions:
Should the contents of utils_test.py be considered part of the public API?
If so:
- What guarantees are we willing to make around API stability?
- How do we want to test/enforce those?
- Which parts should be considered public API?
- For example, should we only expose context managers/functions, but not fixtures?
If not:
- How do we make this clear and discourage users from importing them?
- How do we deal with projects that already import them?
cc: @jrbourbeau, @graingert
Apart from the question whether we should consider the contents of distributed.utils_test public API, I'd argue that the module itself suggests that it is. Compare these two alternatives:
distributed.utils_testsounds like something we ship to help with tests.distributed.tests.utilssounds like test helpers for our own sake.
Thanks for raising this @hendrikmakait. While I definitely don't think all the contents of utils_test.py should be public, there are some methods like gen_cluster, and possibly the cluster context manager (maybe?), that are used in several other projects and I think historically have been viewed as public. For example, xarray, streamz, dask, dask-ml all use gen_cluster (I'm sure there are more than I'm not aware of). I've personally found gen_cluster to be very convenient when writing tests for projects that use Dask clusters
This creates friction between the stability requirements of downstream users and our ability to adjust the contents of utils_test to fit our internal testing needs
Do you have a sense for how much friction there is here? To be clear, I think being more intentional about what we consider public is a good thing. I'm just wondering how big of a problem this is today
I wouldn't consider them to be public to users necessarily, but certainly they're public to other dask libraries and other friends-of-dask like Xarray. From my perspective those libraries are within the bounds of Dask. If we change internals in utils_test then I would expect us to also reach out to and fix their use elsewhere, or at least collaborate with those libraries.
Do you have a sense for how much friction there is here?
Honestly, I can't really quantify this. What I can say is that I have stumbled across this twice this week and when talking to other developers they were either unhappy with the situation or uncertain about answering that question.
It sounds like we might profit from moving most of the functions and fixtures elsewhere and reimporting those that others rely on back into utils_test.py, and maybe make sure that we properly document those as well as adding a module-scope docstring describing who this is intended for? This might clear things up without too much effort.
@graingert also might have some interesting thoughts on all this.
Given that we rely on these things in a bunch of places I would argue that today this is effectively a public API and we should treat it as such (no breaks, use deprecations, etc). However there is probably way more in there than there should be, so maybe we should try and be more intentional about what we expose going forwards.
Other libraries I've used recently like kopf explicitly have a kopf.testing submodule where you can grab test utils to use in your own tests. Perhaps we should shift more towards that and be more explicit. Then put everything else somewhere like distributed._utils_test so that it is clearly private.
@jacobtomlinson: Thanks for your input and for linking kopf.testing. That's pretty much what I had in mind with my comment above.
+1 for a dedicated testing module with a limited public API and the deprecation of utils_test
My usecase for these testing utilities is kind of subtle,
I like to be able to make changes to the testing infrastructure to increase strictness I also like to be able to push that increase in strictness into friends-of-dask and see if I broke them also so I can see if there's any bits of distributed are being used that break that strictness but isn't getting tested
eg we now test that it's possible to run asyncio.run(something_in_distributed()); asyncio.run(something_in_distributed()); and that change is breaking friends-of-dask because code in distributed wasn't actually compatible with that pattern when used in a legitimate way I didn't anticipate
As an aside it's really painful for us to add new fixture dependencies to our 'exported' fixtures because they need to be re-imported completely in a way that appears to violate flake8. Which is why I'd like this new API to use contextmanagers in place of fixtures
Which is why I'd like this new API to use contextmanagers in place of fixtures
I also support this. I don't consider exporting fixtures a "sane" API. It's way too difficult to control imo.
I wouldn't consider them to be public to users necessarily, but certainly they're public to other dask libraries and other friends-of-dask like Xarray. From my perspective those libraries are within the bounds of Dask. If we change internals in utils_test then I would expect us to also reach out to and fix their use elsewhere, or at least collaborate with those libraries.
This reflects my view, I don't think we should attempt all that much stability in this API but we should still encourage people to use it for their CI.
If you use it in CI, when we make a change:
- it will probably cause your test suite to fail
- but any failure we cause should be actionable
- and we should come help you fix it
- and most importantly you should always be able to copy paste an old version to unblock any PRs and issues with higher priority
Agreed, @graingert. What I would like is some distinction between a smaller module where changes might break stuff for others and a larger module where things will only break distributed. This would reduce mental load and I can make a conscious decision about introducing breaking changes (and consciously breaking things is a very valid decision here). Right now, I feel like we follow the rule: "If everything is public, then nothing (really) is."
+1 on getting rid of fixtures in the public API.
We should also help friends-of-dask enable filterwarnings=error so they are notified about these deprecations