Integration test suite before release

Open mrocklin opened this issue 3 years ago • 12 comments

We currently run a test suite on every commit. These tests are designed to be focused and fast.

However, when we release, we may want to consider running some larger workflows that test holistic behavior. This might include, for example, reading a 10,000-partition Parquet dataset from S3, something that is important, but not something that we want to put into our per-commit test suite. This might also be a good place to include workflows from downstream projects like RAPIDS and Xarray.
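
For illustration only, a release-scale check along those lines might look roughly like the sketch below; the bucket path, partition count, and cluster setup are all placeholders rather than a concrete proposal.

```python
# Sketch of a release-scale integration test (illustrative only): read a
# large, many-partition Parquet dataset from S3 and force a small amount of
# real work. The S3 path below is a placeholder.
import dask.dataframe as dd
from distributed import Client


def test_read_large_parquet_from_s3():
    with Client():  # or point at a larger cluster for release runs
        df = dd.read_parquet(
            "s3://example-bucket/large-dataset/",  # placeholder path
            storage_options={"anon": True},
        )
        assert df.npartitions >= 10_000
        # A cheap computation that still touches metadata and real data
        assert len(df.head()) > 0
```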

This would be something that the release manager would be in charge of kicking off.

Some things that we would need to do:

  • Figure out if we want to add this process
  • Build a CI system
  • Add tests

cc @jacobtomlinson @quasiben @jrbourbeau

mrocklin avatar Jun 07 '21 19:06 mrocklin

cc @rabernat, who would like to see some similar workflows at large scales in the context of pangeo-forge.

martindurant avatar Jun 07 '21 19:06 martindurant

xref https://github.com/pangeo-data/pangeo-integration-tests/issues/1

rabernat avatar Jun 07 '21 20:06 rabernat

It might make sense to also consider other downstream projects like spatialpandas and dask-geopandas to help catch issues like https://github.com/holoviz/spatialpandas/issues/68 and https://github.com/geopandas/dask-geopandas/issues/49.

BTW, I really appreciate seeing the assistance those projects received to help address those issues, really cool to see that kind of community support. Big thanks to everybody contributing to and supporting this awesome ecosystem.

brl0 avatar Jun 07 '21 20:06 brl0

It'd be interesting to think about what it means to pass a test suite like that. For instance, is performance a part of it? It would be very interesting to publish benchmarks with each release. It seems less common that a release actually breaks the read-10,000-parquet-files case, and more common that it introduces a performance regression.
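
To make that concrete, at its simplest a published benchmark could just be a timed run whose numbers are recorded per release rather than asserted against a hard threshold. A minimal sketch, with a purely synthetic workload:

```python
# Rough sketch of a timed benchmark (illustrative): the elapsed time would be
# recorded and published per release rather than checked against a fixed cutoff.
import time

import dask.datasets


def benchmark_groupby_mean():
    # Synthetic partitioned timeseries dataframe (columns: id, name, x, y)
    df = dask.datasets.timeseries(start="2000-01-01", end="2000-03-31")
    start = time.perf_counter()
    df.groupby("name").x.mean().compute()
    elapsed = time.perf_counter() - start
    print(f"groupby-mean: {elapsed:.2f}s")
    return elapsed


if __name__ == "__main__":
    benchmark_groupby_mean()
```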

jsignell avatar Jun 07 '21 20:06 jsignell

Would also suggest adding some Dask projects to this list, like Dask-ML, Dask-Image, etc. At least with Dask-ML we have seen a couple of breakages recently that probably could have been avoided with integration testing.

jakirkham avatar Nov 09 '21 00:11 jakirkham

Might even just be worthwhile to do runs of these tests every 24hrs or so. This can help identify issues a bit sooner than a release, giving people more time to fix and update.

Numba did some work in this space that we might be able to borrow from: texasbbq

Also, having nightlies (https://github.com/dask/community/issues/76) would help smooth out the integration testing process and aid in local reproducibility.

jakirkham avatar Nov 09 '21 00:11 jakirkham

It looks like @jrbourbeau started getting dask set up with texasbbq a few years back :) https://github.com/jrbourbeau/dask-integration-testing

jsignell avatar Nov 09 '21 14:11 jsignell

Might even just be worthwhile to do runs of these tests every 24hrs or so.

A while ago, I asked a bunch of projects downstream of xarray to run their test suites regularly against xarray HEAD. It has really helped catch issues before release.

Perhaps a bunch of downstream projects can do the same with dask HEAD. Here's the current xarray workflow: https://github.com/pydata/xarray/blob/main/.github/workflows/upstream-dev-ci.yaml It's really nice! It even opens an issue when tests fail with a nice summary.
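
A small complementary thing downstream projects can do for those nightly runs is report the installed dask/distributed versions in their pytest output, so a failure against HEAD is easy to attribute. A sketch using pytest's standard report-header hook:

```python
# conftest.py sketch: print dask/distributed versions at the top of the pytest
# output, which makes nightly upstream-dev failures easier to attribute.
import dask

try:
    import distributed
except ImportError:  # distributed may not be installed in every environment
    distributed = None


def pytest_report_header(config):
    lines = [f"dask: {dask.__version__}"]
    if distributed is not None:
        lines.append(f"distributed: {distributed.__version__}")
    return lines
```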

dcherian avatar Nov 09 '21 16:11 dcherian

This raises another good point. Maybe it is worth just adding some jobs to the dask-* projects to test against the latest Dask + Distributed. IDK to what extent these exist now (so feel free to say where this is needed/done). We could then add a cron job as well (especially for some of the more stable dask-* projects) to run overnight with the latest changes. IDK if there is a GH Action to raise cron job failures in an issue, but that might be a good way to raise visibility about anything that breaks overnight.

jakirkham avatar Nov 11 '21 20:11 jakirkham

IDK if there is a GH Action to raise cron job failures in an issue, but that might be a good way to raise visibility about anything that breaks overnight

Yeah there are definitely ways to raise issues from GitHub Actions. I wonder where a good place to open the issue would be? For projects like dask-kubernetes it might be distributed and for projects like dask-sql it might be dask?

jacobtomlinson avatar Nov 12 '21 10:11 jacobtomlinson

Even if they are raised on the projects themselves, that could also be useful. Basically just thinking of how we make CI failures more visible. Red Xs can easily be missed.

jakirkham avatar Nov 12 '21 10:11 jakirkham

We have already copied the xarray upstream infrastructure on dask/dask. There is an upstream action that runs every night and raises an issue with any failures. Here's the YAML for that: https://github.com/dask/dask/blob/main/.github/workflows/upstream.yml

jsignell avatar Nov 15 '21 17:11 jsignell