jwst icon indicating copy to clipboard operation
jwst copied to clipboard

use GitHub Actions to cache environments and use dependent CI workflow structure

Open zacharyburnett opened this issue 2 years ago • 4 comments

image

Notable changes:

  • workflow jobs are defined in GitHub Actions to simplify the environment caching and construction
  • all environments are cached to speed up subsequent runs; cache key is set to hash the build files so that if a build requirement changes the environment is built from scratch
  • matrices are dependent in a workflow such that a commit that fails style checks will not run tests, and so on
  • matrices test a more comprehensive set of platforms and Python versions
  • newest CRDS operational context is cached to speed up testing

Checklist

  • [x] Tests
  • [ ] Documentation
  • [x] Change log
  • [ ] Milestone
  • [x] Label(s)

zacharyburnett avatar May 31 '22 16:05 zacharyburnett

waiting on #6871

zacharyburnett avatar Jun 07 '22 14:06 zacharyburnett

Resolves JP-2157

How does this fix JP-2157? Seems like this is strictly related to CI tests for PRs.

hbushouse avatar Aug 18 '22 13:08 hbushouse

Resolves JP-2157

How does this fix JP-2157? Seems like this is strictly related to CI tests for PRs.

Oh whoops, that was accidental. I'll remove that

zacharyburnett avatar Aug 18 '22 13:08 zacharyburnett

This looks nice. Should really speed up the env setup.

One hiccup I recall in setting up CRDS caching to actually work between builds where the default context hadn't change was to make sure that the CRDS cache action is only happening when pytest is actually running. If CRDS caching happens during code style checks or bandit, those don't actually pull any of the reffiles that the tests need, and because those jobs finish first, they produce the cache before the pytest runs do, leading to successful caching of precisely zero CRDS files. Hopefully you can avoid that here.

jdavies-st avatar Aug 22 '22 09:08 jdavies-st

One hiccup I recall in setting up CRDS caching to actually work between builds where the default context hadn't change was to make sure that the CRDS cache action is only happening when pytest is actually running. If CRDS caching happens during code style checks or bandit, those don't actually pull any of the reffiles that the tests need, and because those jobs finish first, they produce the cache before the pytest runs do, leading to successful caching of precisely zero CRDS files. Hopefully you can avoid that here.

@jdavies-st Does this apply to running crds sync on the latest context, like the following?

crds sync --contexts jwst_1009.pmap

It appears to be caching 438591 bytes at the moment

zacharyburnett avatar Oct 27 '22 17:10 zacharyburnett

I've also added a workflow_dispatch: that allows the unit testing workflow to be run from the web client using a specific CRDS context

zacharyburnett avatar Oct 27 '22 17:10 zacharyburnett

You definitely don't want to use crds sync, because it will pull all items for that context. And that's a lot of data. 100s of GB.

The way the CRDS caching should work is that workflow A runs the unit tests, pulling in all the CRDS reffiles it needs. At the end of the workflow, the state of the crds_cache directory should be cached, and that should be used for the next workflow.

The trick is to only github actions cache the workflows that actually run pytest on the whole suite of tests, otherwise you will cache nothing, and you will be restoring empty caches. And make sure the cache key is something that changes occasionally but not too often, or it defeats the purpose of the caching. You could have the key be the CRDS context, but I don't think that's a very good proxy for which CRDS reffiles change for the jwst unit tests. I.e. the unit test reffiles are pretty stable, whereas CRDS context can change almost every day. And even if the github actions cache is a bit out-of-date with the current state of the CRDS cache, that's not a problem as it will pull the new files over. You could just hash the reference files in that dir (not the pmaps or rmaps) for a key.

jdavies-st avatar Oct 28 '22 08:10 jdavies-st

The Github cache action was removed here

https://github.com/spacetelescope/jwst/pull/6165

It should be turned on again, making sure it only runs on jobs that run pytest, and it gets restored based on a hash of $HOME/crds_cache/references. 👍

jdavies-st avatar Oct 28 '22 09:10 jdavies-st

closing in favor of successor #7323

zacharyburnett avatar Oct 28 '22 14:10 zacharyburnett