[CI] Create and verify building env once during nightly.

Open Devjiu opened this issue 1 year ago • 7 comments

Currently each PR check job tries to do two things:

  1. create & verify the env,
  2. verify the code from the PR.

The cache in GitHub is updated after a successful job: build.yml calls conda update and then, if the build succeeds, it updates the cached binaries. So the env currently changes on every successful job on every PR. (conda update is also called in modin.yml and pytest.yml.)

A more generic and faster way is to create the env during the nightly build, verify it, and then use it in all jobs.

This would speed up PR jobs (the update takes ~4 mins) and simplify most of the jobs.

Devjiu avatar Apr 06 '23 12:04 Devjiu

Also related to #351. The solution should reduce code duplication in the caching logic.

Devjiu avatar Apr 06 '23 13:04 Devjiu

The last successful scheduled build can be retrieved with something like https://api.github.com/repos/intel-ai/hdk/actions/workflows/main.yml/runs?per_page=1&event=schedule&status=success. I can't verify this at the moment because of an "API rate limit exceeded for 123.111.00.00." error.

Devjiu avatar Apr 06 '23 13:04 Devjiu
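
The lookup in the comment above can be sketched in Python. This is a minimal sketch, not the project's tooling: the function names are hypothetical, and the optional `token` parameter is an assumption, added because authenticated GitHub API requests get a higher rate limit than anonymous ones. It builds the same URL as in the comment and reads the run id from the `workflow_runs` array of the response.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

API_ROOT = "https://api.github.com"


def last_scheduled_run_url(owner: str, repo: str, workflow: str) -> str:
    """Build the URL that lists the latest successful scheduled run."""
    query = urllib.parse.urlencode(
        {"per_page": 1, "event": "schedule", "status": "success"}
    )
    return f"{API_ROOT}/repos/{owner}/{repo}/actions/workflows/{workflow}/runs?{query}"


def extract_run_id(payload: dict) -> Optional[int]:
    """Return the id of the first run in a 'list workflow runs' response."""
    runs = payload.get("workflow_runs", [])
    return runs[0]["id"] if runs else None


def fetch_last_run_id(url: str, token: Optional[str] = None) -> Optional[int]:
    """Query the API. `token` is a hypothetical, optional access token."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return extract_run_id(json.load(response))
```

Calling `fetch_last_run_id(last_scheduled_run_url("intel-ai", "hdk", "main.yml"))` would return the id of the newest matching run, or `None` if there is none.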

We would need to have some sort of key so the build could be automatically updated when we change the env. Perhaps we could hash the conda env file, similar to how we cache Docker builds?

alexbaden avatar Apr 06 '23 15:04 alexbaden

> We would need to have some sort of key so the build could be automatically updated when we change the env. Perhaps we could hash the conda env file, similar to how we cache Docker builds?

Agreed: save the last successful nightly job id plus the env file hash. There is also the question of whether a nightly build should be run to change the env, or whether the env should be updated during the regular PR build.

Devjiu avatar Apr 06 '23 15:04 Devjiu
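
A key built from the env file hash plus the nightly job id, as proposed above, might look like the following. This is only a sketch: the `conda-env-` prefix, the key layout, and the function name are hypothetical and not taken from the HDK workflows.

```python
import hashlib


def env_cache_key(env_file_bytes: bytes, nightly_run_id: int) -> str:
    """Combine a digest of the conda env descriptor with the nightly run id.

    The key changes whenever the descriptor changes or a newer nightly
    run has produced a verified env.
    """
    digest = hashlib.sha256(env_file_bytes).hexdigest()[:16]
    return f"conda-env-{digest}-run{nightly_run_id}"
```

Hashing the raw bytes of the descriptor means any edit to the file, including a comment change, yields a new key. That is the conservative choice for a cache that must never serve a stale env.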

There is a difference between a Docker image and a conda env: the latter should be updated regularly. The hash is not enough because some of our packages are not pinned.

Currently the env is updated once per day unless someone changes the yml file.

If we move the update to the nightly build, what should we do when someone updates the yml descriptor? (I've made an implementation for a Docker cache, and the same consideration makes me avoid using nightly builds: when someone changes the file, it should be rebuilt during the run.) This differs from the idea we discussed with Dmitry; I realized the problem with updating descriptors later.

leshikus avatar Apr 07 '23 22:04 leshikus
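
The two concerns in the comment above (unpinned packages need a regular refresh, and a changed descriptor needs a rebuild in the same run) can be combined into one decision. The sketch below is an illustration under assumptions: the function name and the three returned labels are hypothetical, and the daily refresh is shown as a nightly task only because that is the proposal under discussion.

```python
import hashlib
from datetime import date
from typing import Optional


def env_action(
    cached_hash: Optional[str],
    cached_on: Optional[date],
    env_file_bytes: bytes,
    today: date,
) -> str:
    """Decide what to do with the cached conda env.

    cached_hash / cached_on describe the env currently in the cache
    (None when nothing is cached yet).
    """
    current_hash = hashlib.sha256(env_file_bytes).hexdigest()
    if cached_hash != current_hash:
        # The descriptor changed (or no cache exists): rebuild right away.
        return "rebuild-in-this-run"
    if cached_on is None or cached_on < today:
        # Same descriptor, but unpinned packages may have moved on.
        return "refresh-in-nightly"
    return "reuse"
```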

If we really need a source check to run ASAP, then how can https://github.com/intel-ai/hdk/pull/345 be implemented correctly?

It should not create the cache itself, because the cache needs to be validated by the build. Here is what comes to my mind:

Check steps.conda-cache.cache-hit:

  1. if the cache is hit, do not run conda update, just run the source check; this path will be taken most of the time

  2. if the cache is not hit, delay the check to happen after the build (like it works now) when it gets a new conda env cache in place

leshikus avatar Apr 07 '23 22:04 leshikus
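
The two branches described above can be written down as a small planning function. This is a sketch, not the actual workflow: the step names are hypothetical labels, and the function assumes the `cache-hit` output arrives as the string `"true"` on a hit, treating any other value as a miss.

```python
from typing import List


def plan_pr_steps(cache_hit: str) -> List[str]:
    """Order the PR job steps based on steps.conda-cache.cache-hit."""
    if cache_hit == "true":
        # Common path: the env is already validated, so the source
        # check can run immediately and conda update is skipped.
        return ["source-check", "build"]
    # Cache miss: the new env must be validated by the build before the
    # cache is saved, so the source check is delayed until after it.
    return ["conda-update", "build", "save-env-cache", "source-check"]
```

On the miss path the source check comes last on purpose: it only runs once a freshly validated env cache is in place.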

> If we really need a source check to run ASAP, then how can #345 be implemented correctly?
>
> It should not create the cache itself, because the cache needs to be validated by the build. Here is what comes to my mind:
>
> Check steps.conda-cache.cache-hit:
>
>   1. if the cache is hit, do not run conda update, just run the source check; this path will be taken most of the time
>   2. if the cache is not hit, delay the check to happen after the build (like it works now) when it gets a new conda env cache in place

Agreed. The style check logic will be complicated in this case, so let's solve the env problem first; the behavior of the style check job can be discussed separately.

Devjiu avatar Apr 11 '23 11:04 Devjiu