Only send a limited number of jobs at a time to the cluster to avoid stressing it too much

Open kit-ty-kate opened this issue 5 years ago • 6 comments

@talex5 would this help?

kit-ty-kate avatar Nov 03 '20 12:11 kit-ty-kate

Yes, but we do want it to use the cluster fully when the cluster isn't busy. Having 500 jobs at once is fine, as long as nothing else needs to use the cluster at the same time.

Another option might be to mark all revdeps jobs as non-urgent. But really, we need more priorities, e.g.

  1. ocaml-ci or opam-repo-ci main job (new commit pushed)
  2. opam-repo-ci revdeps
  3. ocaml-ci or opam-repo-ci update (rebuild due to an opam-repository merge, etc)
  4. health-check build
  5. base builder image update

talex5 avatar Nov 03 '20 12:11 talex5
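
To make the proposed ordering concrete, here is a minimal sketch of a queue that serves jobs by the five priority levels listed above, oldest first within a level. It is an illustration only, not how ocluster or opam-repo-ci schedule jobs: `Priority`, `JobQueue`, and the job names are all hypothetical.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    """Hypothetical levels mirroring the list above (lower value runs first)."""
    MAIN_JOB = 1      # ocaml-ci or opam-repo-ci main job (new commit pushed)
    REVDEPS = 2       # opam-repo-ci revdeps
    UPDATE = 3        # rebuild due to an opam-repository merge, etc.
    HEALTH_CHECK = 4  # health-check build
    BASE_IMAGE = 5    # base builder image update


class JobQueue:
    """Priority queue of jobs; FIFO among jobs with the same priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker preserving submission order

    def submit(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def pop(self):
        _priority, _seq, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)


queue = JobQueue()
queue.submit(Priority.BASE_IMAGE, "rebuild base image")
queue.submit(Priority.REVDEPS, "revdeps: foo.1.0")
queue.submit(Priority.MAIN_JOB, "opam-repo-ci: PR main job")
queue.submit(Priority.REVDEPS, "revdeps: bar.2.0")

# Drain the queue: the main job comes out first, the image rebuild last.
order = [queue.pop() for _ in range(len(queue))]
```

With more than two levels, a burst of revdeps jobs would wait behind new-commit builds but still run ahead of health checks and image rebuilds, which a single urgent/non-urgent flag cannot express.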

Agreed. I opened https://github.com/ocurrent/opam-repo-ci/pull/25 in the meantime to set all revdeps jobs as not urgent.

kit-ty-kate avatar Nov 03 '20 13:11 kit-ty-kate

Now that we have https://github.com/ocurrent/ocluster/pull/88, this PR shouldn't be needed.

talex5 avatar Nov 16 '20 16:11 talex5

It looks like this proposal may be necessary after all, to avoid using more RAM than the host server can handle (e.g. when sending > 30_000 jobs at once).

kit-ty-kate avatar Dec 17 '21 17:12 kit-ty-kate
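
As a rough illustration of the proposal, the sketch below keeps at most `limit` jobs in flight and pulls the remaining jobs lazily from an iterator, so the full set never has to be submitted at once. It is written against Python's `asyncio` purely for illustration and is not the opam-repo-ci implementation; `run_limited`, `fake_cluster_job`, and the limit of 50 are assumptions.

```python
import asyncio


async def run_limited(jobs, limit, run_job):
    """Run every job from the iterable `jobs`, at most `limit` at a time.

    Returns the results (in completion order) and the peak number in flight.
    """
    pending = iter(jobs)  # consumed lazily: unstarted jobs are never all submitted
    results = []
    in_flight = 0
    peak = 0

    async def worker():
        nonlocal in_flight, peak
        for job in pending:  # each worker claims the next unstarted job
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                results.append(await run_job(job))
            finally:
                in_flight -= 1

    # A fixed pool of `limit` workers bounds the number of concurrent jobs.
    await asyncio.gather(*(worker() for _ in range(limit)))
    return results, peak


async def fake_cluster_job(job_id):
    # Hypothetical stand-in for submitting a build to the cluster and waiting.
    await asyncio.sleep(0)
    return job_id


results, peak = asyncio.run(run_limited(range(1000), 50, fake_cluster_job))
```

A fixed limit like this does not address the earlier concern about using the cluster fully when it is idle; the limit would have to be large, or adjusted dynamically, for that.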

Worth a try. Though I'm not sure where all the memory is actually going. opam-repo-ci has memtrace support compiled in, so it might be an idea to turn that on and find out.

talex5 avatar Dec 17 '21 21:12 talex5

For some reason, this makes opam-repo-ci stop doing any work. /jobs shows all the jobs as (ready to start), but none are actually starting. Maybe something to do with the way ocurrent is handling caches?

EDIT: never mind, it was https://github.com/ocurrent/ocurrent-deployer/issues/92

kit-ty-kate avatar Dec 18 '21 16:12 kit-ty-kate