Mark Grondona
1024 might be a bit small on a cluster with 10K nodes. Something dynamic might work if job-list can get the maximum expected instance size. However, this doesn't prevent DoS...
> the limit is instance size (i.e. number of brokers) or 1024, whichever is bigger. the 1024 minimum is to give the constraint some decent minimum

This seems reasonable to...
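The rule quoted above can be written as a one-line calculation. This is only a sketch of the arithmetic: the function name `constraint_limit` and its parameters are hypothetical and are not taken from the `job-list` source.

```python
def constraint_limit(instance_size: int, minimum: int = 1024) -> int:
    """Return the larger of the instance size (number of brokers) and a floor.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return max(instance_size, minimum)


# A small instance falls back to the 1024 floor, while a 10K-node
# cluster gets a limit equal to its own size.
small = constraint_limit(16)
large = constraint_limit(10_000)
```

With this rule the limit scales with the instance, so a 10K-node cluster is not capped at 1024.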
Just curious, how are you going to match on hosts with older jobs in a database? Maybe the optimization won't be needed in that case if the hosts are indexed...
Suggestion: As part of the docs include a "sanity check" command that a new user can run to verify that Flux has discovered all expected resources (e.g. run `flux resource...
This has been reproducing lately and we now get the extra information about what signal terminated the broker: ``` expecting success: test_must_fail_or_be_terminated flux start ${ARGS} -s2 \ --setattr=tbon.endpoint=ipc:///tmp/customflux true flux-broker:...
Support will have to be added to `job-list` to capture job constraints if we want to have general access to this data, since only the job and instance owner can...
We'll have to figure out if there are any extra constraints since the queue constraints are applied to any existing constraints with "and" in the job frobnicator. Possibly it may...
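To illustrate the problem, here is a minimal sketch of combining a queue constraint with a user constraint using "and", then recovering the extra constraints afterwards. It assumes constraints are JSON-style objects of the form `{"operator": [operands]}`. The helper names, the operand order, and the choice to skip the "and" wrapper when there is no existing constraint are all assumptions, not the job frobnicator's actual behavior.

```python
def apply_queue_constraint(existing, queue_constraint):
    """Combine a queue constraint with any existing constraint using "and".

    Hypothetical helper: assumes no "and" wrapper is added when the job
    has no constraint of its own.
    """
    if not existing:
        return queue_constraint
    return {"and": [existing, queue_constraint]}


def extra_constraints(combined, queue_constraint):
    """Return whatever remains once the queue constraint is removed.

    Returns None if the job carries only the queue constraint.
    """
    if combined == queue_constraint:
        return None
    operands = combined.get("and") if isinstance(combined, dict) else None
    if isinstance(operands, list) and queue_constraint in operands:
        rest = [c for c in operands if c != queue_constraint]
        if not rest:
            return None
        return rest[0] if len(rest) == 1 else {"and": rest}
    # Queue constraint not found at the top level: treat it all as extra.
    return combined


queue = {"properties": ["debug"]}        # example queue constraint
user = {"hostlist": ["node[1-4]"]}       # example user constraint
combined = apply_queue_constraint(user, queue)
```

Here `extra_constraints(combined, queue)` gives back the user constraint, and a job submitted with only the queue constraint yields `None`.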
That's a cool idea! While there would be some interesting benefits, I did consider some issues: Would we end up in the same place because jobs don't currently support partial...
As a simpler, though less interesting, alternative, we could add a new resource "state" like `allocated` or `free`. I'm not sure I like "housekeeping" but maybe call it `maint` or...
> Perhaps in the spirit of prototyping I could try to tack something on here as a proof of concept if the idea isn't too outlandish.

It does not seem...