Mark Grondona comments

Results: 555 comments by Mark Grondona

job-list: support "hostlist" constraint to allow jobs to be filtered by nodes

1024 might be a bit small on a cluster with 10K nodes. Something dynamic might work if job-list can get the maximum expected instance size. However, this doesn't prevent DoS...

job-list: support "hostlist" constraint to allow jobs to be filtered by nodes

> the limit is instance size (ie number of brokers) or 1024, whatever is bigger. the 1024 minimum is to give the constraint some decent minimum

This seems reasonable to...
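The rule in the quoted text (instance size or 1024, whichever is bigger) can be sketched in a few lines. This is only an illustration, not the job-list implementation: the function name `hostlist_constraint_limit` and the constant name `MIN_HOSTLIST_LIMIT` are hypothetical, and only the value 1024 comes from the comment.

```python
# Hypothetical sketch of the limit described in the quoted comment:
# the larger of the instance size (number of brokers) and a floor of 1024.
MIN_HOSTLIST_LIMIT = 1024  # assumed name; the value 1024 is from the comment


def hostlist_constraint_limit(instance_size: int) -> int:
    """Return the maximum number of hosts allowed in a hostlist constraint."""
    return max(instance_size, MIN_HOSTLIST_LIMIT)
```

Under this rule a 16-broker instance would still accept up to 1024 hosts, while a 10K-node instance would accept up to its full size.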

job-list: support "hostlist" constraint to allow jobs to be filtered by nodes

Just curious, how are you going to match on hosts with older jobs in a database? Maybe the optimization won't be needed in that case if the hosts are indexed...

Add examples of bootstrapping under sbatch

Suggestion: As part of the docs include a "sanity check" command that a new user can run to verify that Flux has discovered all expected resources (e.g. run `flux resource...

not ok - tbon.endpoint cannot be set

This has been reproducing lately and we now get the extra information about what signal terminated the broker: ``` expecting success: test_must_fail_or_be_terminated flux start ${ARGS} -s2 \ --setattr=tbon.endpoint=ipc:///tmp/customflux true flux-broker:...

view job constraints in `flux jobs`

Support will have to be added to `job-list` to capture job constraints if we want to have general access to this data, since only the job and instance owner can...

view job constraints in `flux jobs`

We'll have to figure out if there are any extra constraints since the queue constraints are applied to any existing constraints with "and" in the job frobnicator. Possibly it may...
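As a rough illustration of why the queue constraint is hard to separate from the user's own constraints afterwards, here is a sketch of an "and" merge. It assumes constraints are plain dictionaries with operator keys such as `and`, `properties`, and `hostlist`; the function name `apply_queue_constraint` is hypothetical, and this is not the frobnicator's actual code.

```python
# Hypothetical sketch: combine a job's existing constraint with a queue
# constraint using "and". The dictionary layout is an assumption.
def apply_queue_constraint(existing, queue_constraint):
    """Return a constraint requiring both the existing and queue constraints."""
    if not existing:
        # No user-supplied constraint: the queue constraint stands alone.
        return queue_constraint
    return {"and": [existing, queue_constraint]}


user = {"hostlist": ["node1"]}
queue = {"properties": ["batch"]}
combined = apply_queue_constraint(user, queue)
```

In this sketch, a viewer that wants to show only the user-supplied part would have to strip the queue term back out of the `and` list.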

WIP: job-manager: add support for housekeeping scripts with partial release of resources

That's a cool idea! While there would be some interesting benefits, I did consider some issues: Would we end up in the same place because jobs don't currently support partial...

WIP: job-manager: add support for housekeeping scripts with partial release of resources

As a simpler, though less interesting, alternative, we could add a new resource "state" like `allocated` or `free`. I'm not sure I like "housekeeping" but maybe call it `maint` or...

WIP: job-manager: add support for housekeeping scripts with partial release of resources

> Perhaps in the spirit of prototyping I could try to tack something on here as a proof of concept if the idea isn't too outlandish.

It does not seem...