flux-core icon indicating copy to clipboard operation
flux-core copied to clipboard

WIP: add `hostlist` and `rank` job constraint operators, extend `flux-mini --requires=` to allow use of operators

Open grondo opened this issue 3 years ago • 1 comments

This PR is a demonstration-of-concept for the hostlist and rank operators for RFC 31 job constraints. These operators allow inclusion/exclusion of resources by hostname and/or rank, in combination with existing property constraints. Note that this won't work with a queues configuration until flux-framework/flux-sched#942 is solved, because the job-frobnicator plugin currently only supports property based constraints for Fluxion's sake.

However, this WIP PR may serve as a useful implementation reference for flux-framework/flux-sched#942 and flux-framework/flux-sched#965, so I'm putting it up for early examination.

Since there is currently no way to easily create generic RFC 31 constraint objects, I've also included a quick and dirty flux.job.Constraint class which can "parse" a boolean query like syntax into a job constraint object, and then used this class to process the argument to the flux mini --requires option.

I threw something together quickly, so while it works for the most part, we should probably eventually replace it with something that uses a real parser so that more error conditions are caught. For now, this is meant to be a test vehicle.

The syntax is somewhat straightforward:

  • "and" and "or" binary operators
  • "not" unary operator
  • there are no real precedence rules, so parentheses are supported to enforce
  • words joined by a single colon specify an operator and its argument, e.g. operator:arg
  • if there is no colon, then properties is assumed, e.g. foo = properties:foo. This allows existing syntax to continue to work, e.g. --requires=foo,bar
  • if operator is preceded with a -, then this is equivalent to not operator:arg
  • host and hostsare shorthand for hostlist and rank can be used in place of ranks

e.g.

  • --requires=rank:0-1
  • --requires="foo or rank:0-1"
  • `--requires="-host:badhost"
  • `--requires="foo or (not rank:7)"

etc.

If this all seems acceptable, then probably RFC 20 should be updated to add : to the list of invalid characters in property names. Then we can update to RFC 31 to require the list of operators supported here.

grondo avatar Oct 12 '22 19:10 grondo

Works for me!

garlick avatar Oct 13 '22 16:10 garlick

This looks pretty good to me too. How do we interpret mixed commas and operators though, if we already allow foo,bar, do we need to allow foo,host:corona296 and not_on_fire? I ask mainly so we can figure out what the placement and priority would be on that, maybe we could make , act like and, or deprecate the comma so it's consistent?

trws avatar Oct 20 '22 00:10 trws

Great points, I hadn't thought of that.

What I was thinking is that tokens are simply separated by whitespace, and, or, and not are keywords, otherwise you have an [operator:]args where if operator is left off then this implies the properties operator. Then properties:foo,bar,baz turns into {"properties": ["foo", "bar", "baz"]} in the RFC 31 constraint object. This allows host:fluke[1,3,5], rank:0-5,10, etc.

Thus foo,host:corona296 would be treated as a token and fail compilation because there is no supported operator foo,host?

This seems simplest to me, though I'm not sure if it is consistent as you say.

grondo avatar Oct 20 '22 00:10 grondo

I like that, I think it's just as consistent as the other option, it makes , just part of the token which seems fine. Would we also want to support the square brackets? I have a feeling we'd want to support quoting a token for some things too, though that could start to get gnarly, maybe just not supporting spaces for now is reasonable. I think that would leave us with roughly:

ident=/(\w=,]\[)+ ?/
operator="or "|"and "|"not "|"^"
prop= /\w+:/ ident
token=operator | prop
expr=token+

trws avatar Oct 20 '22 01:10 trws

Nice summary, I was currently using - as the shorthand to negate an entire operator expression. For good or bad ^ is currently used to negate a single property --requires=foo,^bar :shrug:

grondo avatar Oct 20 '22 01:10 grondo

Ah, I mixed up the RFC and the syntax, here's an update. I'm actually thinking it makes sense to make that property-only, otherwise it reads strangely to me. If it were ! then I would assume it would negate an entire expression, but - interprets as negating a single term if I read it without thinking about it.

ident=/[\w-](\w-=]\[)*/
idents=ident ("," ident)*
operator="or"|"and"|"not"
prop="-"? ident 
props=prop ("," prop)*
properties="properties:"? props
term= /\w+:/ idents
token=operator | properties | term
expr=token (" " token)*

We wouldn't actually need the properties split out in the grammar since they'd be valid terms with the leading "-", but we should decide if we want to allow embedded dashes or not. This has the advantage it would be really easy to make a small parser for, especially with a little PEG lib or something which would be good for lots of other things.

trws avatar Oct 20 '22 15:10 trws

I was hoping to make exclusion of a set of hosts or ranks simple (akin to gmail or google search syntax) with e.g. --requires=hosts:fluke[1,3-5] to constrain to a hostlist or --requires=-hosts:fluke[1,3-5] to exclude the same hosts, rather than requiring the user to type --requires="not hosts:fluke[1,3-5]". However, if the former is confusing then the extra required quoting and 3 characters is not a dealbreaker for me.

grondo avatar Oct 20 '22 15:10 grondo

We could always make it support --requires="hosts:-fluke[1,3-5]".
Some shells will need the quoting to tolerate the brackets due to glob syntax, then we can support multiple values passed to each term either negated or not. That makes it so we don't have to deal with things like --requires="on_fire and -(hosts:...[1,5] or has_gpu)", which is where the negation really gets me. I could probably be convinced, but allowing it on the whole thing rather than the components feels odd.

trws avatar Oct 20 '22 15:10 trws

You make a good point. Let's keep it simple for now. I like your suggestion of negating the hostlist not the entire term.

grondo avatar Oct 20 '22 15:10 grondo

@trws, thinking about this a little more:

hosts:fluke[1,3-5] turns into {"hostlist": [ "fluke[1,3-5]" ]}, so is it a bit weird that hosts:-fluke[1,3-5] compiles to {"not": [{"hostlist": [ "fluke[1,3-5]" ]}]}?

When I started looking at changing the current implementation it just made more sense to have -operator wrap the operator and its args in not, rather than having to parse the argument for a leading - and then wrap the operator in not.

In fact, this could get confusing in some situations where you have ranks:-0,5 does this mean ranks:5 and not rank:0 or not rank:0 and not rank:5? Whereas with -ranks:0,5 this clearly means the latter, and if you want the former, you can do either -rank:0 and rank:5 or the expanded statement above.

--requires="on_fire and -(hosts:...[1,5] or has_gpu)", which is where the negation really gets me

What if we just don't support negation in front of left parens? I.e. the - can only be used in an operator name when being used as operator:args to indicate the negation of that operator, and all we do internally is act as if -operator:args is not operator:args?

Does that make any sense, or am I off in the weeds again?

grondo avatar Oct 20 '22 22:10 grondo

Updated to use the constraint parser from #4871 in flux mini --requires. Also took a stab at describing the enhanced supported constraint syntax in flux-mini(1).

This is now based on top of #4871.

grondo avatar Jan 29 '23 02:01 grondo

Codecov Report

Merging #4677 (68fa449) into master (1ff34c7) will increase coverage by 0.02%. The diff coverage is 92.15%.

@@            Coverage Diff             @@
##           master    #4677      +/-   ##
==========================================
+ Coverage   82.86%   82.89%   +0.02%     
==========================================
  Files         425      426       +1     
  Lines       74814    75143     +329     
==========================================
+ Hits        61996    62290     +294     
- Misses      12818    12853      +35     
Impacted Files Coverage Δ
src/common/librlist/match.c 90.00% <89.50%> (-8.86%) :arrow_down:
src/bindings/python/flux/constraint/parser.py 93.88% <93.88%> (ø)
...python/flux/job/frobnicator/plugins/constraints.py 97.50% <100.00%> (+0.20%) :arrow_up:
src/cmd/flux-mini.py 94.32% <100.00%> (+0.05%) :arrow_up:
src/common/librlist/rlist.c 84.79% <100.00%> (+0.04%) :arrow_up:
src/common/libsubprocess/fork.c 75.38% <0.00%> (-1.54%) :arrow_down:
src/broker/module.c 75.05% <0.00%> (-0.42%) :arrow_down:
src/broker/overlay.c 84.15% <0.00%> (-0.40%) :arrow_down:
src/bindings/python/flux/job/Jobspec.py 90.08% <0.00%> (+0.21%) :arrow_up:
... and 2 more

codecov[bot] avatar Jan 29 '23 20:01 codecov[bot]

Closing this PR because the original description is a bit outdated, and I think it may be good to split this into two separate PRs:

  • support hostlist and rank operators in librlist/sched-simple
  • add rfc35 parser to flux mini --requires

grondo avatar Jan 30 '23 00:01 grondo