flux-core
flux-core copied to clipboard
WIP: add `hostlist` and `rank` job constraint operators, extend `flux-mini --requires=` to allow use of operators
This PR is a demonstration-of-concept for the hostlist and rank operators for RFC 31 job constraints. These operators allow inclusion/exclusion of resources by hostname and/or rank, in combination with existing property constraints. Note that this won't work with a queues configuration until flux-framework/flux-sched#942 is solved, because the job-frobnicator plugin currently only supports property based constraints for Fluxion's sake.
However, this WIP PR may serve as a useful implementation reference for flux-framework/flux-sched#942 and flux-framework/flux-sched#965, so I'm putting it up for early examination.
Since there is currently no way to easily create generic RFC 31 constraint objects, I've also included a quick and dirty flux.job.Constraint class which can "parse" a boolean query like syntax into a job constraint object, and then used this class to process the argument to the flux mini --requires option.
I threw something together quickly, so while it works for the most part, we should probably eventually replace it with something that uses a real parser so that more error conditions are caught. For now, this is meant to be a test vehicle.
The syntax is somewhat straightforward:
- "and" and "or" binary operators
- "not" unary operator
- there are no real precedence rules, so parentheses are supported to enforce
- words joined by a single colon specify an operator and its argument, e.g.
operator:arg - if there is no colon, then
propertiesis assumed, e.g.foo=properties:foo. This allows existing syntax to continue to work, e.g.--requires=foo,bar - if
operatoris preceded with a-, then this is equivalent tonot operator:arg hostandhostsare shorthand forhostlistandrankcan be used in place ofranks
e.g.
--requires=rank:0-1--requires="foo or rank:0-1"- `--requires="-host:badhost"
- `--requires="foo or (not rank:7)"
etc.
If this all seems acceptable, then probably RFC 20 should be updated to add : to the list of invalid characters in property names. Then we can update to RFC 31 to require the list of operators supported here.
Works for me!
This looks pretty good to me too. How do we interpret mixed commas and operators though, if we already allow foo,bar, do we need to allow foo,host:corona296 and not_on_fire? I ask mainly so we can figure out what the placement and priority would be on that, maybe we could make , act like and, or deprecate the comma so it's consistent?
Great points, I hadn't thought of that.
What I was thinking is that tokens are simply separated by whitespace, and, or, and not are keywords, otherwise you have an [operator:]args where if operator is left off then this implies the properties operator. Then properties:foo,bar,baz turns into {"properties": ["foo", "bar", "baz"]} in the RFC 31 constraint object. This allows host:fluke[1,3,5], rank:0-5,10, etc.
Thus foo,host:corona296 would be treated as a token and fail compilation because there is no supported operator foo,host?
This seems simplest to me, though I'm not sure if it is consistent as you say.
I like that, I think it's just as consistent as the other option, it makes , just part of the token which seems fine. Would we also want to support the square brackets? I have a feeling we'd want to support quoting a token for some things too, though that could start to get gnarly, maybe just not supporting spaces for now is reasonable. I think that would leave us with roughly:
ident=/(\w=,]\[)+ ?/
operator="or "|"and "|"not "|"^"
prop= /\w+:/ ident
token=operator | prop
expr=token+
Nice summary, I was currently using - as the shorthand to negate an entire operator expression. For good or bad ^ is currently used to negate a single property --requires=foo,^bar :shrug:
Ah, I mixed up the RFC and the syntax, here's an update. I'm actually thinking it makes sense to make that property-only, otherwise it reads strangely to me. If it were ! then I would assume it would negate an entire expression, but - interprets as negating a single term if I read it without thinking about it.
ident=/[\w-](\w-=]\[)*/
idents=ident ("," ident)*
operator="or"|"and"|"not"
prop="-"? ident
props=prop ("," prop)*
properties="properties:"? props
term= /\w+:/ idents
token=operator | properties | term
expr=token (" " token)*
We wouldn't actually need the properties split out in the grammar since they'd be valid terms with the leading "-", but we should decide if we want to allow embedded dashes or not. This has the advantage it would be really easy to make a small parser for, especially with a little PEG lib or something which would be good for lots of other things.
I was hoping to make exclusion of a set of hosts or ranks simple (akin to gmail or google search syntax) with e.g. --requires=hosts:fluke[1,3-5] to constrain to a hostlist or --requires=-hosts:fluke[1,3-5] to exclude the same hosts, rather than requiring the user to type --requires="not hosts:fluke[1,3-5]". However, if the former is confusing then the extra required quoting and 3 characters is not a dealbreaker for me.
We could always make it support --requires="hosts:-fluke[1,3-5]".
Some shells will need the quoting to tolerate the brackets due to glob
syntax, then we can support multiple values passed to each term either
negated or not. That makes it so we don't have to deal with things like
--requires="on_fire and -(hosts:...[1,5] or has_gpu)", which is where
the negation really gets me. I could probably be convinced, but
allowing it on the whole thing rather than the components feels odd.
You make a good point. Let's keep it simple for now. I like your suggestion of negating the hostlist not the entire term.
@trws, thinking about this a little more:
hosts:fluke[1,3-5] turns into {"hostlist": [ "fluke[1,3-5]" ]}, so is it a bit weird that hosts:-fluke[1,3-5] compiles to {"not": [{"hostlist": [ "fluke[1,3-5]" ]}]}?
When I started looking at changing the current implementation it just made more sense to have -operator wrap the operator and its args in not, rather than having to parse the argument for a leading - and then wrap the operator in not.
In fact, this could get confusing in some situations where you have ranks:-0,5 does this mean ranks:5 and not rank:0 or not rank:0 and not rank:5? Whereas with -ranks:0,5 this clearly means the latter, and if you want the former, you can do either -rank:0 and rank:5 or the expanded statement above.
--requires="on_fire and -(hosts:...[1,5] or has_gpu)", which is where the negation really gets me
What if we just don't support negation in front of left parens? I.e. the - can only be used in an operator name when being used as operator:args to indicate the negation of that operator, and all we do internally is act as if -operator:args is not operator:args?
Does that make any sense, or am I off in the weeds again?
Updated to use the constraint parser from #4871 in flux mini --requires. Also took a stab at describing the enhanced supported constraint syntax in flux-mini(1).
This is now based on top of #4871.
Codecov Report
Merging #4677 (68fa449) into master (1ff34c7) will increase coverage by
0.02%. The diff coverage is92.15%.
@@ Coverage Diff @@
## master #4677 +/- ##
==========================================
+ Coverage 82.86% 82.89% +0.02%
==========================================
Files 425 426 +1
Lines 74814 75143 +329
==========================================
+ Hits 61996 62290 +294
- Misses 12818 12853 +35
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/common/librlist/match.c | 90.00% <89.50%> (-8.86%) |
:arrow_down: |
| src/bindings/python/flux/constraint/parser.py | 93.88% <93.88%> (ø) |
|
| ...python/flux/job/frobnicator/plugins/constraints.py | 97.50% <100.00%> (+0.20%) |
:arrow_up: |
| src/cmd/flux-mini.py | 94.32% <100.00%> (+0.05%) |
:arrow_up: |
| src/common/librlist/rlist.c | 84.79% <100.00%> (+0.04%) |
:arrow_up: |
| src/common/libsubprocess/fork.c | 75.38% <0.00%> (-1.54%) |
:arrow_down: |
| src/broker/module.c | 75.05% <0.00%> (-0.42%) |
:arrow_down: |
| src/broker/overlay.c | 84.15% <0.00%> (-0.40%) |
:arrow_down: |
| src/bindings/python/flux/job/Jobspec.py | 90.08% <0.00%> (+0.21%) |
:arrow_up: |
| ... and 2 more |
Closing this PR because the original description is a bit outdated, and I think it may be good to split this into two separate PRs:
- support hostlist and rank operators in librlist/sched-simple
- add rfc35 parser to
flux mini --requires