flux-sched icon indicating copy to clipboard operation
flux-sched copied to clipboard

policy: refactor options to allow custom

Open wihobbs opened this issue 1 year ago • 5 comments

Problem: the current fluxion policy matching factory doesn't allow for customizable policies.

Add a custom option and refactor existing policies to behave similarly. Add unit testing for the string parsing convenience functions.

Outstanding to-do:

  • [ ] add an option in the scheduler config to actually pass a custom string in, and probably some validation too

wihobbs avatar Aug 13 '24 01:08 wihobbs

And I know the testing and the code should be in separate commits... sigh (will fix)

wihobbs avatar Aug 13 '24 01:08 wihobbs

PR title seems truncated?

garlick avatar Aug 20 '24 15:08 garlick

Well, custom is a new type of policy that this PR is enabling. Or will enable eventually. Maybe "resource: refactor policy options to allow a user-defined policy" is more descriptive?

Note that this is still WIP and needs some config work and testing for the custom options. Happy to incorporate any feedback on the initial draft (and thanks to those who have already reviewed!), but it's going to stay in draft state until the code can:

  • [x] accept a custom string with policy options
  • [x] reject certain strings that are requesting invalid custom options, such as set_stop_on_k_matches=-100 and such
  • [x] test some custom options and make sure they behave as expected
  • [x] break out the commit history in a better way

wihobbs avatar Aug 20 '24 17:08 wihobbs

Per discussion with Tom:

  • [x] To handle custom policies, "don't validate, parse" -- parse the input on a delimiter first and then verify that each key is valid, give an error message if it isn't, possibly demote this to a warning based on a config option
  • [x] Fluxion should issue a fatal warning if a policy named is not valid -- don't fall back on first (might already be done?)
  • [ ] set_stop_on_k_matches probably only takes a value of 1, however, we should test this
  • [ ] ~~we should also think about having "node_centric" have a score factor of 10000, maybe we don't want to allow users to configure this, but maybe that number doesn't cover all our bases~~

wihobbs avatar Sep 05 '24 04:09 wihobbs

Don't forget to address @garlick's comment about the PR title, which will end up as a line in the release notes.

grondo avatar Jan 16 '25 19:01 grondo

I just asked @jameshcorbett to do a first pass to get me started. Note this still is in draft state until it has working sharness tests and additional reviews.

wihobbs avatar Jan 16 '25 19:01 wihobbs

Once the automated tests pass I'll drop draft from this PR.

wihobbs avatar Jan 22 '25 21:01 wihobbs

This is starting to look really good @wihobbs! There's one tweak I'd like to see to this so we can keep extending it. I like the way the options are handled, this is exactly what I was hoping for. The one tweak I'd make is to have a policy=* option rather than separate low= and high=, since the base policy object types are low, high, locality and variation. That way, we get the "you can't set both low and high at the same time" for free, and can even collapse the two low and high branches to create a matcher based on which it is then set options on the resulting object after the condition. We can also check that policy has been set, either with the bare name or by having policy=whatever to allow options. Does that sound workable?

trws avatar Jan 23 '25 20:01 trws

I think I've incorporated this feedback @trws, though slightly differently than you laid out in your comment. Instead of a policy key which accepts locality, variation, high, and low, I added a policy key that will accept low or high, and you can still set match-policy to locality, variation, high, and low, in which case the latter two would result in a "vanilla" high or low policy without the extra options. Does that suffice for what you laid out? It turned out to be not a huge lift.

I haven't implemented allowing set_stop_on_k_matches to accept an integer yet, but I'm going to go ahead and request a few reviews on this, to get feedback on the code and start getting the commits polished for approval. If you're okay with kicking that down the road to another PR (apologies, I forgot what we decided on that in our conversation yesterday), we could do that when the option itself supports integers other than 1.

wihobbs avatar Jan 31 '25 01:01 wihobbs

Thanks @cmoussa1! Fixed and pushed.

wihobbs avatar Jan 31 '25 18:01 wihobbs

Thanks @trws!

wihobbs avatar Feb 04 '25 02:02 wihobbs

Codecov Report

Attention: Patch coverage is 90.83333% with 11 lines in your changes missing coverage. Please review.

Project coverage is 75.4%. Comparing base (471da50) to head (6c04cac). Report is 93 commits behind head on master.

Files with missing lines Patch % Lines
resource/policies/dfu_match_policy_factory.cpp 84.2% 11 Missing :warning:
Additional details and impacted files
@@          Coverage Diff           @@
##           master   #1268   +/-   ##
======================================
  Coverage    75.3%   75.4%           
======================================
  Files         111     112    +1     
  Lines       16042   16123   +81     
======================================
+ Hits        12081   12158   +77     
- Misses       3961    3965    +4     
Files with missing lines Coverage Δ
resource/modules/resource_match.cpp 69.8% <100.0%> (+0.2%) :arrow_up:
resource/modules/resource_match_opts.cpp 84.5% <100.0%> (+0.4%) :arrow_up:
resource/modules/resource_match_opts.hpp 100.0% <ø> (ø)
...licies/base/test/matcher_policy_factory_test02.cpp 100.0% <100.0%> (ø)
resource/policies/dfu_match_policy_factory.cpp 82.3% <84.2%> (-6.3%) :arrow_down:

... and 1 file with indirect coverage changes

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar May 05 '25 02:05 codecov[bot]