yunikorn-k8shim icon indicating copy to clipboard operation
yunikorn-k8shim copied to clipboard

[YUNIKORN-2230] Placement rule does not behave as expected

Open brandboat opened this issue 1 year ago • 7 comments

What is this PR for?

Fix issue with the placement rules, specifically the one where the provided rule (when created is true) is applied before the tag rule, is not functioning as intended. The root cause lies in the GetQueueNameFromPod function in the shim, which automatically adds a default queue name to a pod if none is specified. As a result, the provided rule (when created is true) is consistently applied, preventing the tag rule from being triggered.

What type of PR is it?

  • [x] - Bug Fix

Todos

N/A

What is the Jira issue?

https://issues.apache.org/jira/browse/YUNIKORN-2230

How should this be tested?

covered by e2e test

Screenshots (if appropriate)

N/A

Questions:

  • [x] - There is breaking changes for older versions. This Pull Request removed the default queue name on the shim side. If a user is using YuniKorn without enabling the Admission Controller (which adds a default queue name), pods that do not provide a queue name will be rejected.

brandboat avatar Dec 08 '23 11:12 brandboat

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (dc98af0) 69.46% compared to head (eafc61b) 69.45%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #748      +/-   ##
==========================================
- Coverage   69.46%   69.45%   -0.01%     
==========================================
  Files          50       50              
  Lines        7993     7992       -1     
==========================================
- Hits         5552     5551       -1     
  Misses       2252     2252              
  Partials      189      189              

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Dec 08 '23 11:12 codecov[bot]

This is more a design and behavioural question than a code review. Should we not have a similar behaviour as we have on the admission controller and allow specifying the default queue? In other words should we re-use the setting for the default queue in the K8shim. We do something similar also for the application ID generation.

wilfred-s avatar Dec 12 '23 03:12 wilfred-s

This is more a design and behavioural question than a code review. Should we not have a similar behaviour as we have on the admission controller and allow specifying the default queue? In other words should we re-use the setting for the default queue in the K8shim. We do something similar also for the application ID generation.

Thank's for the comment! Indeed, that is a better way and doing so won't break anything. I'll make the setting admissionController.filtering.defaultQueue to also determine whether K8shim should add the default queue name or not. And we should emphasize in document that this setting still takes effect even without the admission controller, I'll open another pr to do that after this one is merged.

brandboat avatar Dec 12 '23 06:12 brandboat

I also have another minor question: should we consider renaming these two settings (admissionController.filtering.defaultQueueName and admissionController.filtering.generateUniqueAppId)? Since they not only control the admission controller but also influence K8Shim, perhaps a more descriptive name would be appropriate.

brandboat avatar Dec 12 '23 06:12 brandboat

gentle ping @wilfred-s I'm trying to follow the logic of GenerateUniqueAppIds and add a new variable DefaultQueueName in schedulerconf.go. However, I encountered a challenge. I believe DefaultQueueName must be placed in handleNonReloadableConfig to make it a configuration that takes effect only after restarting YuniKorn. Without this, changes to DefaultQueueName might lead to inconsistencies when getAppMetadata returns the queueName in the existing pod, as it could change due to configmap modifications in DefaultQueueName.

However, if I set DefaultQueueName as non-reloadable, changing the default queue name to "" would require restarting the entire YuniKorn. Otherwise, the K8shim might still return the queue name as root.default (default value of DefaultQueueName). I think this could make things more complex, not unsure if there are better suggestions or approaches... thank you.

brandboat avatar Dec 15 '23 15:12 brandboat

if I set DefaultQueueName as non-reloadable, changing the default queue name to "" would require restarting the entire YuniKorn

pardon me, is this a potential issue? Are there any reasons that users have to change the default queue name by re-reconfiguring?

chia7712 avatar Dec 15 '23 16:12 chia7712

The admission controller handles it as a reloadable value. We can do either. However I think that the value should be reloadable as we can change the queue config and placement rules on the fly also.

If you do not allow it to be reloadable a queue config change could be applied by the default queue would not change causing rejects etc. Keeping them both reloadable would be better.

wilfred-s avatar Apr 22 '24 04:04 wilfred-s

Obsolete after YUNIKORN-2711 and YUNIKORN-2703.

craigcondit avatar Jul 11 '24 21:07 craigcondit