yunikorn-core
yunikorn-core copied to clipboard
[YUNIKORN-790] Implement MaxApplications enforcement
What is this PR for?
Adding maxApplication implementation for Yunikorn queue. This PR includes and updates existing wip https://github.com/apache/yunikorn-core/pull/384
What type of PR is it?
- [ ] - Bug Fix
- [ ] - Improvement
- [x] - Feature
- [ ] - Documentation
- [ ] - Hot Fix
- [ ] - Refactoring
Todos
NA
What is the Jira issue?
https://issues.apache.org/jira/browse/YUNIKORN-790
How should this be tested?
- local test passed by running "make test" in Yunikorn-core and Yunikorn-shim repos(refer to screenshot)
2.Local test result: (launched 1000 Spark jobs to 5 Yunikorn queues on k8s cluster which has 85 nodes)
Total duration and Performance looks good.
Each Yunikorn queue set maxApplication to 12, I queried running jobs number every minute, didn't exceed the limit (12*5 queues= 60).
load test result@08_30.pdf
Questions:
NA
I am not sure which PR is the one you are working on. The email pointed back to this PR: config validation code for the resources of a queue compared to its parent is here https://github.com/apache/yunikorn-core/blob/master/pkg/common/configs/configvalidator.go#L79
I am not sure which PR is the one you are working on. The email pointed back to this PR: config validation code for the resources of a queue compared to its parent is here https://github.com/apache/yunikorn-core/blob/master/pkg/common/configs/configvalidator.go#L79
I merged two PRs together. Please only review this one.
updated, please review.
I have some concerns regarding some pieces of the code, please take a look at those, thanks.
Thanks for reviewing the code. I have updated them and addressed all your comments. Please take another look.
Hi @manirajv06 Can you take a look? I saw @wilfred-s requested a review from you 11 days ago.
Thanks @craigcondit for reviewing the PR. I addressed your comments about combining the methods. canRun code is outdate, please review incAllocatingAcceptedAppsIfCanRun method
Thank you @craigcondit. I addressed comments. @wilfred-s @pbacsko please take a look when you have time, thanks!
@lixmgl / @pbacsko / @manirajv06 / @craigcondit Please check PR #455. I ported the changes to master and moved the logic into the queue code. I added unit tests and one full cycle test into the partition. I think I have covered all the cases.
@lixmgl / @pbacsko / @manirajv06 / @craigcondit Please check PR #455. I ported the changes to master and moved the logic into the queue code. I added unit tests and one full cycle test into the partition. I think I have covered all the cases.
PR https://github.com/apache/yunikorn-core/pull/455 LGTM. Thanks for the refactoring and full cycle tests @wilfred-s ! Happy late Thanksgiving :) Can we merge this PR to master first then add your PR?
Can we merge this PR to master first then add your PR?
This PR formed the basis under my port. We can merge the ported PR in one go. @craigcondit will handle the proper attribution to you when we commit the change. I just did some cleanup and the port :-)
Can we merge this PR to master first then add your PR?
This PR formed the basis under my port. We can merge the ported PR in one go. @craigcondit will handle the proper attribution to you when we commit the change. I just did some cleanup and the port :-)
I see, sounds good. Thanks @wilfred-s and @craigcondit !
Closed in favor of #455.