yunikorn-core icon indicating copy to clipboard operation
yunikorn-core copied to clipboard

[YUNIKORN-790] Implement MaxApplications enforcement

Open lixmgl opened this issue 2 years ago • 4 comments

What is this PR for?

Adding maxApplication implementation for Yunikorn queue. This PR includes and updates existing wip https://github.com/apache/yunikorn-core/pull/384

What type of PR is it?

  • [ ] - Bug Fix
  • [ ] - Improvement
  • [x] - Feature
  • [ ] - Documentation
  • [ ] - Hot Fix
  • [ ] - Refactoring

Todos

NA

What is the Jira issue?

https://issues.apache.org/jira/browse/YUNIKORN-790

How should this be tested?

  1. local test passed by running "make test" in Yunikorn-core and Yunikorn-shim repos(refer to screenshot)

Screen Shot 2022-09-01 at 10 30 33 PM Screen Shot 2022-09-01 at 10 32 28 PM

2.Local test result: (launched 1000 Spark jobs to 5 Yunikorn queues on k8s cluster which has 85 nodes) Screen Shot 2022-08-30 at 10 27 58 PM

Total duration and Performance looks good. Each Yunikorn queue set maxApplication to 12, I queried running jobs number every minute, didn't exceed the limit (12*5 queues= 60).
load test result@08_30.pdf

Questions:

NA

lixmgl avatar Aug 11 '22 22:08 lixmgl

I am not sure which PR is the one you are working on. The email pointed back to this PR: config validation code for the resources of a queue compared to its parent is here https://github.com/apache/yunikorn-core/blob/master/pkg/common/configs/configvalidator.go#L79

wilfred-s avatar Aug 25 '22 07:08 wilfred-s

I am not sure which PR is the one you are working on. The email pointed back to this PR: config validation code for the resources of a queue compared to its parent is here https://github.com/apache/yunikorn-core/blob/master/pkg/common/configs/configvalidator.go#L79

I merged two PRs together. Please only review this one.

lixmgl avatar Aug 31 '22 19:08 lixmgl

updated, please review.

lixmgl avatar Aug 31 '22 19:08 lixmgl

I have some concerns regarding some pieces of the code, please take a look at those, thanks.

Thanks for reviewing the code. I have updated them and addressed all your comments. Please take another look.

lixmgl avatar Sep 04 '22 19:09 lixmgl

Hi @manirajv06 Can you take a look? I saw @wilfred-s requested a review from you 11 days ago.

lixmgl avatar Nov 15 '22 07:11 lixmgl

Thanks @craigcondit for reviewing the PR. I addressed your comments about combining the methods. canRun code is outdate, please review incAllocatingAcceptedAppsIfCanRun method

lixmgl avatar Nov 19 '22 00:11 lixmgl

Thank you @craigcondit. I addressed comments. @wilfred-s @pbacsko please take a look when you have time, thanks!

lixmgl avatar Nov 21 '22 21:11 lixmgl

@lixmgl / @pbacsko / @manirajv06 / @craigcondit Please check PR #455. I ported the changes to master and moved the logic into the queue code. I added unit tests and one full cycle test into the partition. I think I have covered all the cases.

wilfred-s avatar Nov 24 '22 07:11 wilfred-s

@lixmgl / @pbacsko / @manirajv06 / @craigcondit Please check PR #455. I ported the changes to master and moved the logic into the queue code. I added unit tests and one full cycle test into the partition. I think I have covered all the cases.

PR https://github.com/apache/yunikorn-core/pull/455 LGTM. Thanks for the refactoring and full cycle tests @wilfred-s ! Happy late Thanksgiving :) Can we merge this PR to master first then add your PR?

lixmgl avatar Nov 28 '22 18:11 lixmgl

Can we merge this PR to master first then add your PR?

This PR formed the basis under my port. We can merge the ported PR in one go. @craigcondit will handle the proper attribution to you when we commit the change. I just did some cleanup and the port :-)

wilfred-s avatar Nov 29 '22 10:11 wilfred-s

Can we merge this PR to master first then add your PR?

This PR formed the basis under my port. We can merge the ported PR in one go. @craigcondit will handle the proper attribution to you when we commit the change. I just did some cleanup and the port :-)

I see, sounds good. Thanks @wilfred-s and @craigcondit !

lixmgl avatar Nov 29 '22 16:11 lixmgl

Closed in favor of #455.

craigcondit avatar Nov 29 '22 18:11 craigcondit