test: add ut for performance test
This MR aims to add a simple performance UT for Volcano.
Two reasons for supplementing UT:
- There are some questions about performance data, such as https://github.com/volcano-sh/volcano/issues/3203.
- Performance-oriented UT helps Volcano make optimizations.
Run this case using the following commands:
cd pkg/scheduler
go test -v --run=TestRunOnc
I ran this case under the default configuration, without optimization, on a machine (56C 256G), and got this result:
I also added a "latencyrecorder" locally to track the execution time of the various plugins through instrumentation. However, it's not elegant, since it relies on code instrumentation, so I haven't submitted it yet.
Can anyone take a look first and give some suggestions? thx~
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
To complete the pull request process, please assign thor-wl
You can assign the PR to them by writing /assign @thor-wl in a comment when ready.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
@lowang-bh Could you please spare some time to review my code? thx~
Thanks for your hard work. Can it cover all the schedule logic in the real scheduler? My idea is to use the k8s fake client to submit jobs/nodes/pods and then trigger the scheduler run.
I think it covers most of the scheduling logic, except for the part that calls the k8s API (if there are any errors in my statement, please correct me, as I'm still new to Volcano). As I understand it, the cache is used as a middle layer between k8s and Volcano's scheduling logic, so I chose to add a mock cache.
The reason I don't directly construct a real cache through a fake kube client is that certain parts of the cache implementation use kubeconfig to create a new client (such as vcclient). These scenarios are hard to cover with a fake kube client. Do you have any good ideas on how to handle this?
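To illustrate the mock-cache approach in a minimal sketch (all names here are hypothetical, not Volcano's actual Cache API): the test only depends on a small interface, and the mock satisfies it without any kubeconfig or real client:

```go
package main

import "fmt"

// Cache is a drastically simplified stand-in for the scheduler cache
// interface; the real one has many more methods.
type Cache interface {
	Snapshot() string
	Bind(pod string) error
}

// mockCache implements Cache entirely in memory, so a performance UT
// never needs a kubeconfig or an API server.
type mockCache struct {
	pods []string
}

func (m *mockCache) Snapshot() string { return fmt.Sprintf("%d pods", len(m.pods)) }

func (m *mockCache) Bind(pod string) error {
	m.pods = append(m.pods, pod)
	return nil
}

func main() {
	var c Cache = &mockCache{}
	_ = c.Bind("pod-1")
	fmt.Println(c.Snapshot()) // prints "1 pods"
}
```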
Yes, I have some plan to do it. I will write a demo and then we can work together on that feature.
I think we can keep this PR as well. @wangyang0616 @Monokaix
Thanks~ hope I can help
I think there are two processes we should be concerned with. The first is the schedule logic in every session, which mainly consists of the predict and nodeOrder plugins; the second is the kube-apiserver throughput in a real cluster. So your test can only cover the first part, and in my opinion we have no need to mock a whole cache, because the cache is composed of many interfaces; we just need to mock those interfaces, such as the kube client : )
Also, kube-apiserver throughput has a great influence on scheduler performance, so we abstracted a bind interface to let users register their own bind method, e.g. binding a lot of pods in one bind request, so you can also work on this to achieve higher performance.
@Monokaix Thanks for your review~ You are absolutely right that this unit test does not cover the batch binding part, and indeed, the implementation of the mock kubeclient unit test should be able to fully replace it.
However, with this unit test, Volcano users can now assess the performance of the scheduler's core module and carry out optimization work. If the implementation of the mock kubeclient unit test cannot be completed in the short term (which might involve some code modifications or refactoring), would you be open to considering this unit test in the interim?
Furthermore, the bottleneck in the second part is not only in the binding but also in the performance of Kubernetes, so it may not be entirely solvable with unit testing but might require end-to-end testing.
Yeah, you can work on this UT and you are welcome, but can we avoid mocking the whole cache and just use a fake interface where a mock is needed? If that's feasible, you can have a try; if it's a little hard, you can try to wrap a Cache and override the methods that are hard to mock : )
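The wrap-and-override suggestion can be sketched with Go struct embedding (hypothetical types, not the real SchedulerCache): the wrapper embeds the real cache and redefines only the hard-to-mock methods, while every other method is promoted from the embedded struct:

```go
package main

import "fmt"

// RealCache stands in for the production cache; Bind would normally
// need a live API server, which makes it hard to call in a UT.
type RealCache struct{}

func (c *RealCache) Snapshot() string    { return "real snapshot" }
func (c *RealCache) Bind(p string) error { return fmt.Errorf("needs apiserver") }

// wrappedCache embeds the real cache and overrides only Bind; calls to
// Snapshot fall through to the embedded RealCache.
type wrappedCache struct {
	*RealCache
	bound int
}

func (w *wrappedCache) Bind(p string) error {
	w.bound++ // record the bind instead of talking to the API server
	return nil
}

func main() {
	w := &wrappedCache{RealCache: &RealCache{}}
	_ = w.Bind("pod-1")
	fmt.Println(w.Snapshot(), "bound:", w.bound) // prints "real snapshot bound: 1"
}
```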
OK, I will fix that later.
Hi @bysph, I mocked the SchedulerCache in PR "add mock cache for ut and performance test" #3269. You can continue your work on this feature from there, maybe.
Okay, I will submit a UT based on this. Should I wait for the code to be merged before submitting?
Yes, you can try it after https://github.com/volcano-sh/volcano/pull/3269 is merged.
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.