Default k8s scheduler support
Organization Name: Advantech
Short summary about the issue/question: Is the default k8s scheduler supported in OpenPAI v1.5.0? How can we run CPU and GPU tasks on a single GPU worker? (E.g. https://github.com/microsoft/pai/issues/5044)
Brief what process you are following: In v1.0.1, we could use the k8s default scheduler based on https://github.com/microsoft/pai/issues/5044#issuecomment-720410187. When I switch to the k8s default scheduler, the SKU-based scheduling seems incorrect. Is the default k8s scheduler supported in OpenPAI v1.5.0, or will it be in a future release?
How to reproduce it:
- Deploy openpai v1.5.0
- Change the scheduler configuration as follows:

```yaml
hivedscheduler:
  config: |
```

- Apply the configuration:

```bash
./paictl.py service stop -n rest-server hivedscheduler
./paictl.py config push -p <config-folder> -m service
./paictl.py service start -n hivedscheduler rest-server
```

- The SKU on the job submission seems incorrect (see the scheduler check below)
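One quick way to confirm whether the default scheduler actually took over (an ad-hoc check, not from the OpenPAI docs) is to inspect the `schedulerName` of a job pod:

```bash
# Pods placed by the Kubernetes default scheduler report "default-scheduler";
# pods placed by hived carry the hived scheduler's name instead.
# <job-pod-name> is a placeholder for one of your job's pods.
kubectl get pod <job-pod-name> -o jsonpath='{.spec.schedulerName}'
```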
OpenPAI Environment:
- OpenPAI version: v1.5.0
- OS (e.g. from /etc/os-release): Ubuntu 18.04.3 LTS
Not sure if we still support the default scheduler. @abuccts to help.
@JosephKang, could you describe the detailed scheduling behavior? We haven't tested the default scheduler in a while.
The default scheduler is used to set job resources on demand instead of allocating in SKU units, and it might achieve the maximum utilization of the worker node.
The following scenarios might be a good example for one worker with 1 GPU / 9 CPU of resources. Please let me know if my understanding is incorrect (a protocol sketch of scenario a follows below).
- Scenario a: one 1GPU/4CPU task and one 4CPU task at the same time.
- Scenario b: two 4CPU tasks at the same time.
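For concreteness, scenario a could be expressed as a single job with two task roles. This is only a sketch: the job name, image, commands, and memory figures are made up, since the scenario only fixes the GPU/CPU counts.

```yaml
# Sketch of an OpenPAI job protocol (v2) for scenario a:
# one 1GPU/4CPU task and one CPU-only 4CPU task on the same worker.
protocolVersion: 2
name: mixed-cpu-gpu-demo        # hypothetical job name
type: job
prerequisites:
  - name: image
    type: dockerimage
    uri: ubuntu:18.04           # placeholder image
taskRoles:
  gputask:
    instances: 1
    dockerImage: image
    resourcePerInstance:
      gpu: 1
      cpu: 4
      memoryMB: 8192            # memory is an assumption; the scenario fixes only GPU/CPU
    commands:
      - ./train.sh              # placeholder command
  cputask:
    instances: 1
    dockerImage: image
    resourcePerInstance:
      gpu: 0
      cpu: 4
      memoryMB: 8192
    commands:
      - ./preprocess.sh         # placeholder command
```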
It seems you are asking whether the webportal allows assigning a fraction of a resource other than the defined SKU? We prefer users to consume resources at the granularity of a SKU to avoid unnecessary fragmentation (so in the webportal you cannot set resources other than SKUs). If you want more fine-grained resource usage, you can specify it through the OpenPAI SDK.
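For reference, fine-grained requests correspond to the `resourcePerInstance` field of the job protocol, which the SDK/REST path lets you set directly. The fragment below mirrors the 2 CPU / 20 GB example in the next comment; the task-role name is arbitrary:

```yaml
# Fragment of an OpenPAI job protocol (v2); a full protocol also needs
# protocolVersion, name, type, dockerImage, commands, etc.
taskRoles:
  taskrole:                 # arbitrary task-role name
    instances: 1
    resourcePerInstance:
      gpu: 0
      cpu: 2
      memoryMB: 20480       # 20 GB, matching the example below
```

Such a protocol can be submitted through the rest-server job API (`POST /api/v2/jobs` with `Content-Type: text/yaml`) or via the Python SDK.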
We hope to have more fine-grained resource usage. It seems that a task pod's resources can be set through API parameters instead of SKU units, but the available-resource deduction still seems to happen at the granularity of a SKU.
E.g.
Total resources = 2 GPU, 8 CPU, 50 GB RAM
SKU = 1 GPU / 4 CPU / 25 GB RAM
Request via API = 2 CPU / 20 GB RAM
Remaining available = 1 GPU / 4 CPU / 25 GB RAM (1 SKU left)
Is this also the preferred behavior, in order to stay in sync with SKUs?
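For what it's worth, the arithmetic above is consistent with rounding each requested dimension up to whole SKUs and deducting the maximum across dimensions; this is only my reading of the example, not confirmed scheduler behavior:

$$
\text{SKUs deducted} = \max\!\left(\left\lceil \frac{2\ \text{CPU}}{4\ \text{CPU/SKU}} \right\rceil,\ \left\lceil \frac{20\ \text{GB}}{25\ \text{GB/SKU}} \right\rceil\right) = \max(1, 1) = 1
$$

so 2 SKUs total minus 1 deducted leaves the reported 1 SKU (1 GPU / 4 CPU / 25 GB).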