codeflare-sdk
Split head memory and cpu requests/limits
Issue link
Closes: RHOAIENG-9259
What changes have been made
Split the head cpu and memory resources to requests/limits similar to #547
Added deprecation warnings to the old vars head_cpus and head_memory
Verification steps
Setup
Notebook server ODH/RHOAI/Local
- Clone this repository with `git clone https://github.com/project-codeflare/codeflare-sdk.git`
- Check out this PR's branch
- Run `poetry build` (install Poetry first if needed: `pip install poetry`)
- Run `pip install --force-reinstall dist/codeflare_sdk-0.0.0.dev0-py3-none-any.whl`
- Restart your notebook kernel
Testing
Testing the deprecated args head_cpus and head_memory
Follow through the basic Ray demo. Set the head_cpus and head_memory parameters to a value of your choosing.
You should get a warning that the parameters are deprecated and that you should use the new ones.
The head cpu and memory requests and limits should both equal the values you entered above.
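The behaviour described above can be sketched as follows. This is a minimal, self-contained illustration and not the SDK's actual code: `HeadConfig` is a hypothetical stand-in for the head-node fields of ClusterConfiguration, and the default values and warning text are assumptions.

```python
import warnings
from dataclasses import dataclass
from typing import Optional, Union

Quantity = Union[int, str]


@dataclass
class HeadConfig:
    """Hypothetical stand-in for the head-node fields of ClusterConfiguration."""

    # Default values here are assumptions chosen for the sketch.
    head_cpu_requests: Quantity = 2
    head_cpu_limits: Quantity = 2
    head_memory_requests: Quantity = 8
    head_memory_limits: Quantity = 8
    head_cpus: Optional[Quantity] = None    # deprecated
    head_memory: Optional[Quantity] = None  # deprecated

    def __post_init__(self) -> None:
        # A deprecated value, when given, is copied to both requests and limits.
        if self.head_cpus is not None:
            warnings.warn(
                "head_cpus is deprecated, use head_cpu_requests and head_cpu_limits",
                DeprecationWarning,
            )
            self.head_cpu_requests = self.head_cpu_limits = self.head_cpus
        if self.head_memory is not None:
            warnings.warn(
                "head_memory is deprecated, use head_memory_requests and head_memory_limits",
                DeprecationWarning,
            )
            self.head_memory_requests = self.head_memory_limits = self.head_memory
```

With this sketch, `HeadConfig(head_cpus=4, head_memory=16)` emits two deprecation warnings and ends up with requests and limits of 4 for cpu and 16 for memory.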
Testing the new requests/limits args
In the ClusterConfiguration add the parameters
- head_cpu_requests
- head_cpu_limits
- head_memory_requests
- head_memory_limits
Set them to values of your choosing and the head pod of the Ray Cluster should reflect these values.
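To make the manual check concrete, the sketch below builds the `resources` section you would expect to see on the head pod's container for a given set of values. `expected_head_resources` is a hypothetical helper written for this illustration, not part of the SDK, and passing memory as strings such as "8G" is an assumption about how the values are entered.

```python
from typing import Dict, Union

Quantity = Union[int, str]


def expected_head_resources(
    cpu_requests: Quantity,
    cpu_limits: Quantity,
    memory_requests: Quantity,
    memory_limits: Quantity,
) -> Dict[str, Dict[str, Quantity]]:
    """Return the Kubernetes-style resources mapping for the head container."""
    return {
        "requests": {"cpu": cpu_requests, "memory": memory_requests},
        "limits": {"cpu": cpu_limits, "memory": memory_limits},
    }


# Example values only; use whatever you set in ClusterConfiguration.
expected = expected_head_resources(2, 4, "8G", "16G")
print(expected["requests"])
print(expected["limits"])
```

Comparing this mapping with the `resources` field of the head pod's container spec shows whether the four new parameters were applied.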
Checks
- [x] I've made sure the tests are passing.
- Testing Strategy
- [x] Unit tests
- [x] Manual tests
- [ ] Testing is not required for this change
@ChristianZaccaria This is not expected behaviour at all :( I can have a look at adding some validation to ensure that the head/worker requests/limits are of the correct type. Good catch!
@Bobbins228 I couldn't get further, but I suppose maybe cluster.up() will already capture that and throw an error for using the wrong datatypes. However, you're right, there seems to be no validation when creating the yaml file.
@ChristianZaccaria This is insane! It seems you can pretty much set any of the variables to whatever type you like. I will create a Jira for fixing the validation on all ClusterConfiguration parameters.
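For reference, the kind of type check discussed in this thread could look roughly like the sketch below. It is only an illustration: `validate_quantities` is a hypothetical helper, and the assumption that each resource value must be an `int` or a `str` is mine, not something the SDK is confirmed to enforce.

```python
from typing import Any


def validate_quantities(**params: Any) -> None:
    """Raise TypeError if any resource parameter is not an int or a str.

    Assumption for this sketch: int and str are the only accepted types.
    bool is rejected explicitly because it is a subclass of int.
    """
    for name, value in params.items():
        if isinstance(value, bool) or not isinstance(value, (int, str)):
            raise TypeError(
                f"{name} must be int or str, got {type(value).__name__}"
            )


# Valid values pass silently.
validate_quantities(head_cpu_requests=2, head_memory_limits="8G")

# An invalid type is reported with the offending parameter name.
try:
    validate_quantities(head_cpu_limits=[4])
except TypeError as err:
    print(err)
```

Running such a check when the configuration object is created would surface a wrong datatype before the yaml file is written, rather than leaving it to cluster.up().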
Applied do not merge label until RHOAIENG-9259 is a priority again.
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: KPostOffice
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [KPostOffice]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment