cluster-api-provider-gcp
[WIP] GPU support
What type of PR is this? /kind api-change /kind feature
What this PR does / why we need it: Adds the API changes needed to manage GPU acceleration of GCP instances in CAPG.
Special notes for your reviewer: This is a WIP PR; additional controller changes will be added in follow-up commits.
TODOs:
- [ ] squashed commits
- [ ] includes documentation
- [ ] adds unit tests
Release note:
NONE
cc @richardcase @dims @cpanato
Hi @SubhasmitaSw. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: SubhasmitaSw
Once this PR has been reviewed and has the lgtm label, please assign dims for approval by writing /assign @dims in a comment. For more information see: The Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Added getters for the new GPU API fields so we can read their values and use them in reconciliation. @richardcase do we need setters to set those values when they are not provided?
@SubhasmitaSw @aniruddha2000 any status on this?
@cpanato Only a little of the documentation remains. @SubhasmitaSw and I are currently having some difficulty understanding the e2e behavior.
@SubhasmitaSw @aniruddha2000 - did you want to chat about the e2e?
@SubhasmitaSw: PR needs rebase.
@SubhasmitaSw: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-cluster-api-provider-gcp-build | d33fd97e47c96b3c6f7d93f93433f37d6a42dd77 | link | true | /test pull-cluster-api-provider-gcp-build |
| pull-cluster-api-provider-gcp-apidiff | d33fd97e47c96b3c6f7d93f93433f37d6a42dd77 | link | false | /test pull-cluster-api-provider-gcp-apidiff |
| pull-cluster-api-provider-gcp-verify | d33fd97e47c96b3c6f7d93f93433f37d6a42dd77 | link | true | /test pull-cluster-api-provider-gcp-verify |
| pull-cluster-api-provider-gcp-test | d33fd97e47c96b3c6f7d93f93433f37d6a42dd77 | link | true | /test pull-cluster-api-provider-gcp-test |
| pull-cluster-api-provider-gcp-make | d33fd97e47c96b3c6f7d93f93433f37d6a42dd77 | link | true | /test pull-cluster-api-provider-gcp-make |
| pull-cluster-api-provider-gcp-e2e-test | d33fd97e47c96b3c6f7d93f93433f37d6a42dd77 | link | true | /test pull-cluster-api-provider-gcp-e2e-test |
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Any update?
We need to get the images published first. I will check on that.
Let me check that today.
If the GPU parameter exists, I can test it with the NVIDIA GPU Operator to see whether it works flawlessly.
Hmm, merge conflict in go.sum and go.mod
I'll tidy this up!
@SubhasmitaSw Any update?
/retitle [WIP] GPU support
This is currently blocked waiting for the image-builder changes to merge. I will look at unblocking this asap.
Until image builder changes are merged and images are available, adding an explicit:
/hold
Thanks for the update, @richardcase. I'm subscribing to this PR because I am looking at the compatibility of these changes with OpenShift.
I am picking this up again after the holiday break.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle rotten
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: SubhasmitaSw
Once this PR has been reviewed and has the lgtm label, please ask for approval from richardcase. For more information see the Kubernetes Code Review Process.
@SubhasmitaSw: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-cluster-api-provider-gcp-apidiff | 244060dbd0cabd48576ae28977f64ae656a7382a | link | false | /test pull-cluster-api-provider-gcp-apidiff |
| pull-cluster-api-provider-gcp-test | 244060dbd0cabd48576ae28977f64ae656a7382a | link | true | /test pull-cluster-api-provider-gcp-test |
| pull-cluster-api-provider-gcp-build | 244060dbd0cabd48576ae28977f64ae656a7382a | link | true | /test pull-cluster-api-provider-gcp-build |
| pull-cluster-api-provider-gcp-verify | 244060dbd0cabd48576ae28977f64ae656a7382a | link | true | /test pull-cluster-api-provider-gcp-verify |
| pull-cluster-api-provider-gcp-e2e-test | 244060dbd0cabd48576ae28977f64ae656a7382a | link | true | /test pull-cluster-api-provider-gcp-e2e-test |
| pull-cluster-api-provider-gcp-make | 244060dbd0cabd48576ae28977f64ae656a7382a | link | true | /test pull-cluster-api-provider-gcp-make |
/lifecycle stale
/lifecycle rotten
/remove-lifecycle rotten
I will pick up the image building side so that we can get this merged.
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Reopen this PR with /reopen
- Mark this PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closed this PR.
/reopen
@richardcase: Reopened this PR.
/close