cluster-api-provider-gcp
Add GPU/Accelerator support to VMs in GCPMachineTemplate
What type of PR is this?
/kind feature
What this PR does / why we need it: Adds the ability to configure guest accelerators, such as GPUs, in a GCPMachineTemplate. Fixes #289
Special notes for your reviewer: Tested; machines are created with GPUs correctly. After installing the drivers and the NVIDIA container runtime on the node, I was able to run a GPU workload successfully in a Pod. If you try to use an accelerator on an unsupported instance type, GCP returns an instance reconcile error that describes the improper API use.
OnHostMaintenance must be set to TERMINATE for GPU enabled machines.
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance#guest_accelerator
Confirmed this is correct. Instance reconcile is rejected by GCP otherwise.
I set this field automatically.
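A minimal sketch of the rule discussed in this thread: when a machine requests any guest accelerator, the maintenance policy sent to GCP is forced to TERMINATE. The `Accelerator` and `MachineSpec` types and the `EffectiveOnHostMaintenance` helper are hypothetical stand-ins for illustration, not the types or functions added by this PR, and the accelerator type string is just an example value.

```go
package main

import "fmt"

// Accelerator is a hypothetical stand-in for one guest accelerator entry.
type Accelerator struct {
	Type  string // accelerator type name, e.g. "nvidia-tesla-t4" (illustrative)
	Count int64  // number of accelerators to attach
}

// MachineSpec is a hypothetical, trimmed-down machine spec used only for this sketch.
type MachineSpec struct {
	GuestAccelerators []Accelerator
	OnHostMaintenance string // "MIGRATE" or "TERMINATE"; empty means unset
}

// EffectiveOnHostMaintenance returns the maintenance policy to request from GCP.
// Machines with at least one guest accelerator always get "TERMINATE",
// overriding whatever the user set; other machines keep the user's value.
func EffectiveOnHostMaintenance(spec MachineSpec) string {
	if len(spec.GuestAccelerators) > 0 {
		return "TERMINATE"
	}
	return spec.OnHostMaintenance
}

func main() {
	gpu := MachineSpec{
		GuestAccelerators: []Accelerator{{Type: "nvidia-tesla-t4", Count: 1}},
		OnHostMaintenance: "MIGRATE", // overridden because a GPU is attached
	}
	cpu := MachineSpec{OnHostMaintenance: "MIGRATE"}

	fmt.Println(EffectiveOnHostMaintenance(gpu)) // prints "TERMINATE"
	fmt.Println(EffectiveOnHostMaintenance(cpu)) // prints "MIGRATE"
}
```

Overriding the value silently keeps templates simple, at the cost of ignoring an explicit user setting; rejecting the conflicting combination at validation time would be the stricter alternative.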
TODOs:
- [x] squashed commits
- [x] includes documentation
- [ ] adds unit tests
Release note:
Add GPU/Accelerator support for VMs in GCPMachineTemplate
Hi @jwmay2012. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Deploy Preview for kubernetes-sigs-cluster-api-gcp ready!
| Name | Link |
|---|---|
| Latest commit | aed6b4c3656aeb4c5a95585fcb4dfefa56e123ea |
| Latest deploy log | https://app.netlify.com/projects/kubernetes-sigs-cluster-api-gcp/deploys/68d2c82892b2350008155960 |
| Deploy Preview | https://deploy-preview-1341--kubernetes-sigs-cluster-api-gcp.netlify.app |
Thanks @jwmay2012
/ok-to-test
@jwmay2012 - would you be able to run make lint on this change?
Could you please provide an estimate of when this change might be included in a release?
Thanks @jwmay2012.
/lgtm
We good to merge? Been running a custom CAPG with these changes for a while and would love to get this upstream :)
@richardcase are you happy with this? If so would you be able to stamp your approval on it? Thanks!
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cpanato, jwmay2012
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
- ~~OWNERS~~ [cpanato]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
bump
/cc @elmiko
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle rotten
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
bump
/remove-lifecycle rotten
@dims @richardcase help
@jwmay2012 could you please rebase? Thanks!
Or @reyvonger