
[research] Test GPU support in workspaces

Open csweichel opened this issue 3 years ago · 18 comments

Is your feature request related to a problem? Please describe

Kubernetes can make GPUs available in pods. Can those GPUs be used from within a Gitpod workspace?

Describe the behaviour you'd like

GPUs should be possible to use if the underlying cluster supports it.

How

Can we run Gitpod workspaces with GPUs? Try changing the config map in-place for ws-manager and workspace-templates first, before making any actual code changes. The expectation is that if code changes are needed, they'll be thrown away, and not actually merged back to main.

Intended output

Note: this is just a research task. We want to know where we stand today for scheduling workloads that need GPU, and how well Gitpod runs on them.

Questions

  1. Does Gitpod work on a node supporting GPUs? How well does it work?
  2. Can we use all of the cores of the GPU?

csweichel avatar Feb 22 '22 17:02 csweichel
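For context, the standard Kubernetes way to expose a GPU to a pod, once the NVIDIA device plugin DaemonSet is installed on the node, is a resource limit. A minimal sketch (image tag and pod name are illustrative):

```yaml
# Minimal test pod requesting one GPU via the NVIDIA device plugin.
# Assumes the device plugin is already running on a GPU node.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test        # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:11.0.3-base-ubuntu20.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # device-plugin resources may only be set in limits
```

If `nvidia-smi` prints the GPU, the node setup works; the open question for this issue is whether ws-manager can inject the same resource limit into workspace pods.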

@lucasvaltl I've scheduled this work, as it'll require a code change for Workspace components.

kylos101 avatar Mar 02 '22 16:03 kylos101

@lucasvaltl @corneliusludmann I've removed Team Workspace from this issue, and added to the self-hosted inbox.

kylos101 avatar Mar 03 '22 14:03 kylos101

Assigned @metcalfc for now, as he agreed to set up a cluster for testing, if I am informed correctly. Please assign me to this issue when it's done. :pray:

corneliusludmann avatar Mar 03 '22 15:03 corneliusludmann

@corneliusludmann I put a branch in the EKS guide that sets up a GPU nodegroup. There are a couple of challenges. We don't have a custom AMI with all the NVIDIA support, so I had to use the default AL2 AMI; stick to FUSE there, because shiftfs doesn't seem to work on it. I also had to remove our bootstrap script, so the nodes won't have our labels on them. But it should get things started with GPUs.

metcalfc avatar Mar 04 '22 17:03 metcalfc
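For anyone reproducing this without the branch, a GPU nodegroup in eksctl terms looks roughly like the following. The cluster name, region, and instance type are illustrative, not the values from the branch:

```yaml
# Illustrative eksctl config for a GPU nodegroup; names and sizes are assumptions.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: gitpod-gpu-test    # hypothetical cluster name
  region: us-west-2
managedNodeGroups:
  - name: workspaces-gpu
    instanceType: g4dn.xlarge   # any NVIDIA GPU instance type
    desiredCapacity: 1
    amiFamily: AmazonLinux2     # default AL2 GPU AMI, i.e. no custom AMI and no shiftfs
```

eksctl selects the GPU-enabled variant of the AL2 AMI automatically for GPU instance types, which matches the "no custom AMI" constraint described above.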

It would be quite awesome to have GPU support. I'm looking for a convenient way to spin up a workspace that would allow me to play with cupy, which requires CUDA and thus an NVIDIA GPU. While one can either buy a dedicated machine or configure a dedicated VM for that, it is an expensive and time-consuming investment if you are not planning to use it systematically.

KelSolaar avatar May 29 '22 05:05 KelSolaar

I would like to try this in a self-hosted environment, but I am unable to find any documentation on workspace-templates. How should I go about adding the nvidia.com/gpu entry to the resources.requests list of the workspace pod?

sigurdkb avatar May 31 '22 06:05 sigurdkb
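There is little documentation on this; assuming workspace-templates accepts a partial pod manifest that ws-manager merges into every workspace pod, the entry might look something like the sketch below. The exact template layout is an assumption, but the merged fragment itself is standard Kubernetes:

```yaml
# Hypothetical default workspace template; the surrounding ConfigMap layout is
# an assumption, the resource entry is a standard device-plugin request.
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: workspace
      resources:
        limits:
          nvidia.com/gpu: 1   # device-plugin resources go in limits
```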

> I would like to try this in a self-hosted environment, but I am unable to find any documentation on workspace-templates. How should I go about adding the nvidia.com/gpu entry to the resources.requests list of the workspace pod?

Any thoughts on this?

sigurdkb avatar Jun 22 '22 13:06 sigurdkb

:wave: @sigurdkb , we plan to make GPU support available in the SaaS, but, as you can see, there are still some open questions and we don't have an estimate yet (as to which quarter we can release it).

Would you be interested in using GPUs via our SaaS offering? If yes, please reach out to Andre via the Calendly link he shared in this issue.

If you're still interested in using it self-hosted, let @atduarte know. I'm sure he'd be interested in creating a separate issue (similar to the SaaS one I shared above).

kylos101 avatar Jul 05 '22 20:07 kylos101

@KelSolaar Would you be interested in using GPUs via our SaaS offering? If yes, please reach out to Andre via the Calendly link he shared in https://github.com/gitpod-io/gitpod/issues/10650.

kylos101 avatar Jul 05 '22 20:07 kylos101

> 👋 @sigurdkb , we plan to make it available in the saas, but, as you can see, there are still some open questions and we don't have an estimate yet (as to which quarter we can release it).
>
> Would you be interested in using GPU via our saas offering? If yes, please reach out to Andre via the calendly link he shared in this issue.
>
> If you're still interested in using for self-hosted, let @atduarte know? I'm sure he'd be interested to create a separate issue (similar to the saas one I shared above).

Saas is not a viable option for us. @atduarte, I'm still very interested in getting this to work for self-hosted πŸ‘

sigurdkb avatar Jul 06 '22 05:07 sigurdkb

@sigurdkb we are too :) SaaS comes first as it is the best way we have to learn and experiment ourselves, and then help others bring the same experience to Gitpod Self-Hosted.

It would be very valuable to me to better understand your needs. If you are willing, here’s my calendly link: https://calendly.com/andre-gitpod/15-minute-product-feedback

atduarte avatar Jul 07 '22 08:07 atduarte

We would also be very much interested in such a feature for self-hosted: not GPU support in particular, but device plugin support in general. We use the smarter device manager as a very simple way to allow access to /dev/kvm within our cluster. With this, all we need to do is add the smarter/kvm resource to the pod.

For the GitLab runner we built a custom MutatingAdmissionController that injects the resource based on annotations the runner sets. We could go this path for Gitpod as well (I'm actually already creating a PoC), but due to the lack of the ability to set custom annotations or labels for certain workspace pods it would be an all-or-nothing solution. Therefore, some integration for custom resources (or at least custom annotations or labels) would be appreciated.

janLo avatar Sep 12 '22 17:09 janLo
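The webhook approach described above boils down to returning a JSONPatch from an AdmissionReview. A minimal sketch of the patch-building logic, assuming an annotation-based opt-in; the resource name and the `example.com/inject-device` annotation are made up for illustration:

```python
import base64
import json


def build_device_patch(pod: dict, resource: str = "smarter-devices/kvm",
                       count: str = "1") -> list:
    """Build JSONPatch ops injecting a device-plugin resource into every container.

    `resource` and the opt-in annotation below are illustrative; a real webhook
    would read them from its own configuration.
    """
    ops = []
    annotations = pod.get("metadata", {}).get("annotations", {})
    # Hypothetical opt-in annotation, mirroring what the GitLab runner setup does.
    if annotations.get("example.com/inject-device") != "true":
        return ops
    for i, container in enumerate(pod["spec"]["containers"]):
        if "resources" not in container:
            ops.append({"op": "add", "path": f"/spec/containers/{i}/resources",
                        "value": {}})
        for section in ("requests", "limits"):
            base = f"/spec/containers/{i}/resources/{section}"
            if section not in container.get("resources", {}):
                ops.append({"op": "add", "path": base, "value": {}})
            # JSONPatch escapes '/' inside a key as '~1'.
            ops.append({"op": "add",
                        "path": f"{base}/{resource.replace('/', '~1')}",
                        "value": count})
    return ops


def admission_response(uid: str, patch_ops: list) -> dict:
    """Wrap the patch ops in an AdmissionReview response body."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(patch_ops).encode()).decode(),
        },
    }
```

The all-or-nothing problem is visible here: without per-workspace annotations on the pod, the `if annotations.get(...)` gate has nothing to key on, so the webhook would have to patch every workspace pod or none.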

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Dec 16 '22 06:12 stale[bot]

Any updates on GPU support? Given the models I'd like to run, 8-16GB of RAM is a starting point for inferencing. Thanks!

ccfarah avatar Apr 15 '23 16:04 ccfarah

I'm not entirely sure how this could be integrated into workspace images. However, it appears that support for the NVIDIA Tesla V100 can be pre-built into a Docker image (dead link). Has anyone tried this approach?

bitsnaps avatar May 31 '23 08:05 bitsnaps
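To clarify what "pre-built into a Docker image" can mean: the kernel driver stays on the host node, but the CUDA user-space toolkit can be baked into the workspace image. A rough sketch on a CUDA base image; the tag and installed packages are illustrative:

```dockerfile
# Illustrative only: a Gitpod workspace image on a CUDA base.
# The NVIDIA kernel driver lives on the node; the image carries the CUDA toolkit.
FROM nvidia/cuda:11.8.0-devel-ubuntu22.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        git sudo python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Gitpod workspace images expect a 'gitpod' user with UID 33333.
RUN useradd -l -u 33333 -G sudo -md /home/gitpod -s /bin/bash gitpod
USER gitpod
```

This only helps once the workspace pod is actually scheduled with an `nvidia.com/gpu` resource; the image alone does not grant device access.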

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 16 '23 21:09 stale[bot]

Any news?

julien-blanchon avatar Oct 13 '23 13:10 julien-blanchon

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar May 22 '24 15:05 github-actions[bot]