
[research] Test GPU support in workspaces

Open csweichel opened this issue 3 years ago · 18 comments

Is your feature request related to a problem? Please describe

Kubernetes can make GPUs available in pods. Can those GPUs be used from within a Gitpod workspace?

Describe the behaviour you'd like

GPUs should be possible to use if the underlying cluster supports it.

How

Can we run Gitpod workspaces with GPUs? Try changing the config map in-place for ws-manager and workspace-templates first, before making any actual code changes. The expectation is that if code changes are needed, they'll be thrown away, and not actually merged back to main.

Intended output

Note: this is just a research task. We want to know where we stand today for scheduling workloads that need GPU, and how well Gitpod runs on them.

Questions

  1. Does Gitpod work on a node supporting GPUs? How well does it work?
  2. Can we use all of the cores of the GPU?

csweichel avatar Feb 22 '22 17:02 csweichel
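For context, the standard Kubernetes way to expose a GPU to a pod, once the NVIDIA device plugin DaemonSet is installed on the node, is a resource limit. A minimal sketch (image tag and pod name are illustrative):

```yaml
# Minimal test pod requesting one GPU via the NVIDIA device plugin.
# Assumes the device plugin is already running on a GPU node.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test        # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:11.0.3-base-ubuntu20.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # device-plugin resources may only be set in limits
```

If `nvidia-smi` prints the GPU, the node setup works; the open question for this issue is whether ws-manager can inject the same resource limit into workspace pods.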

@lucasvaltl I've scheduled this work, as it'll require a code change for Workspace components.

kylos101 avatar Mar 02 '22 16:03 kylos101

@lucasvaltl @corneliusludmann I've removed Team Workspace from this issue, and added to the self-hosted inbox.

kylos101 avatar Mar 03 '22 14:03 kylos101

Assigned @metcalfc for now, as he agreed to set up a cluster for testing, if I am informed correctly. Please assign me to this issue when it's done. :pray:

corneliusludmann avatar Mar 03 '22 15:03 corneliusludmann

@corneliusludmann I put a branch in the EKS guide that sets up a GPU nodegroup. There are a couple of challenges. We don't have a custom AMI with all the NVIDIA support, so I had to use the default AL2 AMI; stick to FUSE there, because shiftfs doesn't seem to work on it. I also had to remove our bootstrap script, so the nodes won't have our labels on them. But it should get things started with GPUs.

metcalfc avatar Mar 04 '22 17:03 metcalfc
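For anyone reproducing this without the branch, a GPU nodegroup in eksctl terms looks roughly like the following. The cluster name, region, and instance type are illustrative, not the values from the branch:

```yaml
# Illustrative eksctl config for a GPU nodegroup; names and sizes are assumptions.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: gitpod-gpu-test    # hypothetical cluster name
  region: us-west-2
managedNodeGroups:
  - name: workspaces-gpu
    instanceType: g4dn.xlarge   # any NVIDIA GPU instance type
    desiredCapacity: 1
    amiFamily: AmazonLinux2     # default AL2 GPU AMI, i.e. no custom AMI and no shiftfs
```

eksctl selects the GPU-enabled variant of the AL2 AMI automatically for GPU instance types, which matches the "no custom AMI" constraint described above.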

It would be quite awesome to have GPU support. I'm looking for a convenient way to spin up a workspace that would allow me to play with cupy, which requires CUDA and thus an NVIDIA GPU. While one can either buy a dedicated machine or configure a dedicated VM for that, it is an expensive and time-consuming investment if you are not planning to use it systematically.

KelSolaar avatar May 29 '22 05:05 KelSolaar

I would like to try this in a self-hosted environment, but I am unable to find any documentation on workspace-templates. How should I go about adding the nvidia.com/gpu entry to the resources.requests list of the workspace pod?

sigurdkb avatar May 31 '22 06:05 sigurdkb
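There is little documentation on this; assuming workspace-templates accepts a partial pod manifest that ws-manager merges into every workspace pod, the entry might look something like the sketch below. The exact template layout is an assumption, but the merged fragment itself is standard Kubernetes:

```yaml
# Hypothetical default workspace template; the surrounding ConfigMap layout is
# an assumption, the resource entry is a standard device-plugin request.
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: workspace
      resources:
        limits:
          nvidia.com/gpu: 1   # device-plugin resources go in limits
```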

> I would like to try this in a self-hosted environment, but I am unable to find any documentation on workspace-templates. How should I go about adding the nvidia.com/gpu entry to the resources.requests list of the workspace pod?

Any thoughts on this?

sigurdkb avatar Jun 22 '22 13:06 sigurdkb

:wave: @sigurdkb , we plan to make GPU support available in the SaaS, but, as you can see, there are still some open questions and we don't have an estimate yet (as to which quarter we can release it).

Would you be interested in using GPUs via our SaaS offering? If yes, please reach out to Andre via the Calendly link he shared in this issue.

If you're still interested in using it self-hosted, let @atduarte know. I'm sure he'd be interested in creating a separate issue (similar to the SaaS one I shared above).

kylos101 avatar Jul 05 '22 20:07 kylos101

@KelSolaar Would you be interested in using GPUs via our SaaS offering? If yes, please reach out to Andre via the Calendly link he shared in https://github.com/gitpod-io/gitpod/issues/10650.

kylos101 avatar Jul 05 '22 20:07 kylos101

> 👋 @sigurdkb , we plan to make it available in the saas, but, as you can see, there are still some open questions and we don't have an estimate yet (as to which quarter we can release it).
>
> Would you be interested in using GPU via our saas offering? If yes, please reach out to Andre via the calendly link he shared in this issue.
>
> If you're still interested in using for self-hosted, let @atduarte know? I'm sure he'd be interested to create a separate issue (similar to the saas one I shared above).

Saas is not a viable option for us. @atduarte, I'm still very interested in getting this to work for self-hosted πŸ‘

sigurdkb avatar Jul 06 '22 05:07 sigurdkb

@sigurdkb we are too :) SaaS comes first as it is the best way we have to learn and experiment ourselves, and then help others bring the same experience to Gitpod Self-Hosted.

It would be very valuable to me to better understand your needs. If you are willing, here’s my calendly link: https://calendly.com/andre-gitpod/15-minute-product-feedback

atduarte avatar Jul 07 '22 08:07 atduarte

We would also be very much interested in such a feature for self-hosted: not GPU support in particular, but device plugin support in general. We use the smarter device manager as a very simple way to allow access to /dev/kvm within our cluster. With this, all we need to do is add the smarter/kvm resource to the pod.

For the GitLab runner we built a custom MutatingAdmissionController that injects the resource based on annotations the runner sets. We could go this path for Gitpod as well (I'm actually already creating a PoC), but due to the lack of the ability to set custom annotations or labels for certain workspace pods it would be an all-or-nothing solution. Therefore, some integration for custom resources (or at least custom annotations or labels) would be appreciated.

janLo avatar Sep 12 '22 17:09 janLo
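The webhook approach described above boils down to returning a JSONPatch from an AdmissionReview. A minimal sketch of the patch-building logic, assuming an annotation-based opt-in; the resource name and the `example.com/inject-device` annotation are made up for illustration:

```python
import base64
import json


def build_device_patch(pod: dict, resource: str = "smarter-devices/kvm",
                       count: str = "1") -> list:
    """Build JSONPatch ops injecting a device-plugin resource into every container.

    `resource` and the opt-in annotation below are illustrative; a real webhook
    would read them from its own configuration.
    """
    ops = []
    annotations = pod.get("metadata", {}).get("annotations", {})
    # Hypothetical opt-in annotation, mirroring what the GitLab runner setup does.
    if annotations.get("example.com/inject-device") != "true":
        return ops
    for i, container in enumerate(pod["spec"]["containers"]):
        if "resources" not in container:
            ops.append({"op": "add", "path": f"/spec/containers/{i}/resources",
                        "value": {}})
        for section in ("requests", "limits"):
            base = f"/spec/containers/{i}/resources/{section}"
            if section not in container.get("resources", {}):
                ops.append({"op": "add", "path": base, "value": {}})
            # JSONPatch escapes '/' inside a key as '~1'.
            ops.append({"op": "add",
                        "path": f"{base}/{resource.replace('/', '~1')}",
                        "value": count})
    return ops


def admission_response(uid: str, patch_ops: list) -> dict:
    """Wrap the patch ops in an AdmissionReview response body."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(patch_ops).encode()).decode(),
        },
    }
```

The all-or-nothing problem is visible here: without per-workspace annotations on the pod, the `if annotations.get(...)` gate has nothing to key on, so the webhook would have to patch every workspace pod or none.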

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Dec 16 '22 06:12 stale[bot]

Any updates on GPU support? Given the models I'd like to run, 8-16GB of RAM is a starting point for inferencing. Thanks!

ccfarah avatar Apr 15 '23 16:04 ccfarah

I'm not entirely sure how this could be integrated into workspace images. However, it appears that support for the NVIDIA Tesla V100 can be pre-built into a Docker image (dead link). Has anyone tried this approach?

bitsnaps avatar May 31 '23 08:05 bitsnaps
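To clarify what "pre-built into a Docker image" can mean: the kernel driver stays on the host node, but the CUDA user-space toolkit can be baked into the workspace image. A rough sketch on a CUDA base image; the tag and installed packages are illustrative:

```dockerfile
# Illustrative only: a Gitpod workspace image on a CUDA base.
# The NVIDIA kernel driver lives on the node; the image carries the CUDA toolkit.
FROM nvidia/cuda:11.8.0-devel-ubuntu22.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        git sudo python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Gitpod workspace images expect a 'gitpod' user with UID 33333.
RUN useradd -l -u 33333 -G sudo -md /home/gitpod -s /bin/bash gitpod
USER gitpod
```

This only helps once the workspace pod is actually scheduled with an `nvidia.com/gpu` resource; the image alone does not grant device access.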

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 16 '23 21:09 stale[bot]

Any news?

julien-blanchon avatar Oct 13 '23 13:10 julien-blanchon

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar May 22 '24 15:05 github-actions[bot]