Add support for nvidia-docker or singularity.

Open medcelerate opened this issue 4 years ago • 8 comments

It would be great to be able to either configure a custom ami to be used where different container runtimes can be defined or be able to run custom scripts to install other utilities.

medcelerate avatar Mar 03 '20 04:03 medcelerate

@medcelerate sorry for getting back late. Custom AMIs are not currently supported, though we may add support in the long run. Did you specifically need nvidia-docker or singularity? We can definitely look into more specific options.

SooLee avatar Mar 12 '20 21:03 SooLee

nvidia-docker in this case.

medcelerate avatar Mar 16 '20 12:03 medcelerate

Or run non-dockerized tools

medcelerate avatar Mar 16 '20 12:03 medcelerate

@medcelerate running non-dockerized tools is possible by installing tools on the fly (using the shell option of Tibanna), but that would run inside something like an ubuntu docker container. I'm not sure whether this would work with GPU-specific tools (is that what you're looking for?). I can certainly look into adding nvidia-docker support in the next few days.
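
For reference, a shell-language job looks roughly like the sketch below. The field names follow the shell examples in the Tibanna docs as I remember them, but the buckets, image, and command are placeholders, so check the current documentation for the exact schema.

```bash
# Sketch of a Tibanna "shell" job description; field names follow the
# Tibanna docs, but the buckets, image, and command are placeholders.
cat > shell_job.json <<'EOF'
{
  "args": {
    "language": "shell",
    "container_image": "ubuntu:20.04",
    "command": "apt-get update && apt-get install -y samtools && samtools view -c /data1/shell/in.bam > /data1/shell/count.txt",
    "input_files": {
      "file:///data1/shell/in.bam": "s3://my-input-bucket/in.bam"
    },
    "output_S3_bucket": "my-output-bucket",
    "output_target": {
      "file:///data1/shell/count.txt": "count.txt"
    }
  },
  "config": {
    "instance_type": "t3.medium",
    "ebs_size": 20,
    "log_bucket": "my-tibanna-log-bucket"
  }
}
EOF

# Submit to an existing Tibanna deployment.
tibanna run_workflow --input-json=shell_job.json
```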

SooLee avatar Mar 16 '20 22:03 SooLee

Seems like tibanna was updated to allow for configuring the AMI, at least based on the docs:

https://tibanna.readthedocs.io/en/latest/ami.html?highlight=ami#amazon-machine-image

...I think that in principle, I could 'fork' your current AMI and add the appropriate NVIDIA drivers in order to enable the use of GPUs within tibanna jobs, but I've little experience with any of this and wanted to check in before I burn a week trying to figure it out.

And maybe this issue could also be closed, since it now seems possible to set the AMI. GPU support still seems wanting, though.

nhartwic avatar Mar 12 '24 23:03 nhartwic

This response is somewhat long-winded and not necessarily related to the original issue, but feel free to respond on this one and we can close/create a new one as needed.

Long story short, I think you will find it is not possible to add GPU support to Tibanna without fundamentally changing how it works. Full disclosure, though: my knowledge of GPU integration in the cloud is somewhat limited, but I do believe it is analogous to any other EC2-style instance (AWS recommends https://aws.amazon.com/ec2/instance-types/g5/), meaning whatever you put on it must run on the GPU natively.

So you could add NVIDIA drivers to our AMI, but it wouldn't do you any good, because you cannot attach GPUs to standard EC2 instances (I don't think, at least; the service that seems to implement this is EOL: https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/elastic-graphics.html). You also wouldn't be able to launch the AMI on a GPU instance at all, because of the differing underlying architecture. If you're aware of a way to attach GPUs to AWS cloud instances, then you can disregard what I'm saying; let us know what you think can be done and we may consider it. Otherwise, what follows is, I think, a significant undertaking we just can't justify right now, as standard instances are powerful enough for us.

There may be a path to accomplish this - but it will be very complex. You'd probably need to update the AMI selection code to pick a custom GPU-based AMI and provide your job to Tibanna as a shell command. Right now Tibanna typically uses CWL + Docker to do this; I'm not sure what would be cleanest in the GPU context. But roughly speaking, if you wanted to attempt it, I'd follow the steps below:

  1. Create a GPU compatible AMI
  2. Debug until you can successfully launch into it
  3. Replicate the behavior done by the Tibanna Docker (i.e., job tracking) in this file: https://github.com/4dn-dcic/tibanna/blob/master/awsf3-docker/run.sh
  4. Figure out a way to reasonably pass jobs to it, as I don't think Docker will work in this case. This is probably where you will run into the most problems, since most jobs require specialized software and you don't want to put that into the AMI.

I think 1-3 can be accomplished with some legwork, but 4 will prove quite difficult. This is why Tibanna implements the "Docker within a Docker" model: you can package arbitrary things into your jobs and Tibanna doesn't need to know about them.
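
If anyone attempts steps 1-2, the hook on the Tibanna side would presumably be the job config, since the AMI is now settable there. A minimal sketch, assuming the config key is ami_id (as the AMI docs suggest) and an untested GPU instance type; the command only checks that a job runs on the custom image at all, since the GPU would not be visible inside the job container without further changes to the docker invocation in run.sh.

```bash
# Sketch only: point a job at a hypothetical GPU-enabled AMI.
# The AMI ID, instance type, and buckets are placeholders, and GPU
# instance types are untested with Tibanna.
cat > gpu_ami_test.json <<'EOF'
{
  "args": {
    "language": "shell",
    "container_image": "ubuntu:20.04",
    "command": "uname -r > /data1/shell/kernel.txt",
    "output_S3_bucket": "my-output-bucket",
    "output_target": {
      "file:///data1/shell/kernel.txt": "kernel.txt"
    }
  },
  "config": {
    "ami_id": "ami-0123456789abcdef0",
    "instance_type": "g5.xlarge",
    "ebs_size": 50,
    "log_bucket": "my-tibanna-log-bucket"
  }
}
EOF

tibanna run_workflow --input-json=gpu_ami_test.json
```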

willronchetti avatar Mar 13 '24 13:03 willronchetti

My original plan was to make a GPU-compatible AMI with the relevant tibanna dependencies and run my jobs as shell scripts containing singularity commands, to allow my jobs to access the GPU on the host machine. Looking through your run script and thinking about it more, that clearly won't work.

Instead I'd probably need to fork tibanna and modify the run script so that GPUs are passed (potentially optionally) through the docker calls using something like this, which will further complicate the requirements on the AMI. Getting this to also work with snakemake (which is my preferred method of launching jobs) will likely require significant updates to snakemake as well. At a minimum, I'll need to make a GPU-compatible version of the snakemake docker container, at which point non-containerized snakemake jobs should be able to access the GPU.
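
For concreteness, the kind of run-script change meant here would look roughly like the sketch below. `--gpus all` (Docker) and `--nv` (Singularity) are standard flags, but they require the NVIDIA driver and nvidia-container-toolkit on the host AMI; the image name and the job command variable are hypothetical stand-ins, not what run.sh actually uses.

```bash
# Sketch of the kind of modification to the docker call in run.sh; the image
# name and $JOB_COMMAND are hypothetical stand-ins for what run.sh really does.

GPU_FLAG=""
# Only pass GPUs through when the host actually has them (keeps it optional).
if command -v nvidia-smi >/dev/null 2>&1; then
    GPU_FLAG="--gpus all"
fi

# Docker path: requires nvidia-container-toolkit on the host AMI.
docker run --rm $GPU_FLAG -v /data1:/data1 my-job-image:latest \
    /bin/bash -c "$JOB_COMMAND"

# Singularity equivalent: --nv binds the host NVIDIA libraries into the container.
singularity exec --nv my_job_image.sif /bin/bash -c "$JOB_COMMAND"
```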

IDK, it seems doable to me. Whether it's worth doing personally, I'll have to consider.

nhartwic avatar Mar 13 '24 16:03 nhartwic

I was actually looking at the same article!

Looking into this more, it looks like I did have some misunderstanding of how the GPU instances work. They are in fact standard (x86 or AMD) host machines with GPU attachments, as evidenced by the NVIDIA-driver-compatible Ubuntu AMIs publicly available that will launch on their "GPU" instances (for example ami-0ef3e9355268c4dbc). So this may actually not be such a heavy lift. You may in fact be able to package the NVIDIA drivers onto our existing AMI and launch it directly on the AWS GPU instances. Worth a try, I'd say. The snakemake thing may still be an issue, though.
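
If anyone tries that route, the rough shape of it on an Ubuntu-based image would be something like the sketch below; package names, repository setup, and image tags drift between releases, so treat these as placeholders and follow NVIDIA's current install instructions rather than copying them verbatim.

```bash
# Rough sketch for baking GPU support into an Ubuntu-based AMI; package names
# and versions are examples and change between releases.
sudo apt-get update
sudo apt-get install -y nvidia-driver-535        # host NVIDIA driver (version is an example)

# nvidia-container-toolkit comes from NVIDIA's own apt repository, which has
# to be added first (see NVIDIA's container-toolkit install docs).
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

# Sanity checks after launching the AMI on a GPU instance (e.g. g5.xlarge):
nvidia-smi                                       # driver sees the GPU on the host
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi   # and inside a container
```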

willronchetti avatar Mar 13 '24 17:03 willronchetti