BentoML
feature: How to use multi GPU
Feature request
An example of a model that uses multiple GPUs at the same time would help, or there should be an easy way to set this up.
Motivation
No response
Other
No response
import bentoml

@bentoml.service(resources={"gpu": 4})
class MyService:
    ...
And it will inject CUDA_VISIBLE_DEVICES=0,1,2,3 into the environment
Will it automatically schedule work across all GPUs and use them at the same time?
What automation do you mean? Setting the env var is all it does. Whether all GPUs are actually used depends on how the framework respects CUDA_VISIBLE_DEVICES.
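To make the point about the env var concrete, here is a minimal sketch of what "respecting CUDA_VISIBLE_DEVICES" could look like inside the service process. The helper names `visible_gpu_ids` and `assign_shards` are hypothetical and are not part of BentoML. The sketch only assumes the variable is set as described above, and it relies on the fact that CUDA renumbers the visible devices from 0 inside the process. Real multi-GPU use still has to come from the framework (for example its own tensor or data parallelism).

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES as strings.

    Hypothetical helper: an unset or empty variable is treated here as
    "no GPUs were injected", which is a simplification for this sketch.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def assign_shards(num_shards, gpu_ids):
    """Spread model shards round-robin over the visible devices.

    Inside the process the visible GPUs are addressed by logical index
    (cuda:0 .. cuda:N-1), regardless of their physical ids.
    """
    if not gpu_ids:
        # Nothing visible: fall back to CPU for every shard.
        return {shard: "cpu" for shard in range(num_shards)}
    return {
        shard: f"cuda:{shard % len(gpu_ids)}"
        for shard in range(num_shards)
    }


if __name__ == "__main__":
    ids = visible_gpu_ids()
    print("visible GPUs:", ids)
    print("shard placement:", assign_shards(6, ids))
```

In other words, the decorator only decides which devices the process can see. Placing model parts on `cuda:0` through `cuda:3`, or wrapping the model in the framework's parallelism API, is still up to the service code.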