
The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!

258 BentoML issues, sorted by most recently updated

### Feature request We are using diffusers SDXL + LoRA. We will potentially have lots of LoRA files that we want to manage, and we would like to be able to...

enhancement
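For context, a minimal sketch of the workflow this request refers to, using the public diffusers API; the model ID is the standard SDXL base checkpoint and the LoRA path is a placeholder:

```python
import torch
from diffusers import StableDiffusionXLPipeline

# load the SDXL base pipeline
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# attach a single LoRA; managing many such files is what the request asks about
pipe.load_lora_weights("loras/my_style.safetensors")  # placeholder path

image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("out.png")
```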

### Describe the bug Hello. Currently, the Bento REST API handles inference issues, such as IO errors or GPU out-of-memory errors, by returning a 500 internal server error to the...

bug
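One workaround while this is open: BentoML 1.x API functions can accept a `bentoml.Context` argument and set the response status explicitly. A minimal sketch, assuming we want GPU out-of-memory surfaced as 503 instead of 500; `run_model` is a stand-in for the real runner call:

```python
import numpy as np
import torch
import bentoml
from bentoml.io import NumpyNdarray

svc = bentoml.Service("error_mapping_demo")

def run_model(arr: np.ndarray) -> np.ndarray:
    # stand-in for the real inference call, e.g. runner.run(arr)
    return arr * 2

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(arr: np.ndarray, ctx: bentoml.Context) -> np.ndarray:
    try:
        return run_model(arr)
    except torch.cuda.OutOfMemoryError:
        # map GPU OOM to 503 Service Unavailable rather than a bare 500
        ctx.response.status_code = 503
        return np.array([], dtype=np.float32)
```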

## What does this PR address?

Supports limiting maximum memory usage when pushing models.

![image](https://github.com/bentoml/BentoML/assets/141706136/42c364fb-fc5b-495e-8973-b867244d109e)

```
bentoml push facebook--opt-2.7b-service:905a4b602cda5c501f1b3a2650a4152680238254 --maxmemory 2
```

Test case 1: pushing `bento google--flan-t5-large-service`, model size...

### Describe the bug Hi, I use BentoML; the Bento service is a simple BERT model. I saw there was a bentoml.ray feature and tried it, but got an error:...

bug

Hi. Is BentoML planning to support on-demand loading? My use case is that I have multiple models which can't all be loaded at the same time, so I want to...
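Until something official exists, one common workaround is a custom runnable that defers loading until the first request. A minimal sketch against the BentoML 1.x runner API; the model tag `my_model:latest` and the `bentoml.pytorch` framework are placeholders, and true on-demand serving would also need eviction logic to unload idle models:

```python
import bentoml

class LazyModelRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        self._model = None  # nothing loaded at startup

    @bentoml.Runnable.method(batchable=False)
    def predict(self, x):
        if self._model is None:
            # load lazily on the first call
            self._model = bentoml.pytorch.load_model("my_model:latest")
        return self._model(x)

lazy_runner = bentoml.Runner(LazyModelRunnable, name="lazy_model")
```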

### Describe the bug When using a runner in a service, the service emits OpenTelemetry traces even for requests to URLs included in `excluded_urls` in the config. This is most...

bug
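For reference, the setting in question is the tracing `excluded_urls` option in the BentoML configuration file. Key placement varies across BentoML versions, so this is only a sketch assuming a top-level `tracing` section:

```yaml
# bentoml_configuration.yaml (sketch; exact nesting depends on version)
tracing:
  exporter_type: otlp
  excluded_urls: "readyz,livez,healthz"
```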

### Describe the bug When I retrieve a float number as output, it appears to be inaccurately rounded.

```py
@svc.api(input=NumpyNdarray(), output=NumpyNdarray(dtype=np.float32))
async def predict(arr: np.ndarray) -> float:
    return np.round(25.1234, 2)
```
...

bug
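Worth noting when triaging: with `output=NumpyNdarray(dtype=np.float32)` the result is cast to float32, and 25.12 is not exactly representable in binary floating point, so the extra digits are float32 precision rather than a rounding bug. A standalone demonstration:

```python
import numpy as np

x = np.round(25.1234, 2)  # float64; prints as 25.12
print(repr(x))

y = np.float32(x)         # the cast the IO descriptor performs
print(repr(y))

print(float(y))           # converting back to float64 exposes the
                          # nearest-representable float32 value
```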

### Describe the bug The docker health check also needs root privilege, but `_internal/container/docker.py` just runs

```python
[client, "version", "--format", "{{json .Server.Version}}"]
```

sudo is needed (but I don't know how to...

bug
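To reproduce outside BentoML, the health check boils down to shelling out to the docker CLI; on a host where the daemon socket is root-only (no docker group membership), it fails with a permission error. A minimal sketch:

```python
import subprocess

# roughly the command _internal/container/docker.py issues for its check
result = subprocess.run(
    ["docker", "version", "--format", "{{json .Server.Version}}"],
    capture_output=True,
    text=True,
)

# a non-zero returncode (e.g. permission denied on /var/run/docker.sock)
# means the current user cannot talk to the Docker daemon without sudo
print(result.returncode, result.stdout or result.stderr)
```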

### Describe the bug I recently experienced some dtype mismatch errors when using model.run() with numpy.float16 input while the PyTorch model's dtype is torch.float16. After inspection, I found that the...

bug
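A standalone reproduction of this class of error, outside BentoML; any float16 PyTorch model fed a float32 tensor will do:

```python
import numpy as np
import torch

# float16 weights; older CPU builds of torch lack float16 matmul,
# so prefer a GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(4, 2).half().to(device)

x32 = torch.from_numpy(np.random.rand(1, 4).astype(np.float32)).to(device)
try:
    model(x32)  # float32 input vs float16 weights: dtype mismatch
except RuntimeError as e:
    print(e)

x16 = x32.half()          # casting the input to the model's dtype fixes it
print(model(x16).dtype)   # torch.float16
```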

Always prepend the current bentofile directory to the system path to avoid unwanted behavior when other bentofiles are on the system PATH. This is especially evident when trying to use the bentoml cli...
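A sketch of the general technique being proposed, in plain Python; the function name and path resolution here are illustrative, not the actual BentoML implementation:

```python
import os
import sys

def prepend_build_dir(bentofile_path: str) -> None:
    """Put the bentofile's directory first on sys.path so its modules
    shadow same-named modules from other build contexts."""
    build_dir = os.path.dirname(os.path.abspath(bentofile_path))
    if build_dir in sys.path:
        sys.path.remove(build_dir)  # avoid duplicates before re-inserting
    sys.path.insert(0, build_dir)

prepend_build_dir("bentofile.yaml")  # hypothetical usage
```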