BentoML
triton: support loading/unloading models on demand
Feature request
Support setting MODEL_CONTROL_MODE to explicit, so that a given model can be loaded and unloaded on demand.
We should also provide the ability to tear down a model after a period of time, with that period set via configuration.
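
To make the request more concrete, here is a minimal sketch of the behavior described above: load a model the first time it is needed, and tear it down once a configured period has passed since its last use. Everything in it is hypothetical and not an existing BentoML API: the class name `OnDemandModelManager`, the `idle_timeout_s` setting, and the injected `load`/`unload` callables. It also interprets "a period of time" as time since last use, which is an assumption.

```python
import time
from typing import Callable, Dict, List


class OnDemandModelManager:
    """Hypothetical helper: load models on first use, unload them when idle."""

    def __init__(
        self,
        load: Callable[[str], None],
        unload: Callable[[str], None],
        idle_timeout_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._unload = unload
        self._idle_timeout_s = idle_timeout_s  # assumed to come from configuration
        self._clock = clock
        self._last_used: Dict[str, float] = {}  # model name -> last use timestamp

    def ensure_loaded(self, name: str) -> None:
        """Load the model if it is not tracked yet, then refresh its timestamp."""
        if name not in self._last_used:
            self._load(name)
        self._last_used[name] = self._clock()

    def reap_idle(self) -> List[str]:
        """Unload every model idle for at least idle_timeout_s; return their names."""
        now = self._clock()
        expired = [
            name
            for name, last in self._last_used.items()
            if now - last >= self._idle_timeout_s
        ]
        for name in expired:
            self._unload(name)
            del self._last_used[name]
        return expired
```

With a Triton server started with `--model-control-mode=explicit`, the two callables could plausibly be the `load_model` and `unload_model` methods of `tritonclient.http.InferenceServerClient`, with `ensure_loaded` called before each inference and `reap_idle` called periodically. The sketch is not thread-safe; a real implementation would need locking around the bookkeeping.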
Motivation
No response
Other
No response