Chaoyu
Support using gRPC instead of the HTTP API for sending prediction requests. When a model server is deployed as a backend service, many teams prefer gRPC over HTTP. See...
**Is your feature request related to a problem? Please describe.**
Currently, Yatai provides a few ways to create and manage deployments:
* Web UI (requires logging in with a Yatai account)
* ...
- [ ] `max-latency` & `timeout`
- [x] api server timeout
- [x] provide both max-latency and timeout in BentoServer config
- [x] default `max-latency`: `10s`
- [ ] default...
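The checklist above implies a server-level configuration surface for these values. A minimal sketch of what such a BentoServer config might look like; the key names and layout here are assumptions for illustration, not BentoML's actual schema:

```yaml
# Hypothetical BentoServer config sketch -- key names are assumptions.
api_server:
  timeout: 60s        # hard cap on total request handling time
  max_latency: 10s    # micro-batching latency budget (default 10s per the checklist above)
```

The design question the checklist raises is that the two values serve different purposes: `timeout` bounds how long a client waits, while `max-latency` bounds how long a request may sit in a batching queue, so they need independent defaults.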
Adding support for models trained with PyTorch Ignite in BentoML:
* sample notebook showing how the integration could work
* verify that the current `bentoml.pytorch` module can adapt to...
The Bento build process requires loading the `bentoml.Service` object to validate the service definition and retrieve the required models to package. Currently, running `bentoml build` requires all Service dependencies to be installed,...
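One reason loading the `bentoml.Service` module pulls in every dependency is that framework imports typically sit at module top level, so merely importing the service file executes them. A stdlib-only sketch of the deferred-import pattern that avoids this at load time (`heavy_framework` is a hypothetical package, and this illustrates the general technique, not BentoML's planned fix):

```python
# Sketch: the module stays importable even when a dependency is
# missing, because the import is deferred into the function body.
def predict(data):
    import heavy_framework  # hypothetical; resolved only at call time
    return heavy_framework.run(data)

# The function object exists without heavy_framework installed --
# only actually calling predict() would raise ModuleNotFoundError.
print(callable(predict))  # → True
```

Under this pattern, a build tool could load the module to read the service definition without the inference dependencies present, at the cost of import errors surfacing only at request time.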
Support importing models saved in the Neuropod format: https://github.com/uber/neuropod
**Is your feature request related to a problem? Please describe.**
Currently, users have little visibility into how BentoML's micro-batching works. Having...
**Is your feature request related to a problem? Please describe.**
Streaming serving is a common pattern for deploying models, where the ML model is applied to streaming data. **Describe...