A long model inference request could time out the current runner request. The runner timeout is currently hardcoded at 5 minutes; we should make the runner timeout value configurable. SageMaker deployment through bentoctl...
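A minimal sketch of what a configurable timeout could look like, assuming a hypothetical `BENTOML_RUNNER_TIMEOUT` environment variable (not an existing BentoML setting); in practice this would more likely live in the BentoML configuration file.

```python
import os

# Current behavior keeps the timeout hardcoded at 5 minutes (300 seconds);
# this sketch falls back to that value when no override is provided.
DEFAULT_RUNNER_TIMEOUT_SECONDS = 300


def get_runner_timeout() -> int:
    """Return the runner timeout in seconds, overridable via an env var."""
    # BENTOML_RUNNER_TIMEOUT is an assumed name used for illustration only.
    return int(os.environ.get("BENTOML_RUNNER_TIMEOUT", DEFAULT_RUNNER_TIMEOUT_SECONDS))
```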
Benchmark API server and runner performance against various model and input sizes.
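A simple starting point for such a benchmark could look like the sketch below: send requests of increasing batch size to a locally served endpoint and record latency. The URL and payload format are assumptions for illustration; adjust them to the actual service under test.

```python
import time

import requests  # assumes the `requests` package is installed

# Assumed local BentoML endpoint; replace with the service being benchmarked.
URL = "http://127.0.0.1:3000/predict"

for size in (1, 10, 100, 1000):
    payload = [[0.0] * 4] * size  # batch of `size` dummy feature rows
    start = time.perf_counter()
    resp = requests.post(URL, json=payload, timeout=60)
    elapsed = time.perf_counter() - start
    print(f"batch={size:5d} status={resp.status_code} latency={elapsed * 1000:.1f} ms")
```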
https://github.com/readthedocs/sphinx_rtd_theme/issues/761
There is increasing demand from the community for adding custom metrics to the API service. BentoML supports basic service-level metrics out of the box, including request duration, in-progress requests, and request count, using...
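An illustrative sketch of what user-defined metrics might look like using the Prometheus client library directly; the metric names, labels, and the `preprocess`/`run_model` helpers are made up for the example and are not part of BentoML's built-in metric set.

```python
from prometheus_client import Counter, Histogram

# Custom, user-defined metrics alongside the built-in service-level ones.
INFERENCE_ERRORS = Counter(
    "my_service_inference_errors_total",
    "Number of failed inference calls",
    ["model_name"],
)
PREPROCESS_SECONDS = Histogram(
    "my_service_preprocess_seconds",
    "Time spent in input preprocessing",
)


def predict(model_name: str, raw_input):
    with PREPROCESS_SECONDS.time():
        features = preprocess(raw_input)  # hypothetical helper
    try:
        return run_model(model_name, features)  # hypothetical helper
    except Exception:
        INFERENCE_ERRORS.labels(model_name=model_name).inc()
        raise
```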
`OMP_NUM_THREADS` must be set before numpy is imported for it to take effect; our current implementation doesn't guarantee that.
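A minimal illustration of the ordering constraint: the environment variable has to be exported before the first `import numpy`, e.g. at the very top of the entry-point module.

```python
import os

# OMP_NUM_THREADS only takes effect if it is set before numpy (and its BLAS
# backend) is first imported, so it must come before the import below.
os.environ.setdefault("OMP_NUM_THREADS", "1")

import numpy as np  # noqa: E402  (import intentionally placed after the env var)

print(np.__version__)
```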
### Feature request
The default scheduling strategy schedules as many instances of each runner (with `nvidia.com/gpu` support) as there are available GPUs. If multiple types of runners are present...
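A hypothetical sketch of splitting the available GPUs across several runner types instead of giving every runner one instance per GPU; the runner names and the round-robin policy are assumptions for illustration only, not BentoML's scheduling API.

```python
from typing import Dict, List


def assign_gpus(runner_names: List[str], gpu_count: int) -> Dict[str, List[int]]:
    """Round-robin the GPU device indices over the runner types."""
    assignment: Dict[str, List[int]] = {name: [] for name in runner_names}
    for gpu_index in range(gpu_count):
        runner = runner_names[gpu_index % len(runner_names)]
        assignment[runner].append(gpu_index)
    return assignment


# e.g. two runner types sharing four GPUs:
# {'encoder': [0, 2], 'classifier': [1, 3]}
print(assign_gpus(["encoder", "classifier"], 4))
```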
### Feature request
External modules are currently not pickled with the model by default.
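One way this could work is cloudpickle's pickle-by-value registration (available in cloudpickle 2.0+), sketched below; `my_feature_lib` and its `preprocess` function are placeholder names for a project-local module that would otherwise only be referenced by name in the pickle.

```python
import cloudpickle

import my_feature_lib  # hypothetical project-local module, used for illustration

# cloudpickle normally serializes objects from importable modules by reference
# (module path + name); registering the module by value embeds its code in the
# pickle so the consumer does not need the module installed.
cloudpickle.register_pickle_by_value(my_feature_lib)

blob = cloudpickle.dumps(my_feature_lib.preprocess)  # hypothetical function
```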
Relevant discussions in https://github.com/bentoml/BentoML/issues/666