
A flexible, high-performance serving system for machine learning models

219 serving issues

I wrote a patch that achieves what I want in #1959. Closes #1959.

Using the tf.io ops in the tf.serving ecosystem would be a major development convenience and would likely decrease inference latency. Could there be an official Docker build, or documentation on how to integrate...

type:feature
needs prio
custom-ops
stale

I discovered a performance issue: TensorFlow Serving exhibits an unexplained and significant network delay at tail latencies under higher traffic loads. My setup was a client and...

type:feature
stat:awaiting tensorflower

## Bug Report ### System information - **OS Platform and Distribution (e.g., Linux Ubuntu 16.04)**: Ubuntu 20.04 / Kubernetes on AWS - **TensorFlow Serving installed from (source or binary)**: binary...

stat:awaiting tensorflower
type:bug

## Bug Report If this is a bug report, please fill out the following form in full: ### System information - **OS Platform and Distribution**: Linux Ubuntu 16.04 - **TensorFlow...

stat:contributions welcome
stat:awaiting tensorflower
type:bug

The half (float16) type is widely used in deep-learning inference, but TensorFlow Serving does not support it in the RESTful API. I have submitted a PR to solve this problem; please review.

cla: yes

Is there a way to cap the resources (e.g. CPU cores, CUDA MPS threads) assigned to each model in a multi-model TensorFlow server? The only (straightforward) way...

type:feature
stat:awaiting tensorflower
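In the absence of a built-in per-model resource cap, a common workaround is to run one serving container per model and apply the limits at the container level. The sketch below assumes hypothetical model names and host paths; the Docker flags and the official tensorflow/serving image's MODEL_NAME convention are real.

```shell
# Workaround sketch: cap CPU and memory per model by running one
# tensorflow/serving container per model with Docker resource limits.
# "modelA" and /models/modelA are illustrative assumptions.
docker run -d --name tfserve-modelA \
  --cpus="2.0" --memory="4g" \
  -p 8501:8501 \
  -v /models/modelA:/models/modelA \
  -e MODEL_NAME=modelA \
  tensorflow/serving
```

This trades the convenience of a single multi-model server for hard isolation, which the issue notes is the only straightforward option today.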

TF Serving should terminate gracefully when SIGTERM is received. This is especially important for Docker/Kubernetes use cases, where a process being terminated gracefully versus being killed has very...

type:feature
stat:awaiting tensorflower
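The shutdown sequence the issue is concerned with can be sketched as follows. `docker stop` sends SIGTERM and then SIGKILL after a grace period; if the server is launched from a wrapper script, the signal only reaches it if the script `exec`s the binary. The container name and model paths here are illustrative assumptions.

```shell
# docker stop sends SIGTERM, waits for the grace period, then SIGKILL.
docker stop --time=30 tfserving_container   # 30 s grace period

# In an entrypoint script, use exec so tensorflow_model_server replaces
# the shell as PID 1 and actually receives SIGTERM:
exec tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=mymodel \
  --model_base_path=/models/mymodel
```

Kubernetes behaves analogously via `terminationGracePeriodSeconds` on the pod spec.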

don't submit. test ci

REST API call binding to 127.0.0.1 (localhost)
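When running under Docker, one way to restrict the REST endpoint to the local machine is to publish the container port only on the loopback interface, rather than relying on a server-side bind flag. This is a hedged workaround sketch; the model name is an illustrative assumption.

```shell
# Publish port 8501 on 127.0.0.1 only, so the REST API is reachable
# from the host but not from other machines.
docker run -d \
  -p 127.0.0.1:8501:8501 \
  -v /models/mymodel:/models/mymodel \
  -e MODEL_NAME=mymodel \
  tensorflow/serving
# The REST API is then served at http://127.0.0.1:8501/v1/models/mymodel
```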