Export `:tensorflow:serving:...` metrics by signature names

jeongukjae opened this issue 3 years ago • 4 comments

Feature Request

If this is a feature request, please fill out the following form in full:

Describe the problem the feature is intended to solve

Currently, TensorFlow Serving exports metrics per model, as shown below:

...
:tensorflow:serving:request_count{model_name="test_model",status="OK"} 6
...
:tensorflow:serving:request_latency_bucket{model_name="test_model",API="predict",entrypoint="REST",le="10"} 0
:tensorflow:serving:request_latency_bucket{model_name="test_model",API="predict",entrypoint="REST",le="18"} 0
...
:tensorflow:serving:runtime_latency_bucket{model_name="test_model",API="Predict",runtime="TF1",le="10"} 0
:tensorflow:serving:runtime_latency_bucket{model_name="test_model",API="Predict",runtime="TF1",le="18"} 0
:tensorflow:serving:runtime_latency_bucket{model_name="test_model",API="Predict",runtime="TF1",le="32.4"} 0
...

We cannot collect metrics broken down by signature, even when the latencies of different signatures vary widely.

Related code (see the sketch after this list):

  • https://github.com/tensorflow/serving/blob/21360c763767823b82768ce42c5c90c0c9012601/tensorflow_serving/servables/tensorflow/util.h#L118-L119
  • https://github.com/tensorflow/serving/blob/21360c763767823b82768ce42c5c90c0c9012601/tensorflow_serving/servables/tensorflow/util.h#L122-L123
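
For reference, here is a rough, hand-written sketch of the two helpers linked above, together with hypothetical signature-aware overloads that illustrate the request. The parameter names and types are approximate (paraphrased, not copied from the header), and the overloads are only an illustration of the idea, not an actual patch.

#include <cstdint>
#include <string>

// Existing helpers (paraphrased): latency is labeled only by model name
// plus api/entrypoint/runtime, so per-signature breakdowns are not possible.
void RecordRequestLatency(const std::string& model_name, const std::string& api,
                          const std::string& entrypoint, int64_t latency_usec);
void RecordRuntimeLatency(const std::string& model_name, const std::string& api,
                          const std::string& runtime, int64_t latency_usec);

// Hypothetical signature-aware overloads: thread the signature name through
// so it can be emitted as an additional label on the same histograms.
void RecordRequestLatency(const std::string& model_name,
                          const std::string& signature_name,
                          const std::string& api, const std::string& entrypoint,
                          int64_t latency_usec);
void RecordRuntimeLatency(const std::string& model_name,
                          const std::string& signature_name,
                          const std::string& api, const std::string& runtime,
                          int64_t latency_usec);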

Describe the solution

It would be better if runtime latency and request latency were recorded with signature names.
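
For example, the exported series could carry a signature_name label next to model_name. This is hypothetical output: the label name and the example signature "serving_default" are placeholders, not the actual implementation.

:tensorflow:serving:request_latency_bucket{model_name="test_model",signature_name="serving_default",API="predict",entrypoint="REST",le="10"} 0
:tensorflow:serving:runtime_latency_bucket{model_name="test_model",signature_name="serving_default",API="Predict",runtime="TF1",le="10"} 0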

Describe alternatives you've considered

Additional context

jeongukjae avatar Jan 03 '22 06:01 jeongukjae

@jeongukjae,

Are you still looking for a resolution? We are planning to prioritise issues based on community interest. Please let us know if this issue still persists with the latest TF Serving 1.12.1 release so that we can work on fixing it. Thank you for your contributions.

singhniraj08 avatar Jun 08 '23 08:06 singhniraj08

@singhniraj08 I wrote a PR for this issue: https://github.com/tensorflow/serving/pull/2152. I think those patches are enough to address it. Could you review them?

jeongukjae avatar Jun 16 '23 01:06 jeongukjae

@jeongukjae, Thank you for your contributions. We will discuss this internally and update this thread.

singhniraj08 avatar Jun 16 '23 05:06 singhniraj08

@singhniraj08 Thank you.

I also filed another issue similar to this one: #2157. Could you discuss that issue internally as well?

jeongukjae avatar Jun 22 '23 04:06 jeongukjae