Metrics API returns empty response until TS process serves a prediction
Context
When the TorchServe process initially starts, the metrics API endpoints return an empty response. While this is a niche case (most likely TS would have served at least one prediction before the user calls the metrics APIs), it makes the API appear to be broken.
- torchserve version: Installed from source on latest master
- torch version: 1.6.0
- torchvision version: 0.7.0
- java version: openjdk-11
- Operating System and version: Ubuntu 18.04
Your Environment
- Installed using source? [yes/no]: Yes
- Are you planning to deploy it using docker container? [yes/no]: N/A
- Is it a CPU or GPU environment?: CPU
- Using a default/custom handler? [If possible upload/share custom handler/model]: No
- What kind of model is it e.g. vision, text, audio?: N/A
- Are you planning to use local models from model-store or public url being used e.g. from S3 bucket etc.? [If public url then provide link.]: N/A
- Provide config.properties, logs [ts.log] and parameters used for model registration/update APIs: Did not use a config.properties (so just the default config)
- Link to your project [if any]: N/A
Expected Behavior
The /metrics endpoint should return the list of metrics whether or not a prediction has been made.
The /metrics?name endpoint should return 0s or indicate that no requests have been logged.
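For reference, a populated response uses the Prometheus text exposition format and looks roughly like this (the values and HELP text below are illustrative):

# HELP ts_inference_latency_microseconds Cumulative inference duration in microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{model_name="densenet161",model_version="default",} 1990.348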
Current Behavior
Both endpoints return an empty response.
Possible Solution
Steps to Reproduce
How to reproduce:
- Start a new TorchServe process.
- Run curl http://127.0.0.1:8082/metrics and verify that there is no response.
- Make a call to a prediction endpoint, e.g. curl http://127.0.0.1:8080/predictions/densenet161 -T kitten.jpg
- Run curl http://127.0.0.1:8082/metrics again and verify that the expected response (the list of metrics) is returned.
The same behavior occurs for curl "http://127.0.0.1:8082/metrics?name[]=ts_inference_latency_microseconds&name[]=ts_queue_latency_microseconds" --globoff (--globoff is needed so curl does not interpret the [] brackets in the URL).
Failure Logs [if any]
Note that the metrics API doesn't return results until the first prediction is served. metrics_api_success.txt
This should be taken up as part of converting Prometheus integration into a plugin (#611).
@maheshambule Could you please validate this?
@harshbafna A simple fix that returns a "No data" message in the body should be fine. At present it is confusing for users when they get a 200 OK response with nothing in the metrics on an initial install and conclude that the metrics endpoint is broken. Adding this message is independent of the Prometheus integration.
@harshbafna Looking at the Prometheus Java client code, it seems a metric is initialized during register only if it doesn't have a label.
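The Python prometheus_client shows the same lazy initialization, so here is a minimal sketch of the behavior (metric names below are made up for illustration): a metric declared with label names gets no child, and therefore no sample line, until .labels(...) is first called.

from prometheus_client import CollectorRegistry, Counter, generate_latest

registry = CollectorRegistry()

# No label names: a child is created at registration time,
# so the metric is exposed immediately with value 0.
plain = Counter("demo_requests", "unlabeled demo counter", registry=registry)

# With label names: no child exists until .labels(...) is called,
# so no sample line is exposed for this metric yet.
labeled = Counter("demo_inference", "labeled demo counter",
                  ["model_name"], registry=registry)

print(generate_latest(registry).decode())
# demo_requests_total 0.0 appears; demo_inference_total has
# HELP/TYPE lines but no samples.

labeled.labels("mnist").inc()
print(generate_latest(registry).decode())
# demo_inference_total{model_name="mnist"} 1.0 now appears.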
I currently still have a similar problem: even after model inference, curl http://127.0.0.1:8082/metrics still returns an empty result.
@pengxin233 did you solve this problem? I am experiencing this issue.
print(requests.get("http://127.0.0.1:8082/metrics?").content)
returns b''
even though I have already run inference.
I do see some metrics in the pod's log:
[INFO ] W-9000-mnist_1.0 TS_METRICS - ts_queue_latency_microseconds.Microseconds:98.194|#model_name:mnist,model_version:default
I am using the MNIST sample. The config in that container is:
$ cat /mnt/models/config/config.properties
inference_address=http://0.0.0.0:8085
management_address=http://0.0.0.0:8085
metrics_address=http://0.0.0.0:8082
grpc_inference_port=7070
grpc_management_port=7071
enable_metrics_api=true
metrics_format=prometheus
number_of_netty_threads=4
job_queue_size=10
enable_envvars_config=true
install_py_dep_per_model=true
model_store=/mnt/models/model-store
model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"mnist":{"1.0":{"defaultVersion":true,"marName":"mnist.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":10,"responseTimeout":120}}}}
I also see some metrics in /home/model-server/logs/ts_metrics.log
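One quick sanity check (an ad-hoc snippet, not part of TorchServe) is to look at the HTTP status code together with the body, which separates this issue (an empty 200 response) from a misconfigured metrics_address or port:

import requests

# Query the metrics endpoint from the config above (metrics_address, port 8082).
response = requests.get("http://127.0.0.1:8082/metrics", timeout=5)
print(response.status_code, repr(response.content))
# A 200 status with an empty body reproduces the behavior in this issue;
# a ConnectionError instead suggests a wrong metrics_address/port
# in config.properties.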