nv_inference_request_failure metric does not increase
Description
When I catch an error in a BLS model and return an InferenceResponse carrying that error, the nv_inference_request_failure metric does not increase. Instead, the nv_inference_request_success metric increases. Does nv_inference_request_failure only increase when an exception is raised?
Triton Information
nvcr.io/nvidia/tritonserver:24.08-py3
To Reproduce
import json

import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        self.model_config = json.loads(args["model_config"])
        self.model_name = str(args["model_name"])
        self.model_version = str(args["model_version"])

    def execute(self, requests):
        responses = []
        for request in requests:
            try:
                # Simulate a failure inside the model.
                raise ValueError("BACKEND ERROR!")
            except Exception as e:
                # Return the error in the response instead of re-raising it.
                inference_response = pb_utils.InferenceResponse(error=pb_utils.TritonError(str(e)))
                responses.append(inference_response)
        return responses
The client receives an error with status code 500, but the metric that increases is nv_inference_request_success, not nv_inference_request_failure.
# HELP nv_inference_request_success Number of successful inference requests, all batch sizes
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="test_model_bls",version="1"} 1
# HELP nv_inference_request_failure Number of failed inference requests, all batch sizes
# TYPE nv_inference_request_failure counter
nv_inference_request_failure{model="test_model_bls",reason="BACKEND",version="1"} 0
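To make the comparison easier to repeat, here is a minimal sketch of how the two counters could be read out of the metrics text. The helper `counter_total` and the `SAMPLE` string are hypothetical names introduced for illustration; the parsing assumes the simple Prometheus text layout shown above (no escaped quotes in label values, no timestamps).

```python
import re

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def counter_total(metrics_text, name, **labels):
    """Sum all samples of `name` whose labels include the given key/value pairs."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_LINE.match(line)
        if match is None or match.group("name") != name:
            continue
        found = dict(_LABEL.findall(match.group("labels") or ""))
        if all(found.get(key) == value for key, value in labels.items()):
            total += float(match.group("value"))
    return total


# Metrics text copied from this report.
SAMPLE = """\
# HELP nv_inference_request_success Number of successful inference requests, all batch sizes
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="test_model_bls",version="1"} 1
# HELP nv_inference_request_failure Number of failed inference requests, all batch sizes
# TYPE nv_inference_request_failure counter
nv_inference_request_failure{model="test_model_bls",reason="BACKEND",version="1"} 0
"""

success = counter_total(SAMPLE, "nv_inference_request_success", model="test_model_bls")
failure = counter_total(SAMPLE, "nv_inference_request_failure", model="test_model_bls")
print(f"success={success} failure={failure}")
```

Running the same helper on the metrics text fetched before and after a single failing request (by default Triton serves it on port 8002 at /metrics) should show which counter moved.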
Expected behavior
When I return an InferenceResponse with an error, the nv_inference_request_failure metric should increase.
Thank you in advance.