
GRPC infer returns null in outputs contents

Open aohorodnyk opened this issue 1 year ago • 2 comments

Description

Hello,

I'm trying to set up Triton Server for my models. So far everything has worked well. My model uses TF2; it loads and responds to my requests as expected.

I use this Docker image to run the models: nvcr.io/nvidia/tritonserver:24.03-tf2-python-py3

The problem is that the output contents field is populated only when I use the HTTP interface. When I use gRPC, contents is always null.

My request:

{
  "id": "my-random-id",
  "model_name": "my_model",
  "model_version": "1",
  "inputs": [
    {
      "name": "my_input",
      "shape": [
        1,
        63
      ],
      "datatype": "FP32",
      "contents": {
        "fp32_contents": [0, 45, 0, 0, 11, 0, 22, 33, 44, 0, 0, 63, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 6, 3, 1, 8, 23, 11, 23, 44, 43, 2, 25, 63, 34, 15, 12]
      }
    }
  ]
}

And this is the gRPC response:

{
    "outputs": [
        {
            "shape": [
                "1",
                "8"
            ],
            "parameters": {},
            "name": "my_output",
            "datatype": "FP32",
            "contents": null
        }
    ],
    "raw_output_contents": ["..."],
    "parameters": {},
    "model_name": "my_model",
    "model_version": "1",
    "id": "my-random-id"
}
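
Interestingly, the bytes do appear to be present in raw_output_contents. As a workaround I can decode them manually. Here is a minimal Go sketch, assuming standard little-endian FP32 encoding (decodeFP32 is my own helper, not a Triton API):

package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// decodeFP32 converts little-endian FP32 bytes, as found in
// raw_output_contents, into a slice of float32 values.
func decodeFP32(raw []byte) []float32 {
	out := make([]float32, len(raw)/4)
	for i := range out {
		bits := binary.LittleEndian.Uint32(raw[i*4 : i*4+4])
		out[i] = math.Float32frombits(bits)
	}
	return out
}

func main() {
	raw := []byte{0x00, 0x00, 0x80, 0x3f} // 1.0 encoded as little-endian FP32
	fmt.Println(decodeFP32(raw))          // prints [1]
}

But I would expect contents to be populated so that this manual decoding is not necessary.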

But the HTTP interface returns the expected response in data:

"data": [
  0.[...],
  0.[...],
  0.[...],
  0.[...],
  0.[...],
  0.[...],
  0.[...],
  0.[...],
]
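
For reference, this is roughly how I call the HTTP endpoint from Go (a minimal sketch; /v2/models/my_model/versions/1/infer is the standard KServe v2 route, and the payload mirrors the request above with placeholder values):

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Placeholder input; the real request carries the 63 feature
	// values shown above.
	data := make([]float32, 63)

	payload := map[string]any{
		"id": "my-random-id",
		"inputs": []map[string]any{{
			"name":     "my_input",
			"shape":    []int{1, 63},
			"datatype": "FP32",
			"data":     data, // HTTP uses "data", not "contents"
		}},
	}
	body, _ := json.Marshal(payload)

	resp, err := http.Post(
		"http://localhost:8000/v2/models/my_model/versions/1/infer",
		"application/json",
		bytes.NewReader(body),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out)) // "data" in outputs holds the FP32 values
}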

I've checked the /v2/models/my_model/versions/1/stats endpoint, where I can clearly see that every gRPC infer request increases the success count for my model.

It looks like the issue is somewhere in the gRPC interface.

For the protocol, I use this gRPC interface definition: https://github.com/triton-inference-server/common/tree/main/protobuf
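
This is roughly how my Go client builds the request from the stubs generated from that proto (a minimal sketch; the import path example.com/tritonpb is a placeholder for my generated package):

package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// Placeholder import path for the Go stubs generated from
	// grpc_service.proto in triton-inference-server/common.
	pb "example.com/tritonpb"
)

func main() {
	// 8001 is Triton's default gRPC port.
	conn, err := grpc.Dial("localhost:8001",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	client := pb.NewGRPCInferenceServiceClient(conn)

	req := &pb.ModelInferRequest{
		Id:           "my-random-id",
		ModelName:    "my_model",
		ModelVersion: "1",
		Inputs: []*pb.ModelInferRequest_InferInputTensor{{
			Name:     "my_input",
			Datatype: "FP32",
			Shape:    []int64{1, 63},
			Contents: &pb.InferTensorContents{
				// Placeholder; the real 63 feature values go here.
				Fp32Contents: make([]float32, 63),
			},
		}},
	}

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	resp, err := client.ModelInfer(ctx, req)
	if err != nil {
		log.Fatal(err)
	}
	// Outputs[i].Contents is null; data only arrives in RawOutputContents.
	log.Printf("outputs: %v, raw blobs: %d", resp.Outputs, len(resp.RawOutputContents))
}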

Could you please help me figure out how to fix this issue so that the expected response is returned from my model(s)?

Triton Information

I use this Docker image: nvcr.io/nvidia/tritonserver:24.03-tf2-python-py3

To Reproduce

Config:

max_batch_size: 10
platform: "tensorflow_savedmodel"
dynamic_batching {}
input [
  {
    name: "my_input",
    data_type: TYPE_FP32,
    dims: [ 63 ]
  }
]
output [
  {
    name: "my_output",
    data_type: TYPE_FP32,
    dims: [ 8 ]
  }
]
instance_group [
  {
    count: 1,
    kind: KIND_CPU
  }
]

Expected behavior

The "contents": null part of the gRPC response should contain something like:

contents: {
  fp32_contents: [
    0.[...],
    0.[...],
    0.[...],
    0.[...],
    0.[...],
    0.[...],
    0.[...],
    0.[...],
  ]
}

aohorodnyk avatar May 07 '24 18:05 aohorodnyk

Hi @aohorodnyk, could you please share the command that you run against the gRPC interface? Also, a minimal reproducer would be really helpful for us to investigate this issue.

krishung5 avatar May 09 '24 22:05 krishung5

@krishung5 In my code I use Golang, but the issue is also reproducible through Postman's gRPC integration. The JSON I provided above is the actual request I send from Postman.

The Golang code is more involved, but the result is the same, so I don't see a reason to write a code sample for this when a simple GUI tool reproduces it.

Could you please describe how I can help you reproduce the issue?

Thank you!

aohorodnyk avatar May 09 '24 23:05 aohorodnyk