
"TYPE_STRING" - Input_buffer Showing '\0\0\0\0' null terminators as delimiters

Open KCHemaPrasanna opened this issue 2 years ago • 4 comments

Description I am trying to analyse input_buffer (JSON input) to understand how a "TYPE_STRING" inference request is actually converted into byte arrays by Triton.

Triton Information What version of Triton are you using? 23.01

Code:

This is how my input_buffer is defined, as a char pointer.

const char* input_buffer;
  size_t input_buffer_byte_size;
  for(size_t i = 0; i < num_inputs; i++){
    .......
    TRITONSERVER_MemoryType input_buffer_memory_type;
    int64_t input_buffer_memory_type_id;

    RESPOND_ALL_AND_SET_NULL_IF_ERROR(
        responses, request_count,
        collector.ProcessTensor(
            input_def.name.c_str(), nullptr /* existing_buffer */,
            0 /* existing_buffer_byte_size */, allowed_input_types, &input_buffer,
            &input_buffer_byte_size, &input_buffer_memory_type,
            &input_buffer_memory_type_id));

Input

"inputs" : [
    {
      "name" : "IN0",
      "shape" : [35],
      "datatype" : "BYTES", 
      "data" : ["55.84426368", "77.68906853", "58.52193997", "51.0942335", "3.6775438", "3.78218398", "50.68278814", "19.28227734", "0.51352506", "15.95145969", "74.41938298", "50.64301062", "45.29132748", "20.3630148", "32.10634336", "plnb","ukod","nash","benn","ynog","ipng","etxg","viem","nadj","lezw","iehe","jijr","vysm","ebzj","shiz","wfor","uffj","wyak","wnti","qeqk"]
    }
]

I inspected the input buffer to see whether there are any delimiters between the strings.

image

The output that I got is shown in the picture below:

image image

It appears that 4 null terminators are present before each string. Is this common for all string inputs when we infer through Triton?
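One possible explanation, based on Triton's documented binary representation of BYTES tensors: each element is serialized with its 4-byte little-endian length prepended, so the four bytes before each string would be a length prefix rather than a delimiter. The sketch below assumes that layout. `SerializeBytesTensor` and `ParseBytesTensor` are hypothetical helpers written for illustration, not Triton APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a Triton API): build a buffer in the assumed
// BYTES layout, i.e. for each element a 4-byte little-endian length
// followed by the element's raw bytes.
std::string SerializeBytesTensor(const std::vector<std::string>& elements)
{
  std::string out;
  for (const auto& e : elements) {
    const uint32_t len = static_cast<uint32_t>(e.size());
    for (int shift = 0; shift < 32; shift += 8) {
      out.push_back(static_cast<char>((len >> shift) & 0xFF));
    }
    out.append(e);
  }
  return out;
}

// Hypothetical helper: walk a buffer in the assumed layout and recover the
// strings. Stops early if the buffer is truncated.
std::vector<std::string> ParseBytesTensor(const char* buffer, size_t byte_size)
{
  std::vector<std::string> elements;
  size_t offset = 0;
  while (offset + 4 <= byte_size) {
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(buffer + offset);
    // Assemble the little-endian length byte by byte.
    const uint32_t len = static_cast<uint32_t>(p[0]) |
                         (static_cast<uint32_t>(p[1]) << 8) |
                         (static_cast<uint32_t>(p[2]) << 16) |
                         (static_cast<uint32_t>(p[3]) << 24);
    offset += 4;
    if (len > byte_size - offset) {
      break;  // declared length runs past the end of the buffer
    }
    elements.emplace_back(buffer + offset, len);
    offset += len;
  }
  return elements;
}
```

Under this assumption, the first element "55.84426368" (11 bytes) would be preceded by the bytes 0x0B 0x00 0x00 0x00. Three of the four bytes are zero for any string shorter than 256 bytes, and the first is often a non-printing character, which could look like four null terminators when the buffer is printed as text. Calling `ParseBytesTensor(input_buffer, input_buffer_byte_size)` on a CPU-resident buffer would then be one way to check whether the layout matches.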

KCHemaPrasanna, Nov 21 '23 16:11

It is hard for me to tell if the outcome is correct or not, but I think you can take a look at ReadDataFromJson() which converts the input "data" : ["55.84426368", ... , "qeqk"] from JSON into Triton inference request input data, and see if its output would match what you saw on your backend implementation.

cc @GuanLuo if you can spot anything wrong with a quick glance.

kthui, Nov 21 '23 23:11

@kthui Thanks for the info, but as far as I can see, ReadDataFromJson() doesn't explicitly mention anything about these extra delimiters. @GuanLuo @kthui Please let me know if my analysis needs to be changed.

KCHemaPrasanna, Dec 05 '23 12:12

I also wonder what encoding TYPE_STRING expects at decoding time. UTF-8? UTF-16 with BOM? UTF-32?

vadimkantorov, Dec 07 '23 11:12

I wonder whether the server expects any particular encoding format when the input JSON of TYPE_STRING is sent. @vadimkantorov @kthui @GuanLuo any idea on this? Please help!

KCHemaPrasanna, Dec 08 '23 05:12