Dynamic Batching with Python Backend
I have a Python backend model with the following config.pbtxt, currently without dynamic batching:
name: "sample"
backend: "python"
max_batch_size: 0
input [
{
name: "text" # Stringified JSON Array
data_type: TYPE_STRING
dims: [ 1 ] # Dynamic Batching
},
{
name: "config" # Stringified JSON Array
data_type: TYPE_STRING
dims: [ 1 ]
}
]
output [
{
name: "results"
data_type: TYPE_STRING
dims: [ 1 ]
}
]
instance_group [
{
count: 1
kind: KIND_CPU
}
]
response_cache {
enable: True
}
Here's my model.py execute method:
def execute(self, requests):
    responses_req = []
    for request in requests:
        try:
            # Each input is a stringified JSON array; decode and parse element by element
            input_text_bytes = pb_utils.get_input_tensor_by_name(request, "text")
            input_text = [json.loads(text.decode()) for text in input_text_bytes.as_numpy()]
            config_tensor = pb_utils.get_input_tensor_by_name(request, "config")
            config = [json.loads(text.decode()) for text in config_tensor.as_numpy()]
            responses = []
            for idx, ind_input_text in enumerate(input_text):
                ind_input_config = config[idx]
                ind_input_config["doc"] = ind_input_text
                results = self.sample_model.run(ind_input_config)["results"]
                responses.append(results)
            responses_req.append(self.create_inference_response(responses))
        except Exception as e:
            # Fall back to an empty result for this request
            responses = [{}]
            responses_req.append(self.create_inference_response(responses))
    return responses_req
I would like to enable dynamic batching without any changes to the client side (no changes to the REST API input format); the current request format I need to preserve is sketched after the config below. Here is my modified config.pbtxt with dynamic batching:
name: "sample"
backend: "python"
max_batch_size: 8
dynamic_batching {}
input [
{
name: "text"
data_type: TYPE_STRING
dims: [ -1 ] # Dynamic Batching
}
]
input [
{
name: "config"
data_type: TYPE_STRING
dims: [ -1 ]
}
]
output [
{
name: "results"
data_type: TYPE_STRING
dims: [ -1 ]
}
]
instance_group [
{
count: 1
kind: KIND_CPU
}
]
response_cache {
enable: True
}
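For reference, here is roughly how the model is called today. This is a minimal sketch assuming the tritonclient HTTP API and a server on localhost:8000; the payload contents below are placeholders, not my real data.

import json
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Placeholder payloads; the real JSON structure is whatever sample_model.run expects.
# Both inputs are single stringified-JSON elements with shape [ 1 ], matching the original config.
text_payload = np.array([json.dumps("some document text")], dtype=object)
config_payload = np.array([json.dumps({"threshold": 0.5})], dtype=object)

text_input = httpclient.InferInput("text", [1], "BYTES")
text_input.set_data_from_numpy(text_payload)
config_input = httpclient.InferInput("config", [1], "BYTES")
config_input.set_data_from_numpy(config_payload)

result = client.infer(
    model_name="sample",
    inputs=[text_input, config_input],
    outputs=[httpclient.InferRequestedOutput("results")],
)
print(result.as_numpy("results"))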
Is this possible? If so, what changes need to be made in model.py?