
Cannot run inference on multiple boxes for TensorRT Efficient VIT SAM

Open aniket03 opened this issue 11 months ago • 2 comments

Hello, thank you for the awesome work.

I was trying to run inference with the TensorRT SAM deployment when there are multiple bounding boxes in the prompt, but I haven't been able to get it to work. Trying one sample with two boxes (number of boxes = 2) results in the following error:

[03/05/2024-00:13:13] [TRT] [E] 3: [executionContext.cpp::validateInputBindings::2082] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::validateInputBindings::2082, condition: profileMaxDims.d[i] >= dimensions.d[i]. Supplied binding dimension [1,4,2] for bindings[1] exceed min ~ max range at index 1, maximum dimension in profile is 2, minimum dimension in profile is 1, but supplied dimension is 4.
)
[03/05/2024-00:13:13] [TRT] [E] 3: [executionContext.cpp::validateInputBindings::2082] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::validateInputBindings::2082, condition: profileMaxDims.d[i] >= dimensions.d[i]. Supplied binding dimension [1,4] for bindings[2] exceed min ~ max range at index 1, maximum dimension in profile is 2, minimum dimension in profile is 1, but supplied dimension is 4.
)
[03/05/2024-00:13:13] [TRT] [E] 3: [executionContext.cpp::resolveSlots::2791] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::resolveSlots::2791, condition: allInputDimensionsSpecified(routine)
)
[03/05/2024-00:13:13] [TRT] [E] 3: [executionContext.cpp::resolveSlots::2791] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::resolveSlots::2791, condition: allInputDimensionsSpecified(routine)
)
[03/05/2024-00:13:13] [TRT] [E] 3: [executionContext.cpp::resolveSlots::2791] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::resolveSlots::2791, condition: allInputDimensionsSpecified(routine)
)
[03/05/2024-00:13:13] [TRT] [E] 3: [executionContext.cpp::resolveSlots::2791] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::resolveSlots::2791, condition: allInputDimensionsSpecified(routine)
)
Traceback (most recent call last):
  File "/home/ubuntu/efficientvit/deployment/sam/tensorrt/inference.py", line 244, in <module>
    low_res_masks, _ = trt_decoder.infer(inputs)
  File "/home/ubuntu/efficientvit/deployment/sam/tensorrt/inferencer.py", line 303, in infer
    self.numpy_array[idx][:actual_batch_size] = inp.reshape(actual_batch_size, -1)
ValueError: could not broadcast input array from shape (2,4) into shape (1,8)
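For context, the final ValueError can be reproduced in isolation with plain NumPy. The shapes below are illustrative, not taken from the repo: the host buffer was preallocated for a single prompt of 8 values, while two boxes arrive as a (2, 4) array.

```python
import numpy as np

# Minimal standalone reproduction of the broadcast failure (illustrative
# shapes): the host buffer was sized for one flattened prompt of 8 values,
# but two boxes arrive as a (2, 4) array.
buffer = np.zeros((1, 8), dtype=np.float32)
inp = np.arange(8, dtype=np.float32).reshape(2, 4)

err = None
try:
    buffer[:2] = inp.reshape(2, -1)  # (2, 4) cannot broadcast into (1, 8)
except ValueError as e:
    err = e
print(err)  # could not broadcast input array from shape (2,4) into shape (1,8)
```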

I am passing the bounding boxes as a stringified list in the following format:

--boxes "[[127, 219, 592, 736], [235, 6, 431, 115]]"
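For reference, here is a minimal sketch of how such a string can be parsed into an (N, 4) array. The helper name is hypothetical, not the repo's actual parsing code:

```python
import ast

import numpy as np

def parse_boxes(boxes_arg: str) -> np.ndarray:
    """Parse a stringified list of [x1, y1, x2, y2] boxes into an (N, 4) array."""
    boxes = np.array(ast.literal_eval(boxes_arg), dtype=np.float32)
    assert boxes.ndim == 2 and boxes.shape[1] == 4, "expected (N, 4) boxes"
    return boxes

boxes = parse_boxes("[[127, 219, 592, 736], [235, 6, 431, 115]]")
print(boxes.shape)  # (2, 4)
```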

aniket03 avatar Mar 05 '24 00:03 aniket03

Hi, I faced the same problem a few days ago, and I solved it.

Basically, the original decoder input bindings could not accommodate multiple boxes.

Here is how I did it:

Step 1. Find the inferencer.py file

.../efficientvit/deployment/sam/tensorrt/inferencer.py

Step 2. Modify the binding code in SAMDecoderInferencer class

# Original code:
# Allocate memory for multiple usage [e.g. multiple batch inference]
self._input_shape = []
self.context = self.engine.create_execution_context()
for binding in range(self.engine.num_bindings):
    # set binding_shape for dynamic input
    if self.engine.binding_is_input(binding):
        _input_shape = list(self.engine.get_binding_shape(binding)[1:])
        if binding != 0:
            _input_shape[0] = num
        self._input_shape.append(_input_shape)
        self.context.set_binding_shape(binding, [batch_size] + _input_shape)

To:

# Modified code
# Allocate memory for multiple usage [e.g. multiple batch inference]
self._input_shape = []
self.context = self.engine.create_execution_context()
for binding in range(self.engine.num_bindings):
    # set binding_shape for dynamic input
    if self.engine.binding_is_input(binding):
        if binding == 0:
            # binding 0: keep the original dynamic-batch handling
            _input_shape = list(self.engine.get_binding_shape(binding)[1:])
            self._input_shape.append(_input_shape)
            self.context.set_binding_shape(binding, [batch_size] + _input_shape)
        else:
            # point bindings: treat every box as one prompt of 2 points
            _input_shape = list(self.engine.get_binding_shape(binding))
            _input_shape[0] = num // 2
            _input_shape[1] = 2
            self._input_shape.append(_input_shape)
            self.context.set_binding_shape(binding, _input_shape)
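To sanity-check what this change produces, here is an illustrative shape calculation (not repo code), under the assumption that SAM encodes each box as 2 corner points, so `num` counts points and `num // 2` is the number of boxes:

```python
# Illustrative shape calculation for the point bindings after the change.
# `num_points` is the total number of points; each box contributes 2.
def decoder_binding_shapes(num_points: int, coord_dim: int = 2):
    n_boxes = num_points // 2
    point_coords_shape = [n_boxes, 2, coord_dim]  # e.g. bindings[1]
    point_labels_shape = [n_boxes, 2]             # e.g. bindings[2]
    return point_coords_shape, point_labels_shape

# 2 boxes -> 4 points -> [2, 2, 2] and [2, 2], which fit a profile
# whose maximum at index 1 is 2 (per the error message above).
print(decoder_binding_shapes(4))  # ([2, 2, 2], [2, 2])
```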

Step 3. Modify actual_batch_size in infer function (same in SAMDecoderInferencer class)

# Original code:
for idx, inp in enumerate(inputs):
    actual_batch_size = len(inp)
    self.numpy_array[idx][:actual_batch_size] = inp.reshape(actual_batch_size, -1)
    np.copyto(self.inputs[idx].host, self.numpy_array[idx].ravel())

To:

# Modified code:
for idx, inp in enumerate(inputs):
    actual_batch_size = 1
    self.numpy_array[idx][:actual_batch_size] = inp.reshape(actual_batch_size, -1)
    np.copyto(self.inputs[idx].host, self.numpy_array[idx].ravel())
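The effect of fixing `actual_batch_size` to 1 can be seen with a plain NumPy sketch (illustrative buffer sizes): the (2, 4) box array is flattened into a single row that fits the preallocated host buffer instead of raising the broadcast error.

```python
import numpy as np

# With actual_batch_size = 1, the two boxes are flattened into one
# (1, 8) row, so the copy into the host buffer no longer fails.
buffer = np.zeros((1, 8), dtype=np.float32)
inp = np.array([[127, 219, 592, 736], [235, 6, 431, 115]], dtype=np.float32)
buffer[:1] = inp.reshape(1, -1)
print(buffer.ravel())
```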

Finally, I think this should work for running multiple boxes.

asd841018 avatar Mar 07 '24 08:03 asd841018

Hi @aniket03 and @asd841018,

Thank you for raising this issue and for working on a fix. I have made the necessary updates to the code to address it. Please try it out to see whether it works for you now. Should you have any further questions, don't hesitate to reach out. Thank you!

Best, Zhuoyang

zhuoyang20 avatar Mar 10 '24 07:03 zhuoyang20