
🐛 [Bug] AssertionError: end must be an integer

Open Fulitcher opened this issue 8 months ago • 22 comments

Bug Description

To Reproduce

Steps to reproduce the behavior:

  1. Prepare the torch-tensorrt conversion code:

import torch
import torch_tensorrt

model = MyModel(opt)
model.load_state_dict(torch.load("model.pth"))
inputs = torch_tensorrt.Input(min_shape=[int(min_batch_size), channel, height, width],
                              opt_shape=[int(max_batch_size // 2), channel, height, width],
                              max_shape=[int(max_batch_size), channel, height, width],
                              dtype=torch.float32)
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
torch_tensorrt.save(trt_gm, "rt_model.ep", inputs=inputs)
  2. The error occurred as below:
[WARNING  | py.warnings        ]: /usr/local/lib/python3.12/dist-packages/torch/fx/graph.py:1801: UserWarning: Node prediction_src_pe_lifted_tensor_0 target Prediction.src_pe.lifted_tensor_0 lifted_tensor_0 of Prediction.src_pe does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target
  warnings.warn(

[WARNING  | py.warnings        ]: /usr/local/lib/python3.12/dist-packages/torch/fx/graph.py:1801: UserWarning: Node prediction_src_pe_lifted_tensor_1 target Prediction.src_pe.lifted_tensor_1 lifted_tensor_1 of Prediction.src_pe does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target
  warnings.warn(

[WARNING  | py.warnings        ]: /usr/local/lib/python3.12/dist-packages/torch/fx/graph.py:1801: UserWarning: Node prediction_trg_pe_lifted_tensor_2 target Prediction.trg_pe.lifted_tensor_2 lifted_tensor_2 of Prediction.trg_pe does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target
  warnings.warn(

[WARNING  | py.warnings        ]: /usr/local/lib/python3.12/dist-packages/torch/fx/graph.py:1801: UserWarning: Node prediction_trg_pe_lifted_tensor_3 target Prediction.trg_pe.lifted_tensor_3 lifted_tensor_3 of Prediction.trg_pe does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target
  warnings.warn(

[WARNING  | py.warnings        ]: /usr/local/lib/python3.12/dist-packages/torch/fx/graph.py:1801: UserWarning: Node prediction_trg_pe_lifted_tensor_4 target Prediction.trg_pe.lifted_tensor_4 lifted_tensor_4 of Prediction.trg_pe does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target
  warnings.warn(

[WARNING  | py.warnings        ]: /usr/local/lib/python3.12/dist-packages/torch/fx/graph.py:1810: UserWarning: Additional 22 warnings suppressed about get_attr references
  warnings.warn(

Traceback (most recent call last):
  File "convert_torch_tensorrt.py", line 146, in <module>
    convert_tensorrt(opt)
  File "convert_torch_tensorrt.py", line 50, in convert_tensorrt
    trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch_tensorrt/_compile.py", line 289, in compile
    trt_graph_module = dynamo_compile(
                       ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch_tensorrt/dynamo/_compiler.py", line 670, in compile
    exported_program = exported_program.run_decompositions(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/export/exported_program.py", line 128, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/export/exported_program.py", line 1310, in run_decompositions
    return _decompose_exported_program(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/export/exported_program.py", line 784, in _decompose_exported_program
    ) = _decompose_and_get_gm_with_new_signature_constants(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/export/exported_program.py", line 472, in _decompose_and_get_gm_with_new_signature_constants
    aten_export_artifact = _export_to_aten_ir(
                           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/export/_trace.py", line 743, in _export_to_aten_ir
    gm, graph_signature = transform(aot_export_module)(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/aot_autograd.py", line 1357, in aot_export_module
    fx_g, metadata, in_spec, out_spec = _aot_export_function(
                                        ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/aot_autograd.py", line 1596, in _aot_export_function
    fx_g, meta = create_aot_dispatcher_function(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/aot_autograd.py", line 582, in create_aot_dispatcher_function
    return _create_aot_dispatcher_function(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/aot_autograd.py", line 832, in _create_aot_dispatcher_function
    compiled_fn, fw_metadata = compiler_fn(
                               ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 118, in aot_dispatch_export
    graph, _, _ = aot_dispatch_base_graph(
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/_aot_autograd/dispatch_and_compile_graph.py", line 153, in aot_dispatch_base_graph
    fw_module = _create_graph(
                ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/_aot_autograd/dispatch_and_compile_graph.py", line 55, in _create_graph
    fx_g = make_fx(
           ^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 2200, in wrapped
    return make_fx_tracer.trace(f, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 2138, in trace
    return self._trace_inner(f, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 2109, in _trace_inner
    t = dispatch_trace(
        ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_compile.py", line 51, in inner
    return disable_fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_dynamo/eval_frame.py", line 755, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 1142, in dispatch_trace
    graph = tracer.trace(root, concrete_args)  # type: ignore[arg-type]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 1698, in trace
    res = super().trace(root, concrete_args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_dynamo/eval_frame.py", line 755, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/_symbolic_trace.py", line 843, in trace
    (self.create_arg(fn(*args)),),
                     ^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 1197, in wrapped
    out = f(*tensors)  # type:ignore[call-arg]
          ^^^^^^^^^^^
  File "<string>", line 1, in <lambda>
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/_aot_autograd/traced_function_transforms.py", line 693, in inner_fn
    outs = fn(*args)
           ^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/_aot_autograd/traced_function_transforms.py", line 413, in _functionalized_f_helper
    f_outs = fn(*f_args)
             ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/_aot_autograd/traced_function_transforms.py", line 78, in inner_fn
    outs = fn(*args)
           ^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/_aot_autograd/utils.py", line 184, in flat_fn
    tree_out = fn(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_functorch/_aot_autograd/traced_function_transforms.py", line 875, in functional_call
    out = PropagateUnbackedSymInts(mod).run(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/interpreter.py", line 167, in run
    self.env[node] = self.run_node(node)
                     ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/symbolic_shapes.py", line 6826, in run_node
    result = super().run_node(n)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/interpreter.py", line 230, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/interpreter.py", line 310, in call_function
    return target(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_ops.py", line 758, in __call__
    return self._op(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 1245, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_ops.py", line 758, in __call__
    return self._op(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/_subclasses/functional_tensor.py", line 527, in __torch_dispatch__
    outs_unwrapped = func._op_dk(
                     ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/utils/_stats.py", line 26, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 1347, in __torch_dispatch__
    return proxy_call(self, func, self.pre_dispatch, args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 793, in proxy_call
    r = maybe_handle_decomp(proxy_mode, func, args, kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/fx/experimental/proxy_tensor.py", line 2268, in maybe_handle_decomp
    out = CURRENT_DECOMPOSITION_TABLE[op](*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch_tensorrt/dynamo/lowering/_decompositions.py", line 208, in slice_scatter_decomposition
    assert isinstance(end, int), "end must be an integer"
           ^^^^^^^^^^^^^^^^^^^^
AssertionError: end must be an integer

While executing %copy_ : [num_users=0] = call_function[target=torch.ops.aten.copy_.default](args = (%slice_11, %clone_1), kwargs = {})
Original traceback:
  File "/model.py", line 60, in forward
    outs, probs = self.Prediction(features)
  File "prediction.py", line 85, in forward
    probs[:, step, :] = prob.clone().detach()
  3. My code for the model (prediction.py) is like below:
    def forward(self, src):
        memory = self.encode(src)
        b = memory.size(0)

        # filling with [BOS](index=1)
        outs = torch.ones(b, 1).fill_(1).long().to(self.device)
        probs = torch.zeros(b, self.max_len, self.num_class).to(self.device)

        for step in range(self.max_len - 1):
            probs = probs.clone().detach()
            # [B, step+1, d_model]
            out = self.decode(memory, outs, subsequent_mask(outs.size(1)).long().to(self.device))
            prob = self.generator(out[:, -1])  # [B, num_class]
            _, next_word = torch.max(prob, dim=1)   # [B]

            outs = torch.cat([outs, next_word.unsqueeze(1)], dim=1)  # [B, step+2]
            probs[:, step, :] = prob.clone().detach()  # <--- Error occur at this line

        return outs, probs
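
A minimal, self-contained sketch that isolates the failing pattern, an in-place sliced assignment on a pre-allocated tensor under a dynamic batch, might serve as a simpler repro. This is my own construction (the module name and shapes are hypothetical), not verified against the reporter's model:

import torch
import torch_tensorrt

class SliceAssign(torch.nn.Module):
    def forward(self, x):  # x: [B, F]
        probs = torch.zeros(x.size(0), 4, x.size(1), device=x.device)
        for step in range(3):
            probs = probs.clone()
            # Per the traceback above, this assignment should be lowered through
            # aten.slice_scatter, whose decomposition asserts that `end` is a
            # plain int; with a dynamic batch, the slice end becomes a SymInt.
            probs[:, step, :] = x
        return probs

inputs = torch_tensorrt.Input(min_shape=[1, 8], opt_shape=[4, 8],
                              max_shape=[8, 8], dtype=torch.float32)
trt_gm = torch_tensorrt.compile(SliceAssign().cuda().eval(), ir="dynamo", inputs=inputs)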

Expected behavior

"rt_model.ep" model file must be created and saved.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 2.6.0a0
  • PyTorch Version (e.g. 1.0): 2.7.0a0+ecf3bae40a.nv25.2
  • CPU Architecture: x86_64
  • OS (e.g., Linux): Ubuntu 24.04.1 LTS
  • How you installed PyTorch (conda, pip, libtorch, source): Docker image "PyTorch Release 25.02" at link
  • Python version: 3.12.3
  • CUDA version: 12.8
  • GPU models and configuration: RTX4000

Additional context

Even when I add or remove calls such as clone() and detach() on the 'probs' variable in the example code where the error occurs, the same error still occurs.

!!! important !!! Model conversion succeeds when a static batch is given as the input, like below.

inputs = [torch.randn((1, channel, height, width)).cuda()]

Setting a dynamic batch as the input results in the error shown above. Is there a way to accept a dynamic batch as input, as in the sample code provided above?
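
One workaround worth trying (a sketch of mine, untested on this model) is to collect the per-step probabilities in a Python list and stack once at the end, so that no in-place slice assignment, and hence no aten.slice_scatter, is traced for the dynamic batch dimension:

# Sketch: accumulate per-step outputs instead of writing into a slice.
# Note: torch.stack yields [B, max_len - 1, num_class]; pad if the original
# [B, max_len, num_class] shape is required downstream.
step_probs = []
for step in range(self.max_len - 1):
    out = self.decode(memory, outs, subsequent_mask(outs.size(1)).long().to(self.device))
    prob = self.generator(out[:, -1])            # [B, num_class]
    _, next_word = torch.max(prob, dim=1)        # [B]
    outs = torch.cat([outs, next_word.unsqueeze(1)], dim=1)
    step_probs.append(prob)
probs = torch.stack(step_probs, dim=1)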

Fulitcher avatar Mar 20 '25 04:03 Fulitcher

@apbose please look at this bug

narendasan avatar Mar 20 '25 23:03 narendasan

Thanks for the issue. Trying to repro the above. A couple of questions: what are the values of min_batch_size, max_batch_size, channel, height, and width that you are using? Also, what does subsequent_mask look like? Do you have a simple repro?

apbose avatar Mar 24 '25 22:03 apbose

@apbose Thank you for the reply.

min_batch_size = 1
max_batch_size = 200
channel = 3
height = 32
width = 128

subsequent_mask() looks like,

import numpy as np
import torch

def subsequent_mask(size):
    attn_shape = (1, size, size)
    subsequent_mask = np.triu(np.ones(attn_shape), k=1).astype('uint8')
    return torch.from_numpy(subsequent_mask) == 0
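
For what it's worth, a torch-native equivalent (my sketch; the semantics should match the numpy version above) avoids the numpy round-trip, which tracers tend to handle more gracefully:

import torch

def subsequent_mask(size):
    # True on and below the diagonal, i.e. positions that may be attended to;
    # identical to (np.triu(ones, k=1) == 0) above.
    return torch.tril(torch.ones(1, size, size, dtype=torch.bool))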

The whole repo of my project is customized and hard to summarize, but I hope the sample code given is enough to reproduce the issue.

Fulitcher avatar Mar 24 '25 23:03 Fulitcher

There are a couple of other things missing for the repro. The opt in model = MyModel(opt) is missing. Also, I do not have the model.pth that is used in model.load_state_dict(torch.load("model.pth")).

apbose avatar Mar 26 '25 04:03 apbose

@apbose MyModel is a model composed of a transformer encoder and a decoder. However, it is difficult to share the entire structure of the model. Is it impossible to debug with only the given example? opt is nothing special; it is an arguments object with the following values defined: --img_w, --img_h, --transformer_encoder_layer_num, ...

Fulitcher avatar Mar 26 '25 06:03 Fulitcher

Hmm, I would need the code to repro the error and see what is going on. It looks like the lowering pass is not able to handle the dynamic case. It's fine not to have the model.pth, but I would need the encode and decode models. That's why I was wondering if you could provide a simple repro or the model paths.

apbose avatar Mar 26 '25 18:03 apbose

@apbose Okay, then give me time to summarize the model code. There are several files; could you share your email so I can send you the code files?

Fulitcher avatar Mar 27 '25 01:03 Fulitcher

[email protected]. You could also share it here, or point to it here.

apbose avatar Mar 31 '25 18:03 apbose

@apbose I did send you files last week :)

Fulitcher avatar Apr 07 '25 00:04 Fulitcher

I cannot find it. Could you please let me know the mail id from which you mailed.

apbose avatar Apr 08 '25 16:04 apbose

[email protected] just sent again.

Fulitcher avatar Apr 08 '25 23:04 Fulitcher

Thanks, received. I will take a look.

apbose avatar Apr 10 '25 07:04 apbose

@Fulitcher I see this error. Please note that I am skipping the lines below:

checkpoint = torch.load(model_path)

# remove the key-name prefix added by nn.DataParallel
new_state_dict = {}
for k, v in checkpoint.items():
    name = k[7:] if k.startswith('module.') else k  # strip 'module.'
    new_state_dict[name] = v
model.load_state_dict(new_state_dict)
print(">> Successfully loaded checkpoint '{}'".format(model_path))

Can that have an effect?

File "/code/torchTRT/torchTRT_bug/TensorRT/model/MyModel/convert_torch_tensorrt.py", line 46, in convert_tensorrt                                                  
    trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)                                                                                               
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                               
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch_tensorrt/_compile.py", line 289, in compile                                                 
    trt_graph_module = dynamo_compile(                                                                                                                               
                       ^^^^^^^^^^^^^^^                                                                                                                               
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch_tensorrt/dynamo/_compiler.py", line 693, in compile                                         
    trt_gm = compile_module(                                                                                                                                         
             ^^^^^^^^^^^^^^^                                                                                                                                         
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch_tensorrt/dynamo/_compiler.py", line 897, in compile_module                                  
    trt_module = convert_module(                                                                                                                                     
                 ^^^^^^^^^^^^^^^                                                                                                                                     
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 90, in convert_module                      
    interpreter_result = interpret_module_to_result(                                                                                                                 
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                 
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 69, in interpret_module_to_result          
    interpreter_result = interpreter.run()                                                                                                                           
                         ^^^^^^^^^^^^^^^^^                                                                                                                           
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 725, in run                            
    self._construct_trt_network_def()                                                                                                                                
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 393, in _construct_trt_network_def     
    super().run()                                                                                                                                                    
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch/fx/interpreter.py", line 171, in run                                                        
    self.env[node] = self.run_node(node)                                                                                                                             
                     ^^^^^^^^^^^^^^^^^^^                                                                                                                             
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 784, in run_node                       
    trt_node: torch.fx.Node = super().run_node(n)                                                                                                                    
                              ^^^^^^^^^^^^^^^^^^^                                                                                                                    
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch/fx/interpreter.py", line 240, in run_node                                                   
    return getattr(self, n.op)(n.target, args, kwargs)                                                                                                               
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                               
  File "/root/.pyenv/versions/3.11.12/lib/python3.11/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 941, in output                         
    raise RuntimeError(                                                                                                                                              
RuntimeError: Specified output dtypes (60) differ from number of outputs (86) 

apbose avatar Apr 22 '25 20:04 apbose

@apbose Do you mean that you commented out the following code block and then ran it? This block of code is intended to remove the 'module.' prefix from the keys of the model weights, which was added because the original model was trained on multiple GPUs using nn.DataParallel().

new_state_dict = {}
for k, v in checkpoint.items():
    name = k[7:] if k.startswith('module.') else k  # strip 'module.'
    new_state_dict[name] = v
model.load_state_dict(new_state_dict)
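
For reference, PyTorch also ships a helper that does the same prefix stripping in place; a sketch (behavior should match the loop above):

from torch.nn.modules.utils import consume_prefix_in_state_dict_if_present

checkpoint = torch.load(model_path)
consume_prefix_in_state_dict_if_present(checkpoint, "module.")  # strips 'module.' keys in place
model.load_state_dict(checkpoint)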

Fulitcher avatar Apr 23 '25 07:04 Fulitcher

Ok, got it. It looks like I am getting a different error, where the number of outputs differs from the number of output dtypes. I need to look into this further. I commented out those lines since I do not have the model path (mymodel.pth), so I am not loading the model weights from the checkpoint. But ideally that should not affect the model compilation, I assume, or lead to the error above.

apbose avatar Apr 23 '25 18:04 apbose

Yep, thanks for taking a look. I still don't understand why it works when I fix the input batch to 1 but get an error when I set it to a dynamic batch. If you find a hint, please let me know, even if it takes time. Thank you :)

Fulitcher avatar Apr 23 '25 23:04 Fulitcher

So when I run your model, I see that there is a mismatch between the number of outputs and the number of dtypes, since there are SymInts and SymInt-dependent ops appearing in the output, for which dtypes are not allocated. E.g.: div_2, sym_size_int_3669, mul_102, mul_114, _reshape_copy_3, mul_213, div_113, mul_2962, clone_57, select_1, slice_13, _to_copy_2, slice_16, slice_17, _to_copy_3, mul_4877, _to_copy_6, _to_copy_7, mul_7255, _to_copy_10, _to_copy_11, mul_9633, _to_copy_14, _to_copy_15, mul_12011, _to_copy_18, _to_copy_19, mul_14389, _to_copy_22, _to_copy_23, mul_16767, _to_copy_26, _to_copy_27, _to_copy_30, _to_copy_31, mul_21523, _to_copy_34, _to_copy_35, mul_23901, _to_copy_38, _to_copy_39, mul_26279, _to_copy_42, _to_copy_43, mul_28657, _to_copy_46, _to_copy_47, mul_31035, _to_copy_50, _to_copy_51, mul_33413, _to_copy_54, _to_copy_55, mul_35791, _to_copy_58, _to_copy_59, mul_38169, _to_copy_62, _to_copy_63, mul_40547, _to_copy_66, _to_copy_67, mul_42925, _to_copy_70, _to_copy_71, mul_45303, _to_copy_74, _to_copy_75, mul_47681, _to_copy_78, _to_copy_79, mul_50059, _to_copy_82, _to_copy_83, mul_52437, _to_copy_86, _to_copy_87, mul_54815, _to_copy_90, _to_copy_91, mul_57193, _to_copy_94, _to_copy_95, mul_59571, _to_copy_98, _to_copy_99. The above list has length 86, but sym_size_int and the other dependent ops don't have dtypes appended. May I know which versions of torch and torch-tensorrt you are running the code with?
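
To see which exported-graph outputs carry SymInt metadata rather than tensor dtypes, here is a small sketch of mine using the public torch.export API (the module and names are hypothetical, not the reporter's model):

import torch
from torch.export import export, Dim

class M(torch.nn.Module):
    def forward(self, x):
        b = x.size(0)          # symbolic under dynamic shapes
        return x * 2, b * 3    # one tensor output plus one SymInt output

ep = export(M(), (torch.randn(4, 8),), dynamic_shapes={"x": {0: Dim("batch")}})
for node in ep.graph.nodes:
    if node.op == "output":
        for out in node.args[0]:
            val = out.meta.get("val")
            print(out.name, type(val).__name__)  # FakeTensor vs. SymInt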

apbose avatar Apr 28 '25 19:04 apbose

Ok, you have mentioned it above. Let me try with those versions.

apbose avatar Apr 28 '25 19:04 apbose

I could repro with the above versions. Working on a fix.

apbose avatar Apr 29 '25 04:04 apbose

@apbose May I know how work on this issue is going?

Fulitcher avatar Jun 12 '25 01:06 Fulitcher

This PR https://github.com/pytorch/TensorRT/pull/3513/files should address the above. Could you try once with this and the latest Torch-TensorRT release?

apbose avatar Jun 12 '25 17:06 apbose

Upon checking, PyTorch Release 25.05 appears to be the latest version. (https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-25-04.html#) The Torch-TensorRT version in that image is 2.8.0a0. Do you think the fixes have been applied in that version?

Fulitcher avatar Jun 12 '25 23:06 Fulitcher

PR 3513 is merged. Please try with the latest release; 25.05 should have it. Reopen in case this still exists.

peri044 avatar Sep 28 '25 06:09 peri044