🐛 [Bug] [NGC] L0 Dynamo Test on Thor
Bug:
FAILED conversion/test_scalar_tensor_aten.py::TestScalarTensorConverter::test_scalar_tensor_float_1
FAILED conversion/test_index_aten.py::TestIndexConverter::test_index_zero_two_dim_ITensor_mask
Versions: TensorRT 10.13.3.9, PyTorch 2.10.0a0+b558c986e8
Error:
2025-10-11T19:58:31.844970Z 01O ------------------------------ Captured log call -------------------------------
2025-10-11T19:58:31.844990Z 01O WARNING torch_tensorrt [TensorRT Conversion Context]:logging.py:24 Environment variable NVIDIA_TF32_OVERRIDE=0 but BuilderFlag::kTF32 is set. Disabling TF32.
2025-10-11T19:58:31.845010Z 01O WARNING torch_tensorrt [TensorRT Conversion Context]:logging.py:24 Environment variable NVIDIA_TF32_OVERRIDE=0 but BuilderFlag::kTF32 is set. Disabling TF32.
2025-10-11T19:58:31.845030Z 01O ERROR torch_tensorrt [TensorRT Conversion Context]:logging.py:22 Error Code: 9: Skipping tactic 0x00000000000003e8 due to exception cudaEventElapsedTime In executeAndTimeIters at optimizer/common/builderUtils.cpp:1026
2025-10-11T19:58:31.845060Z 01O ERROR torch_tensorrt [TensorRT Conversion Context]:logging.py:22 Error Code: 9: Skipping tactic 0x0000000000000000 due to exception cudaEventElapsedTime In executeAndTimeIters at optimizer/common/builderUtils.cpp:1026
Some of the test failures above are due to the nonzero case, which is unsupported on Thor.
Others fail with the following error:
2025-10-30T04:16:27.150332Z 01O ERROR torch_tensorrt [TensorRT Conversion Context]:logging.py:22 IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node [ShapeHostToDeviceCopy 0]. In computeCosts at optimizer/common/tactic/optimizer.cpp:4115)
For example, test_full_aten.py fails in the static case, even though the graph does not contain a full operation:
graph():
    %x : [num_users=0] = placeholder[target=x]
    %_tensor_constant0 : [num_users=1] = get_attr[target=_tensor_constant0]
    return _tensor_constant0
This looks like incomplete CUDA context initialization while TensorRT is selecting tactics. Following up with the TensorRT team.
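
While triaging the L0 logs, it may help to separate the two failure modes reported here (Error Code 9 from tactic timing, Error Code 10 from the missing ShapeHostToDeviceCopy implementation). The following is only a minimal sketch, assuming the log line format shown in this report; `bucket_errors` and `ERROR_RE` are hypothetical names and are not part of torch_tensorrt or TensorRT.

```python
import re
from collections import defaultdict

# Matches both spellings seen in the logs of this report:
#   "Error Code: 9: Skipping tactic ..."  and  "Error Code 10: Internal Error (...)"
ERROR_RE = re.compile(r"Error Code:? (\d+): (.*)")


def bucket_errors(lines):
    """Group ERROR log lines by TensorRT error code (hypothetical helper)."""
    buckets = defaultdict(list)
    for line in lines:
        # Skip WARNING and other non-error lines.
        if " ERROR " not in line:
            continue
        match = ERROR_RE.search(line)
        if match:
            buckets[int(match.group(1))].append(match.group(2))
    return dict(buckets)


# Sample lines copied from the logs in this report.
sample = [
    "2025-10-11T19:58:31.844990Z 01O WARNING torch_tensorrt [TensorRT Conversion Context]:logging.py:24 Environment variable NVIDIA_TF32_OVERRIDE=0 but BuilderFlag::kTF32 is set. Disabling TF32.",
    "2025-10-11T19:58:31.845030Z 01O ERROR torch_tensorrt [TensorRT Conversion Context]:logging.py:22 Error Code: 9: Skipping tactic 0x00000000000003e8 due to exception cudaEventElapsedTime In executeAndTimeIters at optimizer/common/builderUtils.cpp:1026",
    "2025-10-30T04:16:27.150332Z 01O ERROR torch_tensorrt [TensorRT Conversion Context]:logging.py:22 IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node [ShapeHostToDeviceCopy 0]. In computeCosts at optimizer/common/tactic/optimizer.cpp:4115)",
]

for code, messages in sorted(bucket_errors(sample).items()):
    print(code, len(messages))
```

Running this over a full test log would give a per-error-code count, which makes it easier to tell which failures belong to the tactic-timing issue and which to the ShapeHostToDeviceCopy issue.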