Torchvision Faster R-CNN ONNX export with dynamic batch size fails during inference
🐛 Describe the bug
import io

import torch
from torchvision import models

frcnn = models.detection.fasterrcnn_resnet50_fpn_v2(pretrained=True)
x = torch.rand(4, 3, 224, 224)

with io.BytesIO() as f:
    torch.onnx.export(
        frcnn,
        x,
        f,
        export_params=True,
        opset_version=20,
        do_constant_folding=True,
        keep_initializers_as_inputs=None,
        custom_opsets={"moka": 20},
        input_names=["images"],
        output_names=["output"],
        dynamic_axes={
            "images": {0: "batch_size", 2: "height", 3: "width"},
            "output": {0: "batch_size"},
        },
        dynamo=False,
    )
    onnx_model = f.getvalue()
import onnxruntime as ort

providers = ["CUDAExecutionProvider"] if torch.cuda.is_available() else ["CPUExecutionProvider"]
ort_session = ort.InferenceSession(onnx_model, providers=providers)

# use a different batch size (and height) from the sample input x
ort_inputs = {
    ort_session.get_inputs()[0].name: torch.rand(2, 3, 448, 224).numpy(),
}
ort_outputs = ort_session.run(None, ort_inputs)
Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Split node. Name:'/Split' Status Message: Cannot split using values in 'split' attribute. Axis=0 Input shape={2,3,448,224} NumOutputs=4 Num entries in 'split' (must equal number of outputs) was 4 Sum of sizes in 'split' (must equal size of selected axis) was 4
Above is a minimal example that fails. When images with the same batch size as the sample input are used at inference, it does not fail. What causes the error?
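The error message suggests the split sizes for the '/Split' node were recorded as constants during tracing (4 entries, matching the batch size of the sample input x), so that path only accepts batch 4. A minimal diagnostic sketch to check this, assuming onnx_model still holds the exported bytes from the snippet above:

import onnx
from onnx import numpy_helper

model = onnx.load_model_from_string(onnx_model)  # exported bytes from the repro above
initializers = {init.name: numpy_helper.to_array(init) for init in model.graph.initializer}
for node in model.graph.node:
    if node.op_type == "Split":
        # at opset >= 13 the split sizes arrive as a second input, usually a constant initializer
        sizes = initializers.get(node.input[1]) if len(node.input) > 1 else None
        attrs = [(a.name, onnx.helper.get_attribute_value(a)) for a in node.attribute]
        print(node.name, "attrs:", attrs, "split sizes:", sizes)

If the printed sizes are fixed (e.g. four entries summing to 4), that would point at the tracing-based exporter baking the sample batch size into the graph, rather than at an onnxruntime bug.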
Versions
PyTorch version: 2.6.0+cu126
Is debug build: False
CUDA used to build PyTorch: 12.6
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Pro (10.0.22631 64-bit)
GCC version: Could not collect
Clang version: Could not collect
CMake version: version 3.31.5
Libc version: N/A

Python version: 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22631-SP0
Is CUDA available: True
CUDA runtime version: 12.8.61
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4080
Nvidia driver version: 571.96
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Name: 13th Gen Intel(R) Core(TM) i7-13700KF
Manufacturer: GenuineIntel
Family: 198
Architecture: 9
ProcessorType: 3
DeviceID: CPU0
CurrentClockSpeed: 3400
MaxClockSpeed: 3400
L2CacheSize: 24576
L2CacheSpeed: None
Revision: None

Versions of relevant libraries:
[pip3] numpy==2.2.2
[pip3] onnx==1.17.0
[pip3] onnxruntime-gpu==1.20.1
[pip3] onnxscript==0.2.0
[pip3] onnxsim==0.4.36
[pip3] pytorch-lightning==2.5.0.post0
[pip3] torch==2.6.0+cu126
[pip3] torchmetrics==1.6.1
[pip3] torchvision==0.21.0+cu126
[conda] Could not collect
Hi @davidgill97 , sorry, I don't think I'll be able to prioritize ONNX-related issues from now.
I see. I'm willing to look into the problem myself; could you give me some advice?