
Can dpt models be traced?

Open 3togo opened this issue 3 years ago • 20 comments

I try to trace "dpt_hybrid_midas" by calling

torch.jit.trace(model, example_input)

However, it failed with error messages below. Any pointer on how to do it properly?

/usr/local/lib/python3.9/dist-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
/mnt/data/git/DPT/dpt/vit.py:154: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
  gs_old = int(math.sqrt(len(posemb_grid)))
/usr/local/lib/python3.9/dist-packages/torch/nn/functional.py:3609: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  warnings.warn(
Traceback (most recent call last):
  File "/mnt/data/git/DPT/export_model.py", line 112, in <module>
    convert(in_model_path, out_model_path)
  File "/mnt/data/git/DPT/export_model.py", line 64, in convert
    sm = torch.jit.trace(model, example_input)
  File "/usr/local/lib/python3.9/dist-packages/torch/jit/_trace.py", line 735, in trace
    return trace_module(
  File "/usr/local/lib/python3.9/dist-packages/torch/jit/_trace.py", line 952, in trace_module
    module._c._create_method_from_trace(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1039, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/mnt/data/git/DPT/dpt/models.py", line 115, in forward
    inv_depth = super().forward(x).squeeze(dim=1)
  File "/mnt/data/git/DPT/dpt/models.py", line 72, in forward
    layer_1, layer_2, layer_3, layer_4 = forward_vit(self.pretrained, x)
  File "/mnt/data/git/DPT/dpt/vit.py", line 120, in forward_vit
    nn.Unflatten(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/flatten.py", line 102, in __init__
    self._require_tuple_int(unflattened_size)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/flatten.py", line 125, in _require_tuple_int
    raise TypeError("unflattened_size must be tuple of ints, " +
TypeError: unflattened_size must be tuple of ints, but found element of type Tensor at pos 0
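For reference, the tracing workflow being attempted can be sketched on a toy module. This is a minimal sketch: `TinyNet` and the shapes are hypothetical stand-ins, while the real `export_model.py` loads the DPT model and passes a `(1, 3, H, W)` image tensor as the example input.

```python
import torch
import torch.nn as nn

# Tiny stand-in module; torch.jit.trace records the ops executed on the
# example input and replays them as a static graph.
class TinyNet(nn.Module):
    def forward(self, x):
        return x.mean(dim=1, keepdim=True)

model = TinyNet().eval()
example_input = torch.randn(1, 3, 384, 384)

traced = torch.jit.trace(model, example_input)
out = traced(example_input)
assert tuple(out.shape) == (1, 1, 384, 384)
```

Tracing only fails for DPT because its forward pass feeds tensor-valued sizes into `nn.Unflatten`, which requires plain Python ints, as the traceback above shows.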

3togo avatar Jul 27 '21 04:07 3togo

The current model isn't traceable unfortunately. As this is a rather popular request (see also https://github.com/isl-org/MiDaS/issues/122) we are working on a rewrite to fix this.

ranftlr avatar Jul 29 '21 09:07 ranftlr

ranftlr,

many thanks for your prompt reply.

eli

3togo avatar Jul 30 '21 04:07 3togo

I just pushed a preview of a scriptable and traceable model to branch "dpt_scriptable": https://github.com/isl-org/DPT/tree/dpt_scriptable. Note that you have to download updated weight files for this to work. You can find updated links in the README of the branch.

Please let us know if this solves your problem or if you experience any issues with this.

ranftlr avatar Aug 05 '21 15:08 ranftlr

@ranftlr Thanks for your work. This code does not work with torch.onnx export. Could you take a look? Thanks.

phamdat09 avatar Aug 12 '21 14:08 phamdat09

@ranftlr, I tried to trace your "dpt_hybrid-midas-d889a10e.pt" using torch.jit.trace, but it failed.

Below is the error message:

  File "/usr/local/lib/python3.9/dist-packages/torch/_tensor.py", line 867, in unflatten
    return super(Tensor, self).unflatten(dim, sizes, names)
RuntimeError: NYI: Named tensors are not supported with the tracer

errors.txt

3togo avatar Sep 13 '21 02:09 3togo

Is there a fix for this yet, @ranftlr? Thank you.

AbdouSarr avatar Sep 29 '21 09:09 AbdouSarr

> I try to trace "dpt_hybrid_midas" by calling torch.jit.trace(model, example_input). However, it failed with error messages below. Any pointer on how to do it properly?

Hello, I also encountered the same problem. Has it been solved yet?

Wing100 avatar Oct 15 '21 14:10 Wing100

Hi, I have been trying to export DPT-Hybrid to ONNX today using the dpt_scriptable branch and also encountered RuntimeError: NYI: Named tensors are not supported with the tracer. I found this PyTorch issue which looks like the same problem. The cause is the usage of unflatten. I have successfully exported the ONNX model by removing these two unflatten calls (vit.py, lines ~320)

layer_3 = self.act_postprocess3(layer_3.unflatten(2, out_size))
layer_4 = self.act_postprocess4(layer_4.unflatten(2, out_size))

and using view instead

x3, y3, z3 = layer_3.shape
layer_3 = self.act_postprocess3(layer_3.view(x3, y3, *out_size))
x4, y4, z4 = layer_4.shape
layer_4 = self.act_postprocess4(layer_4.view(x4, y4, *out_size))

The hybrid model doesn't need to convert layer1 and layer2, but the same solution probably applies.

I will test further and comment back soon.
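The unflatten-to-view substitution above can be sanity-checked on plain tensors. A minimal sketch, assuming a `(B, C, H*W)` token tensor and `out_size = (H, W)` (the concrete shapes below are hypothetical stand-ins for the ViT feature grid):

```python
import torch

# Hypothetical token tensor: batch 2, 256 channels, 24x24 spatial grid
# flattened into the last dimension.
layer = torch.randn(2, 256, 24 * 24)
out_size = (24, 24)

# Path 1: Tensor.unflatten, which trips the tracer's named-tensor check.
via_unflatten = layer.unflatten(2, out_size)

# Path 2: plain view with explicitly unpacked sizes, which traces fine.
b, c, _ = layer.shape
via_view = layer.view(b, c, *out_size)

assert via_unflatten.shape == via_view.shape == (2, 256, 24, 24)
assert torch.equal(via_unflatten, via_view)
```

Both paths produce identical tensors, so swapping them changes only the traceability, not the model output.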

guillesanbri avatar Oct 15 '21 21:10 guillesanbri

@guillesanbri If the model uses a backbone pretrained by others, such as a ResNet-50, can DPT models still be traced? I get the error below:

RuntimeError: Error(s) in loading state_dict for net: Missing key(s) in state_dict: Unexpected key(s) in state_dict:

Wing100 avatar Oct 18 '21 14:10 Wing100

@Wing100 I'm not sure what you are referring to; I traced the Hybrid model, which has a ResNet-50 inside. The error you got seems to be related to loading model parameters from another model without setting strict=False, but AFAIK that is not related to tracing the model.
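As a hedged illustration of the strict=False point (toy modules, not DPT itself): a checkpoint that only partially matches the target module raises the "Missing key(s) in state_dict" RuntimeError under a strict load, while strict=False reports the mismatches instead.

```python
import torch.nn as nn

# Hypothetical example: the "checkpoint" covers only the first layer of the
# target model, so a strict load would raise RuntimeError.
src = nn.Sequential(nn.Linear(4, 8))                   # checkpoint source
dst = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))  # target model

# strict=False loads the overlapping parameters and returns the rest.
result = dst.load_state_dict(src.state_dict(), strict=False)
assert result.missing_keys == ["1.weight", "1.bias"]
assert result.unexpected_keys == []
```

The returned missing/unexpected key lists are useful for checking that only the intended parameters were skipped.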

guillesanbri avatar Oct 21 '21 15:10 guillesanbri

@guillesanbri Hi, after making the change from unflatten to view, I get the following error:

RuntimeError: Unsupported: ONNX export of transpose for tensor of unknown rank.

did you encounter this and/or do you know how to solve this? thanks in advance!

romil611 avatar Nov 16 '21 04:11 romil611

@romil611 I think I got that error when playing with the dynamic axes of onnx export. My use case doesn't need dynamic axes so I have set their size static for now. Will ping you if I get back to that.

guillesanbri avatar Nov 16 '21 08:11 guillesanbri

@guillesanbri I also need static sizes and didn't add the dynamic_axes option in the torch.onnx.export call. My guess was that dynamic axes were being used somewhere inside, which is causing the issue. If you remember anything related to it, do tell. Anyway, thanks for the reply!

romil611 avatar Nov 16 '21 09:11 romil611

> Hi, after making the change from unflatten to view, I get the following error: RuntimeError: Unsupported: ONNX export of transpose for tensor of unknown rank.

@romil611 I saw this error when I was calling torch.onnx.export on the scripted version of the model. Make sure you don't have

model = torch.jit.script(model)

anywhere preceding your export call.

For me, the other secret for a successful export (in addition to the edits @guillesanbri has already suggested) was to keep everything on the CPU. According to this comment, the device the model was running on when exported does not affect the resultant onnx model.

ghost avatar Nov 17 '21 17:11 ghost

For me, torch.onnx.export worked with the main branch itself once I changed unflatten to view.

romil611 avatar Nov 17 '21 18:11 romil611

Thanks to everyone for their efforts. The problem is fixed by using the latest version of PyTorch.

3togo avatar Nov 20 '21 04:11 3togo

> Hi, I have been trying to export DPT-Hybrid to ONNX using the dpt_scriptable branch and also encountered RuntimeError: NYI: Named tensors are not supported with the tracer. [...] I have successfully exported the ONNX model by removing the two unflatten calls and using view instead.

I tried to export DPT-Hybrid to ONNX today using the dpt_scriptable branch, but encountered the issue shown in the screenshot (image not reproduced here). Do you know why? It seems to be a bug in the model returned by timm.create_model("vit_base_resnet50_384", pretrained=pretrained). I tried changing x = self.model.patch_embed.backbone(x) to x = self.model.patch_embed.backbone(x.contiguous()), but it doesn't work. Do you know what the problem is? Thanks ahead!

I solved the above problem by downgrading timm, but then ran into another one: Exporting the operator std_mean to ONNX opset version 12 is not supported. Please open a bug to request ONNX export support for the missing operator. Does anyone know how to solve it?

jucic avatar Dec 07 '21 13:12 jucic

@guillesanbri @ranftlr It seems that the converted ONNX model can only support inputs of a static size? The patch size cannot be changed once the model is converted to ONNX.

Tord-Zhang avatar Mar 01 '22 07:03 Tord-Zhang

I got the following errors when I tried to trace "dpt_beit_large_384.pt".

Any help?

Traceback (most recent call last):
  File "/work/gitee/MiDaS-cpp/python/export_model.py", line 162, in <module>
    convert(in_model_type, in_model_path, out_model_path)
  File "/work/gitee/MiDaS-cpp/python/export_model.py", line 84, in convert
    sm = torch.jit.trace(model, sample, strict=False)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eli/.local/lib/python3.11/site-packages/torch/jit/_trace.py", line 794, in trace
    return trace_module(
           ^^^^^^^^^^^^^
  File "/home/eli/.local/lib/python3.11/site-packages/torch/jit/_trace.py", line 1084, in trace_module
    _check_trace(
  File "/home/eli/.local/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/eli/.local/lib/python3.11/site-packages/torch/jit/_trace.py", line 562, in _check_trace
    raise TracingCheckError(*diag_info)
torch.jit._trace.TracingCheckError: Tracing failed sanity checks!
ERROR: Graphs differed across invocations!
	Graph diff:
		  graph(%self.1 : __torch__.midas.dpt_depth.DPTDepthModel,
		        %x.1 : Tensor):
		    %scratch : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %output_conv : __torch__.torch.nn.modules.container.Sequential = prim::GetAttr[name="output_conv"](%scratch)
		    %scratch.15 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %refinenet1 : __torch__.midas.blocks.FeatureFusionBlock_custom = prim::GetAttr[name="refinenet1"](%scratch.15)
		    %scratch.13 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %refinenet2 : __torch__.midas.blocks.FeatureFusionBlock_custom = prim::GetAttr[name="refinenet2"](%scratch.13)
		    %scratch.11 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %refinenet3 : __torch__.midas.blocks.FeatureFusionBlock_custom = prim::GetAttr[name="refinenet3"](%scratch.11)
		    %scratch.9 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %refinenet4 : __torch__.midas.blocks.FeatureFusionBlock_custom = prim::GetAttr[name="refinenet4"](%scratch.9)
		    %scratch.7 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %layer4_rn : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="layer4_rn"](%scratch.7)
		    %scratch.5 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %layer3_rn : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="layer3_rn"](%scratch.5)
		    %scratch.3 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %layer2_rn : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="layer2_rn"](%scratch.3)
		    %scratch.1 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %layer1_rn : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="layer1_rn"](%scratch.1)
		    %pretrained : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="pretrained"](%self.1)
		    %act_postprocess4 : __torch__.torch.nn.modules.container.Sequential = prim::GetAttr[name="act_postprocess4"](%pretrained)
		    %_4.7 : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="4"](%act_postprocess4)
		    %pretrained.83 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="pretrained"](%self.1)
		    %act_postprocess4.5 : __torch__.torch.nn.modules.container.Sequential = prim::GetAttr[name="act_postprocess4"](%pretrained.83)
		    %_3.9 : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="3"](%act_postprocess4.5)
		    %pretrained.81 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="pretrained"](%self.1)
		    %act_postprocess3 : __torch__.torch.nn.modules.container.Sequential = prim::GetAttr[name="act_postprocess3"](%pretrained.81)

3togo avatar Mar 22 '23 09:03 3togo

https://github.com/isl-org/MiDaS/issues/189 I can verify that dpt_large_384.pt in MiDaS v3.1 can be traced using torch.jit.trace, but I cannot export the model to ONNX. I'm receiving RuntimeError: Input type (float) and bias type (c10::Half) should be the same. Has anyone had any experience exporting the latest models to ONNX?

foemre avatar Apr 10 '23 12:04 foemre