Add ONNX export for ViTMatte models
What does this PR do?
As title says :)
I haven't added a test yet, since I couldn't find a tiny random model on the HF hub.
Fixes # (issue)
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you make sure to update the documentation with your changes?
- [ ] Did you write any new necessary tests?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
There's still an issue with shapes other than the one the model was exported with:
$ optimum-cli export onnx --model hustvl/vitmatte-small-distinctions-646 o --task image-matting
Framework not specified. Using pt to export to ONNX.
Using the export variant default. Available variants are:
- default: The default ONNX variant.
Using framework PyTorch: 2.1.1+cu121
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:118: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if num_channels != self.num_channels:
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:100: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size = int(math.sqrt(num_position))
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:101: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if size * size != num_position:
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:104: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if size != height or size != width:
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:411: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pad_height > 0 or pad_width > 0:
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:153: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
max_rel_dist = int(2 * max(q_size, k_size) - 1)
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:153: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
max_rel_dist = int(2 * max(q_size, k_size) - 1)
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:155: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if rel_pos.shape[0] != max_rel_dist:
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:167: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
q_coords = torch.arange(q_size)[:, None] * max(k_size / q_size, 1.0)
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:168: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
k_coords = torch.arange(k_size)[None, :] * max(q_size / k_size, 1.0)
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:169: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
relative_coords = (q_coords - k_coords) + (k_size - 1) * max(q_size / k_size, 1.0)
/usr/local/python/3.10.13/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:447: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if patch_height > height or patch_width > width:
Post-processing the exported models...
Deduplicating shared (tied) weights...
Validating ONNX model o/model.onnx...
-[✓] ONNX model output names match reference model (alphas)
- Validating ONNX Model output "alphas":
-[✓] (2, 1, 64, 64) matches (2, 1, 64, 64)
-[✓] all values close (atol: 1e-05)
The ONNX export succeeded and the exported model was saved at: o
Running it with input of shape [1, 1, 64, 92] gives:
Error: Non-zero status code returned while running Gather node. Name:'/backbone/encoder/layer.2/attention/Gather_4' Status Message: indices element out of data bounds, idx=7 must be within the inclusive range [-7,6]
(but it works in PyTorch)
I've narrowed it down to these 2 lines (a quick illustration follows the links):
- https://github.com/huggingface/transformers/blob/df5c5c62ae253055336f5bb0828ca8e3e15ab6bd/src/transformers/models/vitdet/modeling_vitdet.py#L153
- https://github.com/huggingface/transformers/blob/df5c5c62ae253055336f5bb0828ca8e3e15ab6bd/src/transformers/models/vitdet/modeling_vitdet.py#L100
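For context, here's a minimal sketch (plain PyTorch, my own illustration rather than the actual vitdet code) of the failure mode behind those warnings: a Python-level cast on a shape-derived value is evaluated once at trace time and baked into the graph as a constant, so the exported model only works for the export shape.

import torch

def build_rel_pos(x):
    q_size = x.shape[-1]                # a plain Python int while tracing
    max_rel_dist = int(2 * q_size - 1)  # evaluated eagerly, so the result is frozen into the graph
    return torch.arange(max_rel_dist)

traced = torch.jit.trace(build_rel_pos, torch.zeros(1, 3, 7))
print(traced(torch.zeros(1, 3, 7)).shape)  # torch.Size([13])
print(traced(torch.zeros(1, 3, 9)).shape)  # torch.Size([13]) again, not 17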
cc @NielsRogge
Hi,
Sorry, I'm not an ONNX expert, so I'm not sure how those lines should be updated. It might be better to ping someone from the Optimum team.
@xenova If you add a test with a tiny model we can merge this!
I was looking on the Hub for a tiny random model, but I couldn't find one (so I skipped adding the test). If you'd like, I can add https://huggingface.co/hustvl/vitmatte-small-composition-1k (25M params).
However, there's still an issue with Python casts (int(...) and float(...)). Is there a recommended way to handle this? For my custom exports (see here), I've basically just overridden some of the Python casts to use PyTorch casts (.to(...)) instead.
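Concretely, the kind of rewrite I mean looks like this (a standalone illustration, not the actual vitdet source):

import torch

x = torch.tensor(17.9)
size_py = int(x)             # Python cast: runs at trace time, frozen as a constant
size_pt = x.to(torch.int64)  # tensor cast: stays an op in the exported graph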
Will convert to draft while we discuss this.
@xenova Thank you so much for the contribution. Since I'm quite new to ONNX, can you please give me an example of how to run inference with your exported ONNX model here in Python?
Here is what I did:
import onnxruntime
from PIL import Image
import numpy as np
from transformers import VitMatteImageProcessor, VitMatteForImageMatting
import torch
from huggingface_hub import hf_hub_download
ort_sess = onnxruntime.InferenceSession("./model/model.onnx",
providers=['CPUExecutionProvider'])
filepath = hf_hub_download(
repo_id="hf-internal-testing/image-matting-fixtures", filename="image.png", repo_type="dataset"
)
image = Image.open(filepath).convert("RGB")
filepath = hf_hub_download(
repo_id="hf-internal-testing/image-matting-fixtures", filename="trimap.png", repo_type="dataset"
)
trimap = Image.open(filepath).convert("L")
processor = VitMatteImageProcessor.from_pretrained("hustvl/vitmatte-small-composition-1k")
inputs = processor(images=image, trimaps=trimap, return_tensors="pt")
alpha = ort_sess.run(inputs)
However, this gave me:
TypeError: run() missing 1 required positional argument: 'input_feed'
In addition, I wonder if this can be done:
providers=['CPUExecutionProvider'] # change to TensorrtExecutionProvider or CUDAExecutionProvider
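(For reference, onnxruntime takes a priority list of providers and falls back down the list, so something like this should work when onnxruntime-gpu is installed; an untested sketch on my side:)

ort_sess = onnxruntime.InferenceSession(
    "./model/model.onnx",
    providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'],
)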
Thank you so much for your help!
After digging deeper into this, I may have found the proper way to run inference with the exported model, yet I still got an error. Here was my snippet:
ort_sess.run(['alphas'], {'pixel_values': torch.rand(1, 4, 640, 960, dtype=torch.float32).cpu().numpy()})
However, executing the above code returned:
File /media/my_random_things/python3.8_environment/venv3.8_onnx/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:220, in Session.run(self, output_names, input_feed, run_options)
218 output_names = [output.name for output in self._outputs_meta]
219 try:
--> 220 return self._sess.run(output_names, input_feed, run_options)
221 except C.EPFail as err:
222 if self._enable_fallback:
InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Gather node. Name:'Gather_720' Status Message: indices element out of data bounds, idx=59 must be within the inclusive range [-7,6]
The inference in JS by @xenova available here seems very straightforward. However, since I'm not very familiar with JS, I can't tell what it does under the hood, yet I believe there should be a similarly simple way to do it in Python.
Can someone please help me with this? Thank you so much!
@EricLe-dev You were very close! Here's some example code:
import onnxruntime
from PIL import Image
import numpy as np
from transformers import VitMatteImageProcessor
from huggingface_hub import hf_hub_download
ort_sess = onnxruntime.InferenceSession(
"./model.onnx", providers=['CPUExecutionProvider']
)
filepath = hf_hub_download(
repo_id="hf-internal-testing/image-matting-fixtures", filename="image.png", repo_type="dataset"
)
image = Image.open(filepath).convert("RGB")
filepath = hf_hub_download(
repo_id="hf-internal-testing/image-matting-fixtures", filename="trimap.png", repo_type="dataset"
)
trimap = Image.open(filepath).convert("L")
processor = VitMatteImageProcessor.from_pretrained("hustvl/vitmatte-small-composition-1k")
inputs = processor(images=image, trimaps=trimap, return_tensors="pt")
outputs = ort_sess.run(None, {'pixel_values' : inputs['pixel_values'].numpy()})
alphas = outputs[0]
# Visualize
result = Image.fromarray(np.uint8(alphas[0][0] * 255), mode='L')
result
Produces:
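(A note if you're running this as a plain script rather than in a notebook: the trailing result line won't display anything there, so save the matte instead:)

result.save("alpha.png")  # any path you like; PNG keeps the 8-bit matte intact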
@xenova Thank you so much for your answer. I encountered another issue that I would appreciate your help with. The code snippet you gave ran perfectly with the model.onnx that you exported (available here). However, it did not run properly with my own exported model.onnx, produced with exactly the same command as you proposed.
My current system is:
python 3.8.17
torch 1.11.0+cu113
transformers 4.39.2
optimum 1.16.0.dev0 #(installed from your branch)
The difference here is that my system has CUDA 11.3, and I could not install any PyTorch version newer than 1.11.0+cu113.
With the mentioned system, when I run:
optimum-cli export onnx --model hustvl/vitmatte-small-distinctions-646 o --task image-matting
It would complain that:
ImportError: cannot import name 'is_torch_less_than_1_11' from 'transformers.pytorch_utils'
In order to get it running, I did a little trick: I changed the line

if is_torch_less_than_1_11:

to

if False:

This trick allowed me to run the export command. All of my output was exactly like yours, except that it has:
Weight deduplication check in the ONNX export requires accelerate. Please install accelerate to run it.
Validating ONNX model o/model.onnx...
-[✓] ONNX model output names match reference model (alphas)
- Validating ONNX Model output "alphas":
-[✓] (2, 1, 64, 64) matches (2, 1, 64, 64)
-[x] values not close enough, max diff: 4.124641418457031e-05 (atol: 1e-05)
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:
- alphas: max diff = 4.124641418457031e-05.
The exported model was saved at: o
Running inference on the exported model with your code snippet gave me this error:
InvalidArgument Traceback (most recent call last)
Cell In[2], line 21
18 processor = VitMatteImageProcessor.from_pretrained("hustvl/vitmatte-small-composition-1k")
19 inputs = processor(images=image, trimaps=trimap, return_tensors="pt")
---> 21 outputs = ort_sess.run(None, {'pixel_values' : inputs['pixel_values'].numpy()})
22 alphas = outputs[0]
24 # Visualize
File /media/my_random_things/python3.8_environment/venv3.8_onnx/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:220, in Session.run(self, output_names, input_feed, run_options)
218 output_names = [output.name for output in self._outputs_meta]
219 try:
--> 220 return self._sess.run(output_names, input_feed, run_options)
221 except C.EPFail as err:
222 if self._enable_fallback:
InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Gather node. Name:'Gather_720' Status Message: indices element out of data bounds, idx=59 must be within the inclusive range [-7,6]
Does it have anything to do with the version of Transformers, or the version of PyTorch (since you are using Torch 2.1.1+cu121 and Python 3.10)?
Can you please share your full pip list?
Your help is much appreciated!
@xenova Update:
After installing onnxruntime-gpu==1.14.0, I ran the following command:
optimum-cli export onnx --model hustvl/vitmatte-base-distinctions-646 o --task image-matting
I got exactly the same output as yours, which is:
Post-processing the exported models...
Weight deduplication check in the ONNX export requires accelerate. Please install accelerate to run it.
Validating ONNX model o/model.onnx...
-[✓] ONNX model output names match reference model (alphas)
- Validating ONNX Model output "alphas":
-[✓] (2, 1, 64, 64) matches (2, 1, 64, 64)
-[✓] all values close (atol: 1e-05)
The ONNX export succeeded and the exported model was saved at: o
Running the code snippet you shared with your exported model works perfectly. However, running it with my exported model gave this error:
InvalidArgument Traceback (most recent call last)
Cell In[6], line 21
18 processor = VitMatteImageProcessor.from_pretrained("Xenova/vitmatte-base-composition-1k")
19 inputs = processor(images=image, trimaps=trimap, return_tensors="pt")
---> 21 outputs = ort_sess.run(None, {'pixel_values' : inputs['pixel_values'].numpy()})
22 alphas = outputs[0]
24 # Visualize
File /media/wand/research/Research/python3.8_environment/venv3.8_onnx/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:200, in Session.run(self, output_names, input_feed, run_options)
198 output_names = [output.name for output in self._outputs_meta]
199 try:
--> 200 return self._sess.run(output_names, input_feed, run_options)
201 except C.EPFail as err:
202 if self._enable_fallback:
InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Gather node. Name:'Gather_720' Status Message: indices element out of data bounds, idx=59 must be within the inclusive range [-7,6]
Using Netron, I was able to see the difference between your exported model and my exported model.
Please see the attached images:
I believe that in order to get the code running, you changed some layers in the model. Can you please confirm this and give me some more ideas on how to get it working? Thank you so much!
@xenova Update 2: Sorry to keep posting in this thread, but the more I dive into this, the more interesting things I find, and I can't stop sharing them with you guys. It seems I was wrong: you did not modify the layers of the model; instead, there were some differences between my environment and yours.
Here is my pip list:
Package Version
------------------ ------------
aiohttp 3.9.3
aiosignal 1.3.1
async-timeout 4.0.3
attrs 23.2.0
certifi 2022.12.7
charset-normalizer 2.1.1
coloredlogs 15.0.1
datasets 2.18.0
dill 0.3.8
evaluate 0.4.1
filelock 3.9.0
flatbuffers 24.3.25
frozenlist 1.4.1
fsspec 2024.2.0
huggingface-hub 0.22.2
humanfriendly 10.0
idna 3.4
Jinja2 3.1.2
MarkupSafe 2.1.3
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.2.1
numpy 1.26.3
onnx 1.16.0
onnxruntime 1.17.1
onnxruntime-gpu 1.14.0
optimum 1.16.0.dev0
packaging 24.0
pandas 2.2.1
pillow 10.2.0
pip 24.0
protobuf 5.26.1
pyarrow 15.0.2
pyarrow-hotfix 0.6
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.1
regex 2023.12.25
requests 2.28.1
responses 0.18.0
safetensors 0.4.2
sentencepiece 0.2.0
setuptools 69.1.0
six 1.16.0
sympy 1.12
tokenizers 0.15.2
torch 2.1.1+cu118
torchaudio 2.1.1+cu118
torchvision 0.16.1+cu118
tqdm 4.66.2
transformers 4.36.0.dev0
triton 2.1.0
typing_extensions 4.8.0
tzdata 2024.1
urllib3 1.26.13
wheel 0.42.0
xxhash 3.4.1
yarl 1.9.4
I'm running the code with
Python 3.10.12
And this is my nvidia-smi output. My CUDA version is 11.4, but I could install Torch 2.1.1+cu118 and it still works, so I'm not sure if this is an issue.
Sun Mar 31 14:35:14 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.182.03 Driver Version: 470.182.03 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:02:00.0 Off | N/A |
| 0% 38C P8 26W / 390W | 3107MiB / 24268MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:82:00.0 On | N/A |
| 0% 44C P8 37W / 390W | 4396MiB / 24265MiB | 6% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
Exporting the model with this environment gave me a slightly different model (viewed in Netron), yet it is still missing the Pow and some other layers compared to yours. This indicates that there is something wrong in my env. Can you please share the full environment that you used to export the model? Thank you so much!
As mentioned here, I did need to update some of the transformers modelling code to get the export working. In particular, you need to watch out for warnings that mention converting to a Python float/integer, and then change those calls to int() and float() to .to(torch.int64) and .to(torch.float32), respectively.
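For example, the two flagged lines could be rewritten along these lines, assuming the shape values arrive as 0-dim tensors during export (my reading of the warnings, not the exact patch):

import torch

num_position = torch.tensor(196)                     # e.g. 14 x 14 position embeddings
q_size, k_size = torch.tensor(64), torch.tensor(92)

# modeling_vitdet.py:100 -- was: size = int(math.sqrt(num_position))
size = torch.sqrt(num_position.float()).to(torch.int64)

# modeling_vitdet.py:153 -- was: max_rel_dist = int(2 * max(q_size, k_size) - 1)
max_rel_dist = (2 * torch.maximum(q_size, k_size) - 1).to(torch.int64)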
@xenova Thank you for your reply. Here was my output:
optimum-cli export onnx --model hustvl/vitmatte-base-distinctions-646 o --task image-matting
`AnnotionFormat` is deprecated and will be removed in v4.38. Please use `transformers.image_utils.AnnotationFormat` instead.
Framework not specified. Using pt to export to ONNX.
Using the export variant default. Available variants are:
- default: The default ONNX variant.
Using framework PyTorch: 2.1.1+cu118
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:118: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if num_channels != self.num_channels:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:100: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size = int(math.sqrt(num_position))
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:101: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if size * size != num_position:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:104: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if size != height or size != width:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:411: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pad_height > 0 or pad_width > 0:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:153: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
max_rel_dist = int(2 * max(q_size, k_size) - 1)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:153: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
max_rel_dist = int(2 * max(q_size, k_size) - 1)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:155: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if rel_pos.shape[0] != max_rel_dist:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:167: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
q_coords = torch.arange(q_size)[:, None] * max(k_size / q_size, 1.0)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:168: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
k_coords = torch.arange(k_size)[None, :] * max(q_size / k_size, 1.0)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:169: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
relative_coords = (q_coords - k_coords) + (k_size - 1) * max(q_size / k_size, 1.0)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:447: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if patch_height > height or patch_width > width:
Post-processing the exported models...
Weight deduplication check in the ONNX export requires accelerate. Please install accelerate to run it.
Validating ONNX model o/model.onnx...
-[✓] ONNX model output names match reference model (alphas)
- Validating ONNX Model output "alphas":
-[✓] (2, 1, 64, 64) matches (2, 1, 64, 64)
-[✓] all values close (atol: 1e-05)
The ONNX export succeeded and the exported model was saved at: o
I believe what you meant was editing the lines mentioned in those warnings to use .to(torch.int64) and .to(torch.float32)?
For example, line 100 in modeling_vitdet.py:

size = int(math.sqrt(num_position))  # change this to: size = math.sqrt(num_position).to(torch.int64)

Is this what you did? Thank you so much!
After diving into thousands of lines of code (using VS Code, searching for direct Python casts of int(...) and float(...)), I could not find any calls to int() or float() that are directly related to transformers/models/vitdet or transformers/models/vitmatte.
My export output is exactly the same as yours, with the same TracerWarnings shown earlier.
Most of those warnings are about converting a tensor to a Python boolean. Two warning lines mention converting a tensor to a Python float or int:

- transformers/models/vitdet/modeling_vitdet.py:100: size = int(math.sqrt(num_position)). The value being cast is already a Python float (from math.sqrt), so it does not really cast any tensor here.
- transformers/models/vitdet/modeling_vitdet.py:153: max_rel_dist = int(2 * max(q_size, k_size) - 1). The result of 2 * max(q_size, k_size) - 1 is not of type torch.Tensor, meaning we don't do any int(torch.Tensor) here.
@xenova can you please point me to the right file or line of code that you modified? Thank you a ton!
Let's merge this once https://github.com/huggingface/transformers/pull/30065 is released
Question is, where the hell do I get the model?