sam-hq
onnx export script does not support vit_tiny?
any updates on this?
I was able to export it by modifying the export script in two places:

1. Add a `"vit_tiny": 160` entry to the `encoder_embed_dim_dict` dictionary here: https://github.com/SysCV/sam-hq/blob/322488826bda616798901c6280d13a9a90444ae7/scripts/export_onnx_model.py#L142

   ```python
   encoder_embed_dim_dict = {"vit_tiny": 160, "vit_b": 768, "vit_l": 1024, "vit_h": 1280}
   ```

2. Modify the `interm_embeddings` dummy input to have 1 instead of 4 as its first dimension here: https://github.com/SysCV/sam-hq/blob/322488826bda616798901c6280d13a9a90444ae7/scripts/export_onnx_model.py#L148

   ```python
   "interm_embeddings": torch.randn(1, 1, *embed_size, encoder_embed_dim, dtype=torch.float),
   ```

Then specify `--model-type vit_tiny` when calling the script.
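The resulting dummy-input shape can be sketched with numpy standing in for torch. The embedding dimension and the leading `1` follow the values above; the 64×64 grid size is an assumption based on the standard 1024-pixel input with 16-pixel patches, so check it against your checkpoint:

```python
import numpy as np

# Lookup table with the added vit_tiny entry:
encoder_embed_dim_dict = {"vit_tiny": 160, "vit_b": 768, "vit_l": 1024, "vit_h": 1280}

model_type = "vit_tiny"
encoder_embed_dim = encoder_embed_dim_dict[model_type]
embed_size = (64, 64)  # assumption: 1024-px input / 16-px patches

# vit_tiny appears to expose a single intermediate embedding, hence a
# leading 1 (the ViT-B/L/H variants use 4 here):
interm_embeddings = np.random.randn(
    1, 1, *embed_size, encoder_embed_dim
).astype(np.float32)
print(interm_embeddings.shape)  # (1, 1, 64, 64, 160)
```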
This is an experimental result I arrived at through trial and error, without fully understanding the internals.

Please also modify the line

```python
vit_features = interm_embeddings[0].permute(0, 3, 1, 2)
```

in `segment_anything/utils/onnx.py` by removing the array indexing; the output of the encoder model must of course be adjusted accordingly. Otherwise an error occurs in the WebGPU environment, which is likely a bug in onnxruntime. Change it to the following:

```python
vit_features = interm_embeddings.permute(0, 3, 1, 2)
```
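To see what the change does, here is a minimal numpy sketch (`transpose` standing in for torch's `permute`); the 64×64×160 shape is an assumption matching the vit_tiny values discussed above:

```python
import numpy as np

# Assumed encoder output for vit_tiny: a single (B, H, W, C) tensor
# rather than a stack of intermediate embeddings.
interm_embeddings = np.random.randn(1, 64, 64, 160).astype(np.float32)

# Original code indexed a stack first:
#   vit_features = interm_embeddings[0].permute(0, 3, 1, 2)
# Suggested change drops the indexing and just reorders
# channels-last to channels-first (NHWC -> NCHW):
vit_features = interm_embeddings.transpose(0, 3, 1, 2)
print(vit_features.shape)  # (1, 160, 64, 64)
```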