sam-hq

onnx export script does not support vit_tiny?

Open shenjun1994 opened this issue 1 year ago • 3 comments

Sep 27 '23 10:09 shenjun1994

Any updates on this?

Dec 04 '23 11:12 Sripriyan

I was able to export it by modifying the export script in two places:

Add a "vit_tiny": 160 entry to the encoder_embed_dim_dict dictionary here: https://github.com/SysCV/sam-hq/blob/322488826bda616798901c6280d13a9a90444ae7/scripts/export_onnx_model.py#L142

encoder_embed_dim_dict = {"vit_tiny": 160, "vit_b": 768, "vit_l": 1024, "vit_h": 1280}

And modify the interm_embeddings dummy input to have 1 instead of 4 as the first dimension here: https://github.com/SysCV/sam-hq/blob/322488826bda616798901c6280d13a9a90444ae7/scripts/export_onnx_model.py#L148

"interm_embeddings": torch.randn(1, 1, *embed_size, encoder_embed_dim, dtype=torch.float),

Then specify --model-type vit_tiny while calling the script.

I arrived at this through trial and error, without fully understanding the internals, so treat it as an experimental result.
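For anyone checking the shapes, here is a rough pure-Python sketch of what the two edits do to the interm_embeddings dummy input. It is not the export script itself: the helper name interm_embeddings_shape, the INTERM_COUNT table, and the 64 x 64 embed_size default are assumptions made for illustration. Only the embedding dimensions and the 1-versus-4 first dimension come from the steps above.

```python
# Embedding dimensions per model type, including the added vit_tiny entry.
ENCODER_EMBED_DIM = {"vit_tiny": 160, "vit_b": 768, "vit_l": 1024, "vit_h": 1280}

# Hypothetical table: size of the first dimension of the dummy input.
# The steps above only say "1 instead of 4" for vit_tiny.
INTERM_COUNT = {"vit_tiny": 1, "vit_b": 4, "vit_l": 4, "vit_h": 4}


def interm_embeddings_shape(model_type, embed_size=(64, 64)):
    """Return the shape of the dummy interm_embeddings input.

    embed_size=(64, 64) is an assumed default; the real script reads it
    from the loaded model.
    """
    if model_type not in ENCODER_EMBED_DIM:
        raise ValueError(f"unsupported model type: {model_type}")
    return (
        INTERM_COUNT[model_type],
        1,
        *embed_size,
        ENCODER_EMBED_DIM[model_type],
    )


tiny_shape = interm_embeddings_shape("vit_tiny")
base_shape = interm_embeddings_shape("vit_b")
print(tiny_shape)
print(base_shape)
```

In the actual script the same tuple would be passed to torch.randn(..., dtype=torch.float), as in the line quoted above.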

Feb 05 '24 04:02 atesgoral

In segment_anything/utils/onnx.py, please change the line vit_features = interm_embeddings[0].permute(0, 3, 1, 2) by removing the array indexing, so that it reads:

vit_features = interm_embeddings.permute(0, 3, 1, 2)

The output of the encoder model has to be adjusted to match. Otherwise an error occurs in the WebGPU environment, which is likely a bug in onnxruntime.
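To see why the encoder output has to change as well, here is a small shape-only sketch in plain Python. The helper permute_shape and the concrete sizes (64 x 64, 160 channels, taken from the vit_tiny comment above) are assumptions for illustration, not code from the repository.

```python
def permute_shape(shape, order):
    """Return the shape after reordering axes, as a tensor permute would."""
    if sorted(order) != list(range(len(shape))):
        raise ValueError("order must list every axis of the input exactly once")
    return tuple(shape[axis] for axis in order)


# Original export: the decoder receives a stacked 5-D input and indexes it.
stacked = (1, 1, 64, 64, 160)   # assumed vit_tiny dummy input shape
indexed = stacked[1:]           # shape of interm_embeddings[0]
original_features = permute_shape(indexed, (0, 3, 1, 2))

# Modified export: no indexing, so the encoder must emit a 4-D tensor directly.
single = (1, 64, 64, 160)
modified_features = permute_shape(single, (0, 3, 1, 2))

print(original_features, modified_features)
```

Both paths end at the same feature shape. Applying the four-axis permute to the stacked 5-D input would fail, which is why the encoder output needs to lose its leading dimension once the indexing is removed.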

Mar 28 '24 10:03 njms19841