ORTOptimizer for the model type Segformer

Open zachmayer opened this issue 10 months ago • 5 comments

What does this PR do?

Adds support for the Segformer model type to ORTOptimizer. Based on the advice I got in https://github.com/huggingface/optimum/issues/1761, but I decided to start with Segformer.

Fixes # (issue)

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you make sure to update the documentation with your changes?
  • [x] Did you write any new necessary tests?

Who can review?

@mht-sharma maybe?

zachmayer avatar Apr 18 '24 22:04 zachmayer

@mht-sharma @IlyasMoutawwakil the optimizer is definitely doing something to the model. I tested it on the vikp/surya_layout segformer: I exported the original model to ONNX, optimized it, then quantized the optimized model, and counted the number of nodes in each graph:

Original model graph: 2900
Optimized model graph: 1263
Quantized model graph: 1709

The optimizer definitely prunes nodes from the graph and the resulting model is faster for inference when I test it.

zachmayer avatar Apr 25 '24 13:04 zachmayer

I tried 3 ways of handling the list-valued config attributes (e.g. num_attention_heads, hidden_size):

  • Sum
  • Max
  • Replace the list with 0 (and let Microsoft's optimizer infer the value from the graph)

Sum and max yield pretty similar results, so I went with sum. Zero did not seem to work well.
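The three strategies can be sketched with a small, hypothetical helper (illustrative code, not the PR's actual implementation; the per-stage list is chosen to be consistent with the optimizer log later in this thread, where sum gave 16 heads and the detected value was 8):

```python
# Hypothetical helper: Segformer configs store per-stage lists (e.g.
# num_attention_heads=[1, 2, 5, 8]); ONNX Runtime's optimizer wants one int.
def collapse(value, strategy):
    if not isinstance(value, (list, tuple)):
        return value
    if strategy == "sum":
        return sum(value)
    if strategy == "max":
        return max(value)
    if strategy == "zero":
        return 0  # 0 asks ORT's optimizer to infer the value from the graph
    raise ValueError(f"unknown strategy: {strategy}")

heads = [1, 2, 5, 8]  # per-stage heads consistent with the log below
print(collapse(heads, "sum"))   # 16 (what ORT reported as --num_heads)
print(collapse(heads, "max"))   # 8  (matches the detected value)
print(collapse(heads, "zero"))  # 0
```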

The tests pass on the PR now, and when I test this optimizer on a real segformer, it definitely makes changes to the model graph.

zachmayer avatar Apr 25 '24 14:04 zachmayer

@mht-sharma @IlyasMoutawwakil what do you think? The tests pass when I run them locally, and the optimizer seems to reduce the size of the model a lot (almost 60%).

zachmayer avatar Apr 29 '24 13:04 zachmayer

@zachmayer to visualize the graphs you can use https://netron.app/. I ran your code on vikp/surya_layout, and sum gives the optimizer the wrong hidden_size/num_attention_heads, which are then ignored in favor of the detected values (from the last encoder block):

Optimizing model...
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
--num_heads is 16. Detected value is 8. Using detected value.
--hidden_size is 1024. Detected value is 512. Using detected value.

Those are the max values (or the last values in the lists). Also, comparing these two (simple export vs. O2 using max):

  • https://netron.app/?url=https://huggingface.co/IlyasMoutawwakil/segformers/blob/main/onnx_surya/model.onnx
  • https://netron.app/?url=https://huggingface.co/IlyasMoutawwakil/segformers/blob/main/onnx_surya_O2/model.onnx

I don't see any optimizations. @mht-sharma, any idea which operators we should be looking for?
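One way to check which operators to look for: diff the op-type histograms of the plain export vs. the O2 graph. Fused operators such as Attention, LayerNormalization, or FastGelu appearing only after optimization would show that the transformer fusions actually fired. A sketch with made-up op lists (in practice you would read them from the two model.onnx files):

```python
from collections import Counter

# Hypothetical op-type lists, as obtained from a loaded graph via
#   [n.op_type for n in onnx.load(path).graph.node]
exported = ["MatMul", "Add", "Softmax", "MatMul", "Erf", "Mul", "Add"]
optimized = ["Attention", "FastGelu", "Add"]

# Ops that only appear in the optimized graph hint at successful fusions.
new_ops = Counter(optimized) - Counter(exported)
print(sorted(new_ops))  # ['Attention', 'FastGelu']
```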

IlyasMoutawwakil avatar Apr 29 '24 14:04 IlyasMoutawwakil

Huh. I also tried using 0, which the docs say should infer the number of heads from the model graph.

I changed from sum to max and the values seem correct now: 8 for num_heads and 512 for hidden_size.

In my testing the optimized model definitely has a smaller graph and faster inference, so the optimizer is doing something to the model.

zachmayer avatar May 01 '24 14:05 zachmayer

@IlyasMoutawwakil I changed the normalized config to use 0 instead of the max. How's it look now?
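The zero-based approach can be sketched as a stand-in for the normalized config (hypothetical class and names, not the merged code): any list-valued attribute is reported as 0, so ONNX Runtime's optimizer falls back to graph-based detection. The example values are consistent with the log above (sum 1024, detected 512):

```python
class ZeroingNormalizedConfig:
    """Stand-in for a normalized config: list-valued attributes become 0."""

    def __init__(self, config: dict):
        self._config = config

    def __getattr__(self, name):
        try:
            value = self._config[name]
        except KeyError:
            raise AttributeError(name)
        # 0 tells ONNX Runtime's optimizer to detect the value from the graph.
        return 0 if isinstance(value, list) else value

cfg = ZeroingNormalizedConfig(
    {"num_attention_heads": [1, 2, 5, 8], "hidden_size": [64, 128, 320, 512]}
)
print(cfg.num_attention_heads)  # 0 -> optimizer detects 8 from the graph
print(cfg.hidden_size)          # 0 -> optimizer detects 512 from the graph
```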

zachmayer avatar Jun 04 '24 12:06 zachmayer

LGTM ! sorry for the late reply, and thanks @zachmayer

IlyasMoutawwakil avatar Jun 04 '24 13:06 IlyasMoutawwakil

Awesome! No worries and thank you!

zachmayer avatar Jun 04 '24 14:06 zachmayer

Hmm looks like I have some tests to fix

zachmayer avatar Jun 04 '24 20:06 zachmayer

Looks like at least one test failed because of dependencies, which is odd.

Let me rebase off the main branch and see what happens

zachmayer avatar Jun 04 '24 20:06 zachmayer

The dependency failures are not from this PR; PyPI is having issues on some platforms. Merging 🤗

IlyasMoutawwakil avatar Jun 05 '24 06:06 IlyasMoutawwakil

awesome! Thank you for the feedback and for helping me get this over the line!

zachmayer avatar Jun 05 '24 19:06 zachmayer