ultralytics icon indicating copy to clipboard operation
ultralytics copied to clipboard

refactor: use simplify when dynamic is true

Open inisis opened this issue 11 months ago β€’ 4 comments

πŸ› οΈ PR Summary

Made with ❀️ by Ultralytics Actions

🌟 Summary

Minor test improvements for ONNX export and simplification logic. πŸ§ͺ✨

πŸ“Š Key Changes

  • Removed a test condition that skipped ONNX export tests when both simplify and dynamic were enabled.
  • Updated the ONNX export test to always use simplify=True.

🎯 Purpose & Impact

  • Ensures more thorough testing of ONNX export with model simplification, even in dynamic scenarios.
  • Helps catch potential issues earlier, leading to more robust export functionality for users.
  • No changes to user-facing features; these are internal test improvements.

inisis avatar May 09 '25 15:05 inisis

πŸ‘‹ Hello @inisis, thank you for submitting a PR to ultralytics/ultralytics πŸš€! This is an automated response to help guide your contribution. An Ultralytics engineer will review and assist you soon.

Please review this checklist to ensure your PR is ready for integration:

  • βœ… Define a Purpose: Make sure your PR description clearly explains the goal of your changes and links to any relevant issues. Please keep your commit messages clear and consistent.
  • βœ… Synchronize with Source: Verify your branch is up to date with the main branch of ultralytics/ultralytics. If needed, use the 'Update branch' button or run git pull and git merge main locally.
  • βœ… Ensure CI Checks Pass: Confirm all Ultralytics Continuous Integration (CI) checks are passing. Address any failing checks as necessary.
  • βœ… Update Documentation: If your changes affect documentation, remember to update the docs accordingly.
  • βœ… Add Tests: For new or updated features, ensure you have provided or updated relevant tests and that all tests pass.
  • βœ… Sign the CLA: If this is your first PR to Ultralytics, please sign our Contributor License Agreement (CLA) by commenting "I have read the CLA Document and I sign the CLA" below.
  • βœ… Minimize Changes: Limit your PR to the minimum necessary for your update or fix. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." β€” Bruce Lee

For additional details, please see our Contributing Guide. If you have any questions, feel free to ask here. Thank you for helping improve Ultralytics! 🌟πŸ§ͺ

UltralyticsAssistant avatar May 09 '25 15:05 UltralyticsAssistant

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 74.12%. Comparing base (dd798f8) to head (0cfa1be). Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #20569      +/-   ##
==========================================
+ Coverage   74.06%   74.12%   +0.06%     
==========================================
  Files         144      144              
  Lines       19230    19230              
==========================================
+ Hits        14242    14254      +12     
+ Misses       4988     4976      -12     
Flag Coverage Ξ”
Benchmarks 31.18% <100.00%> (+0.06%) :arrow_up:
GPU 36.69% <0.00%> (ΓΈ)
Tests 68.20% <100.00%> (+0.03%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar May 09 '25 15:05 codecov[bot]

@glenn-jocher @Y-T-G It's a little bit weird that the ci still uses onnxslim 0.1.51, seems that the env had been cached, because 0.1.52 was released.

PyTorch: starting from 'yolo11n-seg.pt' with input shape (1, 3, 32, 32) BCHW and output shape(s) ((1, 21, 38), (1, 32, 8, 8)) (5.9 MB)
ONNX: starting export with onnx 1.17.0 opset 19...
ONNX: slimming with onnxslim 0.1.51...
ONNX: export success βœ… 52.6s, saved as 'yolo11n-seg.onnx' (11.1 MB)

inisis avatar May 09 '25 16:05 inisis

Yes, it's cached

Y-T-G avatar May 09 '25 17:05 Y-T-G

0.1.51

============================= slowest 30 durations =============================
74.42s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-True]
72.74s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-True]
51.65s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-True]
51.20s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-True]
49.90s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-False]
49.51s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-False]
37.55s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-True]
37.40s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-True]
36.41s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-True]
35.91s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-False]
35.79s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-False]
35.70s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-False]
35.65s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-True-True]
35.53s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-False]
35.00s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-False]
34.88s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-True-False]
27.53s call     tests/test_cuda.py::test_train
8.38s call     tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-2-True-True]
5.95s call     tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-1-True-True]
4.69s call     tests/test_cuda.py::test_amp
4.19s call     tests/test_cuda.py::test_predict_sam
4.18s call     tests/test_cuda.py::test_autobatch
3.90s call     tests/test_cuda.py::test_export_onnx_matrix[pose-False-False-False-2-True-True]
3.65s call     tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-2-True-True]
3.61s call     tests/test_cuda.py::test_export_onnx_matrix[classify-True-False-False-1-True-False]
3.35s call     tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-1-True-True]
3.24s call     tests/test_cuda.py::test_export_onnx_matrix[pose-False-False-False-1-True-True]
3.17s call     tests/test_cuda.py::test_export_onnx_matrix[detect-False-False-False-2-True-True]
3.11s call     tests/test_cuda.py::test_export_onnx_matrix[detect-False-False-False-1-True-True]
2.40s call     tests/test_cuda.py::test_export_onnx_matrix[classify-True-False-False-2-True-False]
======================== 77 passed in 865.51s (0:14:[25](https://github.com/ultralytics/ultralytics/actions/runs/14814481283/job/41593280930#step:6:26)) ========================

0.1.52

============================= slowest 30 durations =============================
40.68s call     tests/test_cuda.py::test_train
20.92s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-False-True]
20.55s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-True]
19.24s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-True]
18.85s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-False-True]
13.55s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-True]
13.44s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-False-True]
13.37s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-True]
13.11s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-False-True]
12.60s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-False]
12.40s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-False-False]
12.21s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-False-False]
12.14s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-False]
10.50s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-True]
10.41s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-True]
10.20s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-False-True]
9.84s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-False-True]
9.76s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-False-True]
9.49s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-True]
9.35s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-False-False]
9.19s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-False-False]
9.04s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-False]
8.96s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-False]
8.92s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-False]
8.81s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-False-True]
8.74s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-True-True]
8.70s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-False]
8.66s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-False-False]
8.63s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-False-False]
8.39s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-True-False]

simplify=False

============================= slowest 30 durations =============================
40.09s call     tests/test_cuda.py::test_train
6.47s call     tests/test_cuda.py::test_predict_sam
4.56s call     tests/test_cuda.py::test_amp
3.56s call     tests/test_cuda.py::test_autobatch
3.34s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-True]
2.19s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-True]
2.18s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-True]
2.07s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-True]
2.05s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-False-True]
2.00s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-True]
1.99s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-False-True]
1.88s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-False-True]
1.82s call     tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-1-False-True]
1.74s call     tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-2-True-False]
1.73s call     tests/test_cuda.py::test_export_onnx_matrix[classify-True-False-False-1-True-False]
1.73s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-False]
1.70s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-True]
1.70s call     tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-2-True-True]
1.69s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-False-True]
1.68s call     tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-2-False-True]
1.67s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-True]
1.65s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-False-True]
1.64s call     tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-False-True]
1.61s call     tests/test_cuda.py::test_export_onnx_matrix[pose-False-False-False-2-False-True]
1.60s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-False-True]
1.60s call     tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-False]
1.59s call     tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-2-True-True]
1.59s call     tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-1-True-True]
1.59s call     tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-False-False]
1.58s call     tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-False]

inisis avatar May 10 '25 03:05 inisis

Thanks for sharing these detailed benchmarks, @inisis. The performance improvement with onnxslim 0.1.52 is quite impressive - cutting export times by around 70% compared to 0.1.51. While simplify=False is even faster, the simplification process provides important optimizations that make the exported models more efficient during inference.

This PR makes sense to enable simplification for dynamic models as well, giving users the best of both worlds - dynamic input sizes and optimized model structure. We'll update our CI environment with the latest onnxslim version in a future maintenance update.

glenn-jocher avatar May 10 '25 04:05 glenn-jocher

@glenn-jocher There is a good news that onnxslim will be merged into optimum very soon

inisis avatar May 16 '25 14:05 inisis

@inisis updated to onnxslim 0.1.53 and checked passing tests, merging now!

glenn-jocher avatar May 16 '25 14:05 glenn-jocher

@inisis oh no, it would be much better to continue directly as a standalone onnxslim package as right now you have almost no dependencies so the package is much more attractive than a huge package weighed down with extra dependencies the users don't want installed in their environments.

glenn-jocher avatar May 16 '25 14:05 glenn-jocher

@inisis oh no, it would be much better to continue directly as a standalone onnxslim package as right now you have almost no dependencies so the package is much more attractive than a huge package weighed down with extra dependencies the users don't want installed in their environments.

no, onnxslim will still be a standalone project, and will be merged in the way ultralytics once did.

inisis avatar May 16 '25 14:05 inisis

Oh I see, that is good news then :)

Users are seeing the value in the package!

glenn-jocher avatar May 16 '25 14:05 glenn-jocher

πŸŽ‰ Fantastic work, @inisis, with valuable contributions from @glenn-jocher and @Y-T-G! Your updates to ONNX export reliability and test clarity are a major step forward for the Ultralytics ecosystem. As Albert Einstein once said, "Strive not to be a success, but rather to be of value." Your efforts truly exemplify this, bringing greater stability, transparency, and ease of maintenance to our export and testing workflows. Thank you for your dedication to making Ultralytics HUB and YOLO11 even stronger! πŸš€

UltralyticsAssistant avatar May 16 '25 14:05 UltralyticsAssistant