refactor: use simplify when dynamic is true
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Minor test improvements for ONNX export and simplification logic. 🧪✨
📊 Key Changes
- Removed a test condition that skipped ONNX export tests when both `simplify` and `dynamic` were enabled.
- Updated the ONNX export test to always use `simplify=True`.
🎯 Purpose & Impact
- Ensures more thorough testing of ONNX export with model simplification, even in dynamic scenarios.
- Helps catch potential issues earlier, leading to more robust export functionality for users.
- No changes to user-facing features; these are internal test improvements.
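As a sketch of what the described change amounts to (hypothetical code, not the actual Ultralytics test suite), the old skip guard is dropped and `simplify` is pinned to `True` across the parameter matrix:

```python
import itertools

# Hypothetical reconstruction of the test-matrix change described above:
# previously, combinations where both simplify and dynamic were enabled
# were skipped; after this PR every combination runs with simplify=True.
def build_export_cases(tasks=("detect", "segment", "pose", "obb"), batches=(1, 2)):
    cases = []
    for task, batch, dynamic, half in itertools.product(tasks, batches, (True, False), (True, False)):
        # Old guard (removed): if simplify and dynamic: skip
        cases.append({"task": task, "batch": batch, "dynamic": dynamic, "half": half, "simplify": True})
    return cases
```

With four tasks, two batch sizes, and two boolean axes, this yields 32 cases, all of which now exercise simplification.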
👋 Hello @inisis, thank you for submitting a PR to ultralytics/ultralytics 🚀! This is an automated response to help guide your contribution. An Ultralytics engineer will review and assist you soon.
Please review this checklist to ensure your PR is ready for integration:
- ✅ Define a Purpose: Make sure your PR description clearly explains the goal of your changes and links to any relevant issues. Please keep your commit messages clear and consistent.
- ✅ Synchronize with Source: Verify your branch is up to date with the `main` branch of `ultralytics/ultralytics`. If needed, use the 'Update branch' button or run `git pull` and `git merge main` locally.
- ✅ Ensure CI Checks Pass: Confirm all Ultralytics Continuous Integration (CI) checks are passing. Address any failing checks as necessary.
- ✅ Update Documentation: If your changes affect documentation, remember to update the docs accordingly.
- ✅ Add Tests: For new or updated features, ensure you have provided or updated relevant tests and that all tests pass.
- ✅ Sign the CLA: If this is your first PR to Ultralytics, please sign our Contributor License Agreement (CLA) by commenting "I have read the CLA Document and I sign the CLA" below.
- ✅ Minimize Changes: Limit your PR to the minimum necessary for your update or fix. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." – Bruce Lee
For additional details, please see our Contributing Guide. If you have any questions, feel free to ask here. Thank you for helping improve Ultralytics! 🚀🧪
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 74.12%. Comparing base (dd798f8) to head (0cfa1be). Report is 1 commit behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #20569 +/- ##
==========================================
+ Coverage 74.06% 74.12% +0.06%
==========================================
Files 144 144
Lines 19230 19230
==========================================
+ Hits 14242 14254 +12
+ Misses 4988 4976 -12
| Flag | Coverage Δ | |
|---|---|---|
| Benchmarks | 31.18% <100.00%> (+0.06%) | :arrow_up: |
| GPU | 36.69% <0.00%> (ø) | |
| Tests | 68.20% <100.00%> (+0.03%) | :arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
@glenn-jocher @Y-T-G It's a little odd that CI still uses onnxslim 0.1.51; it seems the environment has been cached, because 0.1.52 was already released.
PyTorch: starting from 'yolo11n-seg.pt' with input shape (1, 3, 32, 32) BCHW and output shape(s) ((1, 21, 38), (1, 32, 8, 8)) (5.9 MB)
ONNX: starting export with onnx 1.17.0 opset 19...
ONNX: slimming with onnxslim 0.1.51...
ONNX: export success ✅ 52.6s, saved as 'yolo11n-seg.onnx' (11.1 MB)
Yes, it's cached
0.1.51
============================= slowest 30 durations =============================
74.42s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-True]
72.74s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-True]
51.65s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-True]
51.20s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-True]
49.90s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-False]
49.51s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-False]
37.55s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-True]
37.40s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-True]
36.41s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-True]
35.91s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-False]
35.79s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-False]
35.70s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-False]
35.65s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-True-True]
35.53s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-False]
35.00s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-False]
34.88s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-True-False]
27.53s call tests/test_cuda.py::test_train
8.38s call tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-2-True-True]
5.95s call tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-1-True-True]
4.69s call tests/test_cuda.py::test_amp
4.19s call tests/test_cuda.py::test_predict_sam
4.18s call tests/test_cuda.py::test_autobatch
3.90s call tests/test_cuda.py::test_export_onnx_matrix[pose-False-False-False-2-True-True]
3.65s call tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-2-True-True]
3.61s call tests/test_cuda.py::test_export_onnx_matrix[classify-True-False-False-1-True-False]
3.35s call tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-1-True-True]
3.24s call tests/test_cuda.py::test_export_onnx_matrix[pose-False-False-False-1-True-True]
3.17s call tests/test_cuda.py::test_export_onnx_matrix[detect-False-False-False-2-True-True]
3.11s call tests/test_cuda.py::test_export_onnx_matrix[detect-False-False-False-1-True-True]
2.40s call tests/test_cuda.py::test_export_onnx_matrix[classify-True-False-False-2-True-False]
======================== 77 passed in 865.51s (0:14:25) ========================
0.1.52
============================= slowest 30 durations =============================
40.68s call tests/test_cuda.py::test_train
20.92s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-False-True]
20.55s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-True]
19.24s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-True]
18.85s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-False-True]
13.55s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-True]
13.44s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-False-True]
13.37s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-True]
13.11s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-False-True]
12.60s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-False]
12.40s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-False-False]
12.21s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-False-False]
12.14s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-False]
10.50s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-True]
10.41s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-True]
10.20s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-False-True]
9.84s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-False-True]
9.76s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-False-True]
9.49s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-True]
9.35s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-False-False]
9.19s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-False-False]
9.04s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-False]
8.96s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-False]
8.92s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-False]
8.81s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-False-True]
8.74s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-True-True]
8.70s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-False]
8.66s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-False-False]
8.63s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-False-False]
8.39s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-1-True-False]
simplify=False
============================= slowest 30 durations =============================
40.09s call tests/test_cuda.py::test_train
6.47s call tests/test_cuda.py::test_predict_sam
4.56s call tests/test_cuda.py::test_amp
3.56s call tests/test_cuda.py::test_autobatch
3.34s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-True]
2.19s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-True]
2.18s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-True-True]
2.07s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-True]
2.05s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-False-True]
2.00s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-True-True]
1.99s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-1-False-True]
1.88s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-False-True]
1.82s call tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-1-False-True]
1.74s call tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-2-True-False]
1.73s call tests/test_cuda.py::test_export_onnx_matrix[classify-True-False-False-1-True-False]
1.73s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-False]
1.70s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-True-True]
1.70s call tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-2-True-True]
1.69s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-2-False-True]
1.68s call tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-2-False-True]
1.67s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-True-True]
1.65s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-False-True]
1.64s call tests/test_cuda.py::test_export_onnx_matrix[detect-True-False-False-2-False-True]
1.61s call tests/test_cuda.py::test_export_onnx_matrix[pose-False-False-False-2-False-True]
1.60s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-False-True]
1.60s call tests/test_cuda.py::test_export_onnx_matrix[pose-True-False-False-1-True-False]
1.59s call tests/test_cuda.py::test_export_onnx_matrix[segment-False-False-False-2-True-True]
1.59s call tests/test_cuda.py::test_export_onnx_matrix[obb-False-False-False-1-True-True]
1.59s call tests/test_cuda.py::test_export_onnx_matrix[segment-True-False-False-2-False-False]
1.58s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-False]
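The duration dumps above can be compared programmatically. Here is a small helper (hypothetical, not part of the PR or the Ultralytics codebase) that sums the export-test times from a pytest "slowest durations" dump:

```python
import re

def total_export_seconds(durations_log: str) -> float:
    """Sum per-test durations for test_export_onnx_matrix entries in a
    pytest 'slowest durations' dump."""
    pattern = re.compile(r"([\d.]+)s call \S*test_export_onnx_matrix")
    return sum(float(m.group(1)) for m in pattern.finditer(durations_log))

# Example on the two slowest lines from the onnxslim 0.1.51 run above:
sample = (
    "74.42s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-2-True-True]\n"
    "72.74s call tests/test_cuda.py::test_export_onnx_matrix[obb-True-False-False-1-True-True]\n"
)
print(round(total_export_seconds(sample), 2))  # 147.16
```

Running it over each full dump makes the roughly 70% reduction from 0.1.51 to 0.1.52 easy to quantify.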
Thanks for sharing these detailed benchmarks, @inisis. The performance improvement with onnxslim 0.1.52 is quite impressive - cutting export times by around 70% compared to 0.1.51. While simplify=False is even faster, the simplification process provides important optimizations that make the exported models more efficient during inference.
This PR makes sense to enable simplification for dynamic models as well, giving users the best of both worlds - dynamic input sizes and optimized model structure. We'll update our CI environment with the latest onnxslim version in a future maintenance update.
@glenn-jocher There is good news: onnxslim will be merged into optimum very soon.
@inisis updated to onnxslim 0.1.53 and checked passing tests, merging now!
@inisis oh no, it would be much better to continue directly as a standalone onnxslim package. Right now you have almost no dependencies, so the package is much more attractive than a huge package weighed down with extra dependencies that users don't want installed in their environments.
No, onnxslim will still be a standalone project; it will be merged in the way ultralytics once was.
Oh I see, that is good news then :)
Users are seeing the value in the package!
Fantastic work, @inisis, with valuable contributions from @glenn-jocher and @Y-T-G! Your updates to ONNX export reliability and test clarity are a major step forward for the Ultralytics ecosystem. As Albert Einstein once said, "Strive not to be a success, but rather to be of value." Your efforts truly exemplify this, bringing greater stability, transparency, and ease of maintenance to our export and testing workflows. Thank you for your dedication to making Ultralytics HUB and YOLO11 even stronger!