burn
burn copied to clipboard
Constant tensor support in burn-import
Pull Request Template
Checklist
- [ ] Confirmed that
run-checks allscript has been executed. - [ ] Made sure the book is up to date with changes in this PR.
Related Issues/PRs
- #1915
- #1882
- #1560
- #1593
Changes
Fixes 2 issues:
- In older models, Initializers might be used in place of constants, so those inputs need to be renamed to avoid generating invalid code. This either renames all initializers that are not inputs (commented out) or ~~checks for initializers in nodes known to be a problem(currently
AddandMul)~~. EDIT: have to go with option 1 because initializers might be reused, which means tracking name changes - So far, all constants have been scalars or vectors, which aren't checked during scoping. This causes scope checking to panic during the backward pass since those inputs were never registered as a graph input or node output. Right now, the fix is to simply register those inputs as a constant during the pass over inputs
this makes a slight non-fix change to scopes, changing variables value from Vec<TensorVariable> to TensorVariable given there is only one variable pushed to each entry. It also registers outputs and checks inputs in the same iteration, which works given the graph is guaranteed to be a DAG.
Unresolved questions
- Right now the fix for the scope panic is to register any unregistered tensors during the backward pass over inputs. This might be buggy. For example what if the model file is simply invalid and references a variable that simply doesn't exist (~~which would make it through onnx-ir~~)? Should we try to detect and notify the user their model is faulty? EDIT: it would fail during TensorType conversion at the latest)
- the alternative is to track and pass in constants/initializers separately from onnx-ir. This would be a bit more work but might be the option we want to go with if it turns out we can't track constants on the fly during scoping.
- Lastly, how should constants be generated? As module level global statics? Made an attribute of the model struct and declared during
Model::new()? Should we rework constant node type to support Tensors or treat them as a separate beast altogether?
Testing
we need to add a model to onnx test, but generating one for issue one might be hard. Could we just add a legacy folder to throw in older models we can't generate directly?
Codecov Report
Attention: Patch coverage is 34.63415% with 134 lines in your changes missing coverage. Please review.
Project coverage is 85.96%. Comparing base (
c94e743) to head (77552f5). Report is 20 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #2008 +/- ##
==========================================
- Coverage 86.08% 85.96% -0.13%
==========================================
Files 695 695
Lines 89049 89203 +154
==========================================
+ Hits 76656 76680 +24
- Misses 12393 12523 +130
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I think we should make "an attribute of the model struct and declared during Model::new()". Record loading happens at the module level and we plan to support multi/sub graphs as separate standalone models.
Regarding testing. Yeah I wish we could store legacy ONNX files but there are a few issues: 1) License and size (which might slow down testing).
However, we can probably use a quick python script that deletes other nodes. I think this is supported by onnx package. So we could instruct to remove everything else except a portion and randomize weights (so no copying).
This PR has been marked as stale because it has not been updated for over a month
I have reviewed the PR and believe a simpler and more scalable solution would be to extract inputs with initializers for node types such as Sub, Add, etc., as Constants. The current solution in the PR introduces a new concept of constants, which already exists in burn-import under the Constant node type (a working example can be found in the Sub ONNX test: link).
The solution should follow a similar approach to this script that I use to extract Sub/Add initializers to constants: link. It's essentially the reverse process of constant lifting that we use for Conv and other nodes.
Having said that I think it'd be easier to start with a new PR than making changes.