spox Improve performance of list-attribute creation

Currently, attributes are built (i.e. passed to onnx.helper.make_attribute) 3-4 times, on:

Validation when initializating
Running type inference (1.) and possibly value propagation (2.)
Building

For very large attributes (like constant parameters, decision trees, label encoder maps, ...) this becomes significant. If we came up with a way to do this only once or twice, we'd be running this part ~2x faster.

Part of the issue is also due to the upstream onnx.helper.make_attribute being rather inefficient, which I can try to upstream a fix for. Based on early experiments it could be improved by as much as 3x.

Feb 27 '23 13:02 JakubBachurskiQC

Though we could squeeze out much more, I think that https://github.com/onnx/onnx/pull/5063 should be enough. It achieves a 5-8x improvement in the common case (list[int] | list[float]).

Mar 28 '23 22:03 jbachurski

Thanks for your work on this topic, @jbachurski. It currently takes roughly 5 minutes to convert a Python-based ratebook to ONNX. Most of the time is spent creating list attributes (we store several lookup tables with 100,000+ entries).

Mar 30 '23 06:03 JanJaguschQC

When we get https://github.com/onnx/onnx/pull/5220 in a new ONNX release, we can set the new attr_type argument in make_attribute to avoid the attribute type inference code altogether. This would further improve performance. The possibility of caching the AttributeProto while the node is getting constructed (checking + type inference) still stands.

Jun 10 '23 16:06 jbachurski

@aivanoved Is this the issue you were referring to last week?

Oct 05 '23 15:10 cbourjau

I really can't pinpoint where the issue comes from at this stage, what I can do confirm is that for not so large models the model creation time including creation of the spox objects, the model and the build of the graph takes what seems to be wuite a long time

Oct 09 '23 12:10 aivanoved

The main slow-down was solved upstream with the attr_type attribute to make_attribute. I don't think there is any need for further improvements here at this point.

Jun 18 '24 12:06 cbourjau