spox icon indicating copy to clipboard operation
spox copied to clipboard

Improve performance of list-attribute creation

Open JakubBachurskiQC opened this issue 2 years ago • 5 comments

Currently, attributes are built (i.e. passed to onnx.helper.make_attribute) 3-4 times, on:

  • Validation when initializating
  • Running type inference (1.) and possibly value propagation (2.)
  • Building

For very large attributes (like constant parameters, decision trees, label encoder maps, ...) this becomes significant. If we came up with a way to do this only once or twice, we'd be running this part ~2x faster.

Part of the issue is also due to the upstream onnx.helper.make_attribute being rather inefficient, which I can try to upstream a fix for. Based on early experiments it could be improved by as much as 3x.

JakubBachurskiQC avatar Feb 27 '23 13:02 JakubBachurskiQC

Though we could squeeze out much more, I think that https://github.com/onnx/onnx/pull/5063 should be enough. It achieves a 5-8x improvement in the common case (list[int] | list[float]).

jbachurski avatar Mar 28 '23 22:03 jbachurski

Thanks for your work on this topic, @jbachurski. It currently takes roughly 5 minutes to convert a Python-based ratebook to ONNX. Most of the time is spent creating list attributes (we store several lookup tables with 100,000+ entries).

JanJaguschQC avatar Mar 30 '23 06:03 JanJaguschQC

When we get https://github.com/onnx/onnx/pull/5220 in a new ONNX release, we can set the new attr_type argument in make_attribute to avoid the attribute type inference code altogether. This would further improve performance. The possibility of caching the AttributeProto while the node is getting constructed (checking + type inference) still stands.

jbachurski avatar Jun 10 '23 16:06 jbachurski

@aivanoved Is this the issue you were referring to last week?

cbourjau avatar Oct 05 '23 15:10 cbourjau

I really can't pinpoint where the issue comes from at this stage, what I can do confirm is that for not so large models the model creation time including creation of the spox objects, the model and the build of the graph takes what seems to be wuite a long time

aivanoved avatar Oct 09 '23 12:10 aivanoved

The main slow-down was solved upstream with the attr_type attribute to make_attribute. I don't think there is any need for further improvements here at this point.

cbourjau avatar Jun 18 '24 12:06 cbourjau