exporters

Support for smaller quantization, 8-bit or 4-bit at least

Open Proryanator opened this issue 10 months ago • 1 comment

This tool is amazing. I had tried scripting with the coreml library by hand and ran into all kinds of fun issues; then I tried this and everything is orchestrated/abstracted for you. This is excellent 👏

However, I noticed that quantization is only supported down to 16 bits, and I would love to have smaller options. I do believe CoreML is capable of lower-bit quantization, so it may just be a matter of adding that call to this wrapper.

I did look in convert.py, and I see a flag use_legacy_format being checked before the 16-bit quantization is performed. Is there something different about how the ML Program format handles (or performs) lower-bit quantization?
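
For reference, here's a rough sketch (not the actual exporters code) of how the two formats might take different paths for 8-bit weight quantization, assuming coremltools >= 7: the legacy NeuralNetwork (.mlmodel) format goes through quantization_utils, while an ML Program (.mlpackage) would use the newer coremltools.optimize.coreml APIs.

```python
# Sketch only: how 8-bit quantization could differ by format.
# Assumes coremltools >= 7; this is not the exporters implementation.
import numpy as np
from coremltools.models.neural_network import quantization_utils
from coremltools.optimize.coreml import (
    OptimizationConfig,
    OpLinearQuantizerConfig,
    linear_quantize_weights,
)

def quantize_8bit(mlmodel, use_legacy_format: bool):
    if use_legacy_format:
        # Legacy NeuralNetwork path: quantize_weights supports nbits of 16, 8, 4, etc.
        return quantization_utils.quantize_weights(mlmodel, nbits=8)
    # ML Program path: linear 8-bit weight quantization via the optimize APIs
    config = OptimizationConfig(
        global_config=OpLinearQuantizerConfig(mode="linear_symmetric", dtype=np.int8)
    )
    return linear_quantize_weights(mlmodel, config=config)
```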

Proryanator · Mar 28 '24 23:03

I realized that you can still quantize a coreml model after it's been created, so this issue can probably be disregarded. I'll try quantizing some existing coreml models I found.

So having this tool do the conversion and then quantizing further afterwards should work!
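
For example, something along these lines should work on an already-converted model (a rough sketch assuming coremltools >= 7 and an ML Program .mlpackage output; the file names are just placeholders):

```python
# Sketch: post-conversion quantization of an existing Core ML model.
# Assumes coremltools >= 7 and an ML Program (.mlpackage); paths are placeholders.
import numpy as np
import coremltools as ct
from coremltools.optimize.coreml import (
    OptimizationConfig,
    OpLinearQuantizerConfig,
    OpPalettizerConfig,
    linear_quantize_weights,
    palettize_weights,
)

model = ct.models.MLModel("Model.mlpackage")

# 8-bit: linear weight quantization
config_8bit = OptimizationConfig(
    global_config=OpLinearQuantizerConfig(mode="linear_symmetric", dtype=np.int8)
)
model_8bit = linear_quantize_weights(model, config=config_8bit)
model_8bit.save("Model-int8.mlpackage")

# 4-bit: weight palettization (lookup-table compression)
config_4bit = OptimizationConfig(
    global_config=OpPalettizerConfig(mode="kmeans", nbits=4)
)
model_4bit = palettize_weights(model, config=config_4bit)
model_4bit.save("Model-4bit.mlpackage")
```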

Proryanator · Mar 31 '24 00:03