coremltools
Possible to use spec to break large model into smaller model?
❓Question
I have a large converted model that I would like to run on a phone. The model takes up too much memory right now. So at the expense of some performance, I want to break the model up into its layers and load them individually.
So if I have the spec of the .mlpackage, can I retrieve individual layers and save them piece by piece? I'm not working with a pipeline, unfortunately.
There is no good or easy way to do this. It's possible to get the spec from your MLModel, then edit that protobuf to split the model into multiple models. However, that is going to be cumbersome. It will probably be significantly easier to split your model into multiple models (using TensorFlow or PyTorch) and then convert each of those models separately to Core ML.
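To make that concrete, here is a minimal sketch of the split-and-convert route, assuming a toy PyTorch model that can be cut cleanly into two sequential halves; the layer sizes, input shapes, and file names are all illustrative, not taken from this thread:

```python
import torch
import coremltools as ct

# Toy stand-in for a large model that has two natural halves.
first_half = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1), torch.nn.ReLU()
).eval()
second_half = torch.nn.Sequential(torch.nn.Conv2d(16, 3, 3, padding=1)).eval()

example_in = torch.rand(1, 3, 256, 256)

# Trace and convert the first half on its own.
traced_first = torch.jit.trace(first_half, example_in)
ct.convert(
    traced_first,
    inputs=[ct.TensorType(name="x", shape=tuple(example_in.shape))],
    convert_to="mlprogram",
).save("first_half.mlpackage")

# The first half's output shape becomes the second half's input shape.
intermediate = first_half(example_in)
traced_second = torch.jit.trace(second_half, intermediate)
ct.convert(
    traced_second,
    inputs=[ct.TensorType(name="h", shape=tuple(intermediate.shape))],
    convert_to="mlprogram",
).save("second_half.mlpackage")
```

On device you would then load each .mlpackage only when it is needed and feed the first model's output into the second, trading some latency for a smaller peak memory footprint.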
You said your model is using too much memory; have you considered compressing your weights?
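For reference, a sketch of post-training weight palettization on an .mlpackage. This uses the coremltools.optimize.coreml API (coremltools 7.0 and later; earlier releases expose similar helpers under coremltools.compression_utils), and the file names and bit width are illustrative:

```python
import coremltools as ct
import coremltools.optimize.coreml as cto

mlmodel = ct.models.MLModel("model.mlpackage")  # placeholder path

# 6-bit k-means palettization applied to all eligible weights.
config = cto.OptimizationConfig(
    global_config=cto.OpPalettizerConfig(mode="kmeans", nbits=6)
)
compressed = cto.palettize_weights(mlmodel, config=config)
compressed.save("model_palettized.mlpackage")
```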
You know, I did compress the weights, but everything still balloons to 8 GB. So it's more the operations going on inside the model, building up a ton of tensors. So yeah, looks like I'll need to break up the model.
Unless, is there a way to automatically convert every float32 to float16? The weights seem to already be stored as float16, but is there a way for the operations themselves to happen in float16?
Is your model of type neural network or mlprogram?
For the neural network model type, compute on both the GPU and the Neural Engine happens in FP16 only.
For the mlprogram model type, every tensor is converted from FP32 to FP16 on the default path while the model is being converted.
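As an illustration of that default path, here is a small sketch of controlling compute precision at conversion time; the toy model and shapes are placeholders, and FLOAT16 is already the default for mlprogram, so passing it explicitly is only for emphasis:

```python
import torch
import coremltools as ct

# A tiny stand-in model, just to demonstrate the conversion flag.
toy = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
traced = torch.jit.trace(toy, torch.rand(1, 3, 64, 64))

# FP16 compute precision is the default for mlprogram conversions.
mlmodel_fp16 = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(1, 3, 64, 64))],
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
)

# To keep every intermediate tensor in FP32 instead:
mlmodel_fp32 = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(1, 3, 64, 64))],
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT32,
)
```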
Thanks @DawerG! It's an mlprogram.
Thanks y'all, looks like breaking up the model is the way to go. Appreciate the insight!
Would you mind sharing how you went about this, @MatthewWaller?
@Lukas1h hey sure thing! I'm still working on breaking it down to the point that I'm able to use the ANE, but I intend to have a massive blog post with code examples once we're ready to go.
Thanks! That would be awesome!
Side note, does your implementation of SD on iOS run on devices with 3 GB of memory?
@Lukas1h it does! It's not the fastest yet, but it sure is efficient
That's awesome news. Looking forward to the blog post!