coremltools
Possible to use spec to break large model into smaller model?
❓Question
I have a large converted model that I would like to run on a phone. The model takes up too much memory right now. So at the expense of some performance, I want to break the model up into its layers and load them individually.
So if I have the spec of the .mlpackage, can I retrieve individual layers and save them piece by piece? I'm not working with a pipeline, unfortunately.
There is no good or easy way to do this. It's possible to get the spec from your MLModel, then edit that protobuf to split the model into multiple models. However, that is going to be cumbersome. It will probably be significantly easier to split your model into multiple models (using TensorFlow or PyTorch) and then convert each of those models separately to Core ML.
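To make that concrete, here is a minimal sketch of the split-and-convert route, assuming a toy PyTorch model that can be cut cleanly into two sequential halves; the layer sizes, input shapes, and file names are all illustrative, not taken from this thread:

```python
import torch
import coremltools as ct

# Toy stand-in for a large model that has two natural halves.
first_half = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1), torch.nn.ReLU()
).eval()
second_half = torch.nn.Sequential(torch.nn.Conv2d(16, 3, 3, padding=1)).eval()

example_in = torch.rand(1, 3, 256, 256)

# Trace and convert the first half on its own.
traced_first = torch.jit.trace(first_half, example_in)
ct.convert(
    traced_first,
    inputs=[ct.TensorType(name="x", shape=tuple(example_in.shape))],
    convert_to="mlprogram",
).save("first_half.mlpackage")

# The first half's output shape becomes the second half's input shape.
intermediate = first_half(example_in)
traced_second = torch.jit.trace(second_half, intermediate)
ct.convert(
    traced_second,
    inputs=[ct.TensorType(name="h", shape=tuple(intermediate.shape))],
    convert_to="mlprogram",
).save("second_half.mlpackage")
```

On device you would then load each .mlpackage only when it is needed and feed the first model's output into the second, trading some latency for a smaller peak memory footprint.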
You said your model is using too much memory; have you considered compressing your weights?
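For reference, a sketch of post-training weight palettization on an .mlpackage. This uses the coremltools.optimize.coreml API (coremltools 7.0 and later; earlier releases expose similar helpers under coremltools.compression_utils), and the file names and bit width are illustrative:

```python
import coremltools as ct
import coremltools.optimize.coreml as cto

mlmodel = ct.models.MLModel("model.mlpackage")  # placeholder path

# 6-bit k-means palettization applied to all eligible weights.
config = cto.OptimizationConfig(
    global_config=cto.OpPalettizerConfig(mode="kmeans", nbits=6)
)
compressed = cto.palettize_weights(mlmodel, config=config)
compressed.save("model_palettized.mlpackage")
```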
You know, I did compress the weights, but everything still balloons to 8 GB. So it's more the operations going on inside the model, building up a ton of tensors. So yeah, looks like I'll need to break up the model.
Unless, is there a way to automatically convert every float32 to float16? The weights seem to already be stored as float16, but is there a way for the operations themselves to happen in float16?
Is your model of type neural network or mlprogram?
For the neural network model type, compute on both the GPU and the Neural Engine happens in FP16 only.
For the mlprogram model type, every tensor is converted from FP32 to FP16 on the default path while the model is being converted.
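As an illustration of that default path, here is a small sketch of controlling compute precision at conversion time; the toy model and shapes are placeholders, and FLOAT16 is already the default for mlprogram, so passing it explicitly is only for emphasis:

```python
import torch
import coremltools as ct

# A tiny stand-in model, just to demonstrate the conversion flag.
toy = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
traced = torch.jit.trace(toy, torch.rand(1, 3, 64, 64))

# FP16 compute precision is the default for mlprogram conversions.
mlmodel_fp16 = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(1, 3, 64, 64))],
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
)

# To keep every intermediate tensor in FP32 instead:
mlmodel_fp32 = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(1, 3, 64, 64))],
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT32,
)
```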
Thanks @DawerG! It's an mlprogram.
Thanks y'all, looks like breaking up the model is the way to go. Appreciate the insight!
Would you mind sharing how you went about this, @MatthewWaller?
@Lukas1h hey sure thing! I'm still working on breaking it down to the point that I'm able to use the ANE, but I intend to have a massive blog post with code examples once we're ready to go.
Thanks! That would be awesome!
Side note, does your implementation of SD on iOS run on devices with 3 GB of memory?
@Lukas1h it does! It's not the fastest yet, but it sure is efficient
That's awesome news. Looking forward to the blog post!