
Crash on iPhone when Using CoreML

Open leohuang2013 opened this issue 1 year ago • 1 comments

I followed the instructions in the README to convert the CoreML model, then tested on macOS, where it works perfectly.

Then I copied the model into the SwiftUI example project, added the WHISPER_USE_COREML preprocessor definition, and added the CoreML source code. When I compile and run on the device, it crashes with the error: Failure Reason: Message from debugger: Terminated due to memory issue

Debugger output:

whisper_init_from_file_no_state: loading model from '/private/var/containers/Bundle/Application/50F8C5E6-0550-4A36-AA0F-681BAE0531E6/whisper.swiftui.app/models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 2
whisper_model_load: mem required  =  218.00 MB (+    6.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx     =  140.60 MB
whisper_model_load: model size    =  140.54 MB
whisper_init_state: kv self size  =    5.25 MB
whisper_init_state: kv cross size =   17.58 MB
whisper_init_state: loading Core ML model from '/private/var/containers/Bundle/Application/50F8C5E6-0550-4A36-AA0F-681BAE0531E6/whisper.swiftui.app/models/ggml-base.en-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
2023-04-17 09:14:47.251839+0800 whisper.swiftui[28767:2090946] Metal API Validation Enabled

If I run this project on macOS, it works.

leohuang2013, Apr 17 '23 01:04

For the "small" model, any device with less memory than an iPhone 12 will likely crash. The iPhone 12 is marginal, and it helps to close all other apps.

bjnortier, Apr 17 '23 13:04

Thanks @bjnortier for the quick reply. I previously used the code from commit 09e90680072d8ecdf02eaf21c393218385d2c616.

That version works perfectly on the same iPhone. Does this mean memory usage has increased significantly since that commit? Is it possible we use same level of memory for CoreML?

leohuang2013, Apr 18 '23 02:04

When you load a CoreML model it is optimised on the device, hence the "first run on a device may take a while ..." output. Afaik this is an internal operation that cannot be pre-computed (e.g. it cannot be optimised on another iPhone and then copied over).

This process requires a lot of memory. So if you compile with CoreML, the first time the model loads it will consume a lot of memory and might crash, whereas before it wouldn't on the same iPhone.
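As a rough illustration of the point above, an app could check the device's RAM before deciding to take the CoreML path. This is only a sketch: `shouldUseCoreML` and the 3500 MiB threshold are hypothetical and would need tuning per model size; neither comes from whisper.cpp. `ProcessInfo.processInfo.physicalMemory` is the standard Foundation property, and it typically reports somewhat less than the nominal RAM of the device.

```swift
import Foundation

/// Hypothetical helper (not part of whisper.cpp): returns true when the
/// device has at least `minimumBytes` of physical RAM.
func shouldUseCoreML(
    physicalMemory: UInt64 = ProcessInfo.processInfo.physicalMemory,
    minimumBytes: UInt64
) -> Bool {
    return physicalMemory >= minimumBytes
}

// Assumed threshold for illustration only; tune it for the model you ship.
let assumedMinimum: UInt64 = 3500 * 1024 * 1024

let useCoreML = shouldUseCoreML(minimumBytes: assumedMinimum)
print("Attempt CoreML encoder: \(useCoreML)")
```

How the app acts on the result (for example, shipping only the ggml model to smaller devices) is left open here.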

I don't understand the question "Is it possible we use same level of memory for CoreML?"

bjnortier, Apr 18 '23 08:04

"When you load a CoreML model it is optimised on the device": is the optimized model saved to local storage, or is it kept only in memory? If it is the latter, then the optimization will run again every time I restart the app.

"Is it possible we use same level of memory for CoreML?" What I mean is: if normal memory usage for loading the whisper ggml model is 300+ MB, can CoreML model loading also stay at around 300+ MB?

If that is impossible, what is the approximate memory usage for CoreML model loading/optimization?
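One way to get a number for this empirically is to log the app's memory footprint before and after model loading. The sketch below reads `phys_footprint` through the Mach `task_info` call with the `TASK_VM_INFO` flavor; `memoryFootprintBytes` is a hypothetical helper name, and the value is a snapshot at the moment of the call, so it will not capture a peak that occurs in the middle of the load.

```swift
import Foundation
#if canImport(Darwin)
import Darwin
#endif

/// Hypothetical helper: current memory footprint of this process in bytes,
/// or nil if the query fails or the platform is not Darwin-based.
func memoryFootprintBytes() -> UInt64? {
    #if canImport(Darwin)
    var info = task_vm_info_data_t()
    // task_info expects the buffer size in units of integer_t.
    var count = mach_msg_type_number_t(
        MemoryLayout<task_vm_info_data_t>.size / MemoryLayout<integer_t>.size)
    let result = withUnsafeMutablePointer(to: &info) { infoPtr in
        infoPtr.withMemoryRebound(to: integer_t.self, capacity: Int(count)) {
            task_info(mach_task_self_, task_flavor_t(TASK_VM_INFO), $0, &count)
        }
    }
    return result == KERN_SUCCESS ? UInt64(info.phys_footprint) : nil
    #else
    return nil
    #endif
}

// Usage: take one snapshot before and one after the model-loading call.
let before = memoryFootprintBytes()
// ... load the whisper model here ...
let after = memoryFootprintBytes()
if let b = before, let a = after {
    let deltaMB = (Double(a) - Double(b)) / (1024 * 1024)
    print("Footprint change during load: \(deltaMB) MB")
}
```

For the peak during the first-run optimization, the memory gauge in Xcode's debug navigator or Instruments would be a better fit than two snapshots.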

leohuang2013, Apr 20 '23 00:04