whisper.cpp

Total Memory Use

Open • danhalliday opened this issue 1 year ago • 0 comments

First of all, this is a wonderful project and lots of fun to play with — thanks for all the hard work that went into it!

I’m wondering what determines total memory use, particularly on iPhones. At the moment this implementation works well on iOS, as seen in the Objective-C example. But the memory use (over 500 MB for the base model) is on the high side for anything but professional apps (i.e. apps like Photoshop, which are more likely to be running on an iPad anyway, where users work in them for long stretches on specific tasks and are more forgiving if the system terminates all their other apps).

At a high level, what are the constraints on total memory usage? Is it basically a fixed quantity tied to the work the encoder has to do? Is there any prospect of it coming down much in future, through quantisation or other techniques? Would future use of GPUs (or perhaps even the Apple Neural Engine) reduce the memory requirement, or would that only speed up processing? I’m really just trying to get a rough idea of what levers exist to be pulled, if any.

Thanks again!

danhalliday, Dec 14 '22 16:12