whisper.cpp Separation of transformation state and model context

Open sandrohanea opened this issue 1 year ago • 2 comments

Currently, to run multiple transformations in parallel, you need to instantiate multiple whisper_contexts, because the state of the transformation (e.g. mel spectrogram, previous prompts, partial results) is stored on the same context as the model, vocabulary, etc.

Is there a reason I'm missing why these cannot be separated, so that the context and the state are different entities?

The context would be immutable in this case and could be shared by multiple transformations without extra allocation.
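To make the idea concrete, here is a minimal, self-contained sketch of the proposed split. It does not use the whisper.cpp API: `ModelContext`, `TransformState`, `run_transformation` and `run_parallel` are hypothetical names invented for illustration, and the "transformation" is just a vocabulary lookup standing in for real inference.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the immutable part (model weights, vocabulary).
struct ModelContext {
    std::vector<std::string> vocab;
};

// Hypothetical per-transformation state (partial results and similar scratch data).
struct TransformState {
    std::vector<std::string> partial_results;
};

// Reads the shared context through a const reference and writes only to its own state.
void run_transformation(const ModelContext & ctx,
                        TransformState & state,
                        const std::vector<int> & token_ids) {
    for (int id : token_ids) {
        assert(id >= 0 && static_cast<std::size_t>(id) < ctx.vocab.size());
        state.partial_results.push_back(ctx.vocab[static_cast<std::size_t>(id)]);
    }
}

// Runs one transformation per input, each in its own thread, all sharing one context.
std::vector<TransformState> run_parallel(const ModelContext & ctx,
                                         const std::vector<std::vector<int>> & inputs) {
    // All states are allocated before any thread starts, so the vector never reallocates.
    std::vector<TransformState> states(inputs.size());
    std::vector<std::thread> workers;
    workers.reserve(inputs.size());
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        workers.emplace_back([&ctx, &states, &inputs, i] {
            run_transformation(ctx, states[i], inputs[i]);
        });
    }
    for (auto & w : workers) {
        w.join();
    }
    return states;
}
```

Because every thread only reads `ctx` and only writes to its own `TransformState`, no locking is needed and the model data is allocated once, however many transformations run.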

What do you think @ggerganov ?

Feb 05 '23 19:02 sandrohanea

Yes - you are correct. It seems I didn't design the whisper_context in the best possible way.

For now, you can achieve parallel processing using a single whisper_context by hacking whisper.cpp directly and doing something similar to what has been done in whisper_full_parallel():

https://github.com/ggerganov/whisper.cpp/blob/2b85be14d8f3e8887623ed76b26b27dfccad7916/whisper.cpp#L4207

But it would be much better if the C API allowed doing this straight from user code. I will think about adding support for this in the future.
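As a rough illustration of the chunking idea behind a parallel run over one audio buffer, the sketch below divides a sample count into contiguous ranges, one per worker. It is not the code from whisper_full_parallel(): `ChunkRange` and `split_samples` are hypothetical names, and the rule that the last chunk absorbs the remainder is an assumption made for the example.

```cpp
#include <cstddef>
#include <vector>

// Half-open sample range [begin, end) assigned to one worker.
struct ChunkRange {
    std::size_t begin;
    std::size_t end;
};

// Splits n_samples into n_workers contiguous ranges.
// The last range absorbs any remainder left by the integer division.
std::vector<ChunkRange> split_samples(std::size_t n_samples, std::size_t n_workers) {
    std::vector<ChunkRange> ranges;
    if (n_workers == 0) {
        return ranges; // nothing to assign
    }
    const std::size_t per_worker = n_samples / n_workers;
    for (std::size_t i = 0; i < n_workers; ++i) {
        const std::size_t begin = i * per_worker;
        const std::size_t end = (i + 1 == n_workers) ? n_samples : begin + per_worker;
        ranges.push_back({begin, end});
    }
    return ranges;
}
```

Each worker would then process only its own range with its own state, and the per-chunk results would be merged afterwards with their timestamps shifted by the chunk's start offset.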

Feb 08 '23 18:02 ggerganov

I didn't know about whisper_full_parallel().

Thanks for pointing it out, will take a look in the meantime :)

I'm not sure I can do the enhancement myself (it's been a while since I wrote much C++), but I will try this weekend to separate the state from the context.

Feb 08 '23 18:02 sandrohanea
