whisper-web
[experimental-webgpu] - Configuring Encoder/Decoder Precision with dtype for Local Models
Hello,
I’m using whisper-web (experimental-webgpu branch) with local models (env.allowLocalModels = true and env.localModelPath = "./models"), and I’m having trouble setting distinct dtype values for encoder_model and decoder_model_merged with a small model.
The error I see:
Uncaught (in promise) Error: Can't create a session. ERROR_CODE: 7, ERROR_MESSAGE: Failed to load model because protobuf parsing failed.
Is there a specific convention for the key names or values when setting dtype for encoder/decoder precision levels (one that matches the model's ONNX file names)?
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "my-whisper-model",
  {
    dtype: {
      encoder_model: "fp32",
      decoder_model_merged: "q4"
    },
    device: "webgpu"
  }
);
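To make the question concrete, here is a minimal sketch of the file names I would expect this dtype map to resolve to. It rests on two assumptions that I have not confirmed for this branch: that each dtype maps to a file-name suffix (fp32 to no suffix, q4 to "_q4", and so on), and that the files live in an onnx/ subfolder of the model directory. The suffix table and the expectedOnnxFiles helper are hypothetical and written only for illustration; they are not part of the library.

```javascript
// Assumed dtype -> file-name suffix convention (not confirmed for this branch).
const ASSUMED_DTYPE_SUFFIX = {
  fp32: "",
  fp16: "_fp16",
  q8: "_quantized",
  q4: "_q4",
};

// Hypothetical helper: lists the ONNX paths a dtype map would point at,
// assuming files live under <localModelPath>/<modelId>/onnx/.
function expectedOnnxFiles(localModelPath, modelId, dtype) {
  // Drop any trailing slash so the joined path has no double separator.
  const base = localModelPath.replace(/\/+$/, "");
  return Object.entries(dtype).map(([name, precision]) => {
    const suffix = ASSUMED_DTYPE_SUFFIX[precision];
    if (suffix === undefined) {
      throw new Error(`Unknown dtype "${precision}" for ${name}`);
    }
    return `${base}/${modelId}/onnx/${name}${suffix}.onnx`;
  });
}

const files = expectedOnnxFiles("./models", "my-whisper-model", {
  encoder_model: "fp32",
  decoder_model_merged: "q4",
});
console.log(files);
```

If that assumption holds, both listed files would need to exist locally. My understanding is that a missing file can surface as a protobuf parsing failure when the dev server answers the request with an HTML page instead of the model, so checking these paths in the network tab seems like a reasonable first step.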