whisper: CrisperWhisper results in grpc: error while marshaling: string field contains invalid UTF-8
LocalAI version: localai/localai:v2.26.0-aio-gpu-nvidia-cuda-12
Environment, CPU architecture, OS, and Version:
uname -a
Linux gpu2 6.13.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 13 Mar 2025 18:12:00 +0000 x86_64 GNU/Linux
Describe the bug
Using https://huggingface.co/nyrahealth/CrisperWhisper with local-ai results in
Whisper-Error: 500 - {"error":{"code":500,"message":"rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8","type":""}}
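A note on the error itself (my illustration, not taken from LocalAI's code): protobuf/gRPC requires `string` fields to contain valid UTF-8, so if the whisper backend puts raw token bytes that do not decode as UTF-8 into the transcription response, marshaling fails with exactly this message. A minimal Python sketch of the failure mode, using a hypothetical invalid byte sequence:

```python
# protobuf/gRPC string fields must hold valid UTF-8; a backend that
# copies undecodable token bytes into a string field cannot marshal
# the response. Demonstrate with an invalid sequence:
raw = b"Hallo \xe2\x28\xa1 Welt"  # hypothetical invalid-UTF-8 token bytes

try:
    raw.decode("utf-8")
    valid = True
except UnicodeDecodeError:
    valid = False

print(valid)  # False: the byte sequence is rejected as UTF-8

# A lossy client-side workaround replaces undecodable bytes with U+FFFD:
print(raw.decode("utf-8", errors="replace"))
```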
To Reproduce
- create directories and install dependencies
mkdir CrisperWhisper
mkdir CrisperWhisper-out
pip install huggingface_hub torch numpy transformers
git clone https://github.com/openai/whisper
git clone https://github.com/ggerganov/whisper.cpp
- Download the model
from huggingface_hub import snapshot_download, login
HUGGINGFACE_TOKEN = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
login(token=HUGGINGFACE_TOKEN)
model_id = "nyrahealth/CrisperWhisper" # Replace with the ID of the model you want to download
snapshot_download(repo_id=model_id, local_dir="CrisperWhisper")
- convert model to single file ggml
python whisper.cpp/models/convert-h5-to-ggml.py CrisperWhisper/ whisper/ CrisperWhisper-out/
move the ggml file from CrisperWhisper-out/ to your local-ai model path.
Expected behavior
Transcription succeeds without errors.
Logs
7:53AM INF BackendLoader starting backend=whisper modelID=CrisperWhisper.bin o.model=CrisperWhisper.bin
7:54AM INF Success ip=127.0.0.1 latency="28.449µs" method=GET status=200 url=/readyz
7:55AM INF Success ip=127.0.0.1 latency="14.826µs" method=GET status=200 url=/readyz
7:56AM ERR Server error error="rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8" ip=172.17.0.1 latency=2m39.930538614s method=POST status=500 url=/v1/audio/transcriptions
Additional context
Using CPU-based local-ai results in the same error
quay.io/go-skynet/local-ai:v2.26.0-aio-cpu
Linux mb 6.8.0-55-generic #57-Ubuntu SMP PREEMPT_DYNAMIC Wed Feb 12 23:42:21 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Hi - can you please share logs with --debug ?
also, can you try to set a model name without the "."? I don't think it's a problem per se, but the UTF-8 error is unexpected. Can you also share how you are calling the API?
Hi - can you please share logs with --debug ?
11:48AM INF LocalAI API is listening! Please connect to the endpoint for API documentation. endpoint=http://0.0.0.0:8080
11:49AM DBG context local model name not found, setting to the first model first model name=whisper-1
11:49AM DBG guessDefaultsFromFile: not a GGUF file filePath=/build/models/CrisperWhisper.bin
11:49AM DBG Audio file copied to: /tmp/whisper4121787727/test.mp3
11:49AM INF BackendLoader starting backend=whisper modelID=CrisperWhisper.bin o.model=CrisperWhisper.bin
11:49AM DBG Loading model in memory from file: /build/models/CrisperWhisper.bin
11:49AM DBG Loading Model CrisperWhisper.bin with gRPC (file: /build/models/CrisperWhisper.bin) (backend: whisper): {backendString:whisper model:CrisperWhisper.bin modelID:CrisperWhisper.bin assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0005de008 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama2:/build/backend/python/exllama2/run.sh faster-whisper:/build/backend/python/faster-whisper/run.sh kokoro:/build/backend/python/kokoro/run.sh rerankers:/build/backend/python/rerankers/run.sh transformers:/build/backend/python/transformers/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
11:49AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/whisper
11:49AM DBG GRPC Service for CrisperWhisper.bin will be running at: '127.0.0.1:44969'
11:49AM DBG GRPC Service state dir: /tmp/go-processmanager771897440
11:49AM DBG GRPC Service Started
11:49AM DBG Wait for the service to start up
11:49AM DBG Options: ContextSize:512 Seed:1365369429 NBatch:512 MMap:true NGPULayers:99999999 Threads:8
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr 2025/03/19 11:49:18 gRPC Server listening at 127.0.0.1:44969
11:49AM DBG GRPC Service Ready
11:49AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:0xc000336e58} sizeCache:0 unknownFields:[] Model:CrisperWhisper.bin ContextSize:512 Seed:1365369429 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:8 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/build/models/CrisperWhisper.bin Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 LoadFormat: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false ModelPath:/build/models LoraAdapters:[] LoraScales:[] Options:[] CacheTypeKey: CacheTypeValue: GrammarTriggers:[]}
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_from_file_with_params_no_state: loading model from '/build/models/CrisperWhisper.bin'
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: use gpu = 1
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: flash attn = 0
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: gpu_device = 0
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: dtw = 0
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: backends = 1
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: loading model
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_vocab = 51866
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_audio_ctx = 1500
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_audio_state = 1280
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_audio_head = 20
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_audio_layer = 32
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_text_ctx = 448
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_text_state = 1280
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_text_head = 20
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_text_layer = 32
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_mels = 128
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: ftype = 1
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: qntvr = 0
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: type = 5 (large v3)
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: adding 6800 extra tokens
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_langs = 100
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: CPU total size = 3094.36 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: model size = 3094.36 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: kv self size = 83.89 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: kv cross size = 251.66 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: kv pad size = 7.86 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: compute buffer (conv) = 36.13 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: compute buffer (encode) = 212.29 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: compute buffer (cross) = 9.25 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: compute buffer (decode) = 99.10 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_full_with_state: auto-detected language: de (p = 0.999483)
12:23PM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr 2025/03/19 12:23:14 ERROR: [core] [Server #1]grpc: server failed to encode response: rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8
12:23PM ERR Server error error="rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8" ip=10.0.2.100 latency=33m56.389020712s method=POST status=500 url=/v1/audio/transcriptions
also, can you try to set a model name without the "."?
hmm I don't understand this.
curl -s http://localhost:8080/models|jq|grep -i -A 2 -B 2 cris
},
{
"id": "CrisperWhisper.bin",
"object": "model"
},
That's also just the name in the filesystem
ls -lh localai/models/ |grep -i cr
-rw-rw---- 1 markuman markuman 2.9G Mar 17 07:58 CrisperWhisper.bin
Can you also share how are you calling the API?
import requests

baseurl = "http://127.0.0.1:8080"
transcription = '/v1/audio/transcriptions'
testfile = 'test.mp3'

# transcription with whisper
############################
with open(testfile, "rb") as audio_file:
    files = {"file": ("test.mp3", audio_file)}
    data = {"model": "CrisperWhisper.bin"}
    response = requests.post(baseurl + transcription, files=files, data=data)

print(response)
if response.status_code == 200:
    raw = response.json().get('text')
    print(raw)
else:
    print(f"Whisper-Error: {response.status_code} - {response.text}")
I'm experiencing the very same issue. It happens only with the CrisperWhisper model; all other Whisper models I have tried so far work just fine. Any further details I can provide to debug this? I'd love to get CrisperWhisper running...
local-ai Version: v2.28.0
For reproducing the issue faster (w/o model conversion): ggml model here: https://huggingface.co/nyrahealth/CrisperWhisper/commit/0c039779bd37fc1fdd2bbaccaa02dbda7aac37d5#d2h-238772
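In case it helps to reproduce without the Python client, a minimal curl call against the same endpoint (host, file name, and model name are assumptions matching the thread above):

```shell
# Hypothetical reproduction sketch: POST an audio file to LocalAI's
# OpenAI-compatible transcription endpoint as multipart form data;
# adjust host, file, and model name to your setup.
curl -s http://localhost:8080/v1/audio/transcriptions \
  -F file=@test.mp3 \
  -F model=CrisperWhisper.bin
```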
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.