Salomón Mejía

Results: 3 issues filed by Salomón Mejía

I am using Faster-Whisper with BatchedInferencePipeline as a service, but after the model (systran-large-v3) has been loaded into memory and has performed some transcriptions, I encounter the following error: [json.exception.type_error.302]...
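
For context, here is a minimal sketch of how the batched pipeline described in the issue is typically set up in faster-whisper; the model identifier, device, and audio file name are assumptions for illustration, not details taken from the report:

```python
from faster_whisper import WhisperModel, BatchedInferencePipeline

# Load the CTranslate2 Whisper model once at service start-up
# (model name and device are assumed; the issue uses a local large-v3 checkpoint).
model = WhisperModel("Systran/faster-whisper-large-v3",
                     device="cuda", compute_type="float16")
pipeline = BatchedInferencePipeline(model=model)

# Transcribe one file; `segments` is a lazy generator, so iterate to run it.
segments, info = pipeline.transcribe("audio.wav", batch_size=16)
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```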

### Your current environment

```
vllm serve "model_path" --quantization bitsandbytes \
    --load-format bitsandbytes \
    --dtype half \
    --block-size 32 \
    --max-model-len 10k
```

### 🐛 Describe the bug

I am trying to...

Label: bug
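
For reference, a rough Python-API equivalent of the serve command in this entry, assuming the CLI flags map directly onto vLLM engine arguments ("model_path" is the placeholder from the excerpt, and 10k is expanded to an assumed 10240 tokens):

```python
from vllm import LLM, SamplingParams

# Rough equivalent of the `vllm serve` invocation with bitsandbytes quantization.
# Values mirror the issue excerpt; they are not verified settings.
llm = LLM(
    model="model_path",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    dtype="half",
    block_size=32,
    max_model_len=10240,  # "--max-model-len 10k" expanded to a token count
)

# Smoke-test generation to confirm the model loads and runs.
outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```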