HelenaSak comments


Results: 3 comments by HelenaSak

[Bug]: watchdog thread terminated with exception: CUDA error: an illegal memory access was encountered

Hello! Yes, I ran into the same problem today for some unknown reason, for the first time in six months. I use version 0.6.4.post1 and llama3.3. Here is the error log: model_runner_base.py:120] Writing input...

[Usage]: Throughput and quality issue with vllm 0.6.0.

Hello! For the Llama 3.1 70B AWQ 4-bit model on 1 x A100, version 0.6.0 actually became a little worse. I ran a comparative test using benchmark_throughput.py: Version 0.6.0...

BatchedInferencePipeline degrades transcription quality heavily

> > It's not a given that batched transcription is worse than sequential; in fact, there are multiple reports in the repo that batched is better than sequential [#936...