After running for more than ten minutes, the GPU memory is full.

Open usefordev opened this issue 1 month ago • 1 comment

When I run the command below:

python whisperlivekit/basic_server.py --host 0.0.0.0 --port 8001 --model medium --model-path /root/.cache/whisper/medium.pt --backend whisper --backend-policy simulstreaming --language zh

After running for more than ten minutes, the GPU memory is full. Has anyone encountered the same problem and found a solution?
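To put numbers on the growth, a minimal sketch like the one below could be run alongside the server. It is not part of WhisperLiveKit; the helper names (`parse_used_mib`, `query_used_mib`, `growth_mib_per_min`, `watch`) are hypothetical, and it assumes `nvidia-smi` is on the PATH and that the first listed GPU is the one the server uses.

```python
import subprocess
import time


def parse_used_mib(csv_text):
    """Parse `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`
    output: one integer (MiB) per GPU, one GPU per line."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def query_used_mib():
    """Ask nvidia-smi for the memory currently used on each GPU (MiB)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_used_mib(result.stdout)


def growth_mib_per_min(samples):
    """Average growth between the first and last (seconds, used_mib) sample."""
    if len(samples) < 2:
        return 0.0
    (t0, m0), (t1, m1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (m1 - m0) / ((t1 - t0) / 60.0)


def watch(interval_s=10, duration_s=600, gpu_index=0):
    """Sample GPU memory every `interval_s` seconds for `duration_s` seconds,
    printing each reading and the average growth rate so far."""
    samples = []
    start = time.monotonic()
    while time.monotonic() - start < duration_s:
        used = query_used_mib()[gpu_index]
        samples.append((time.monotonic() - start, used))
        rate = growth_mib_per_min(samples)
        print(f"{samples[-1][0]:7.1f}s  {used} MiB  ({rate:+.1f} MiB/min)")
        time.sleep(interval_s)
    return samples
```

Calling `watch()` in a second terminal while audio is streaming to the server should show whether usage climbs steadily or only jumps at particular moments, which may help narrow down where the memory is going.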

Nov 17 '25 02:11 usefordev

I am experiencing a similar issue, but it manifests within 2 minutes for me. Have you gotten anywhere with this?

Nov 25 '25 11:11 AeneasChristodoulou