WhisperFusion
Uncaught TypeError in audio-processor.js
Getting this error. I'm trying to view the page on Windows while running WhisperFusion in WSL. Everything else seems OK.
Uncaught TypeError: Cannot read properties of undefined (reading 'set')
    at AudioStreamProcessor.process (audio-processor.js:24:23)
What browser are you using? Any chance you can try with Chrome?
Thanks for the quick response. Yes, I'm using Chrome currently.
If it helps, I have the same issue. Win11 Pro; OS: Win32; Browser: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36
For a quick fix you can apply, see https://github.com/collabora/WhisperFusion/issues/17#issuecomment-1918170375
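For context, that error means line 24 ends up calling .set() on or with a channel buffer that was never delivered for that render quantum. Below is a hypothetical sketch of the kind of guard that avoids it; the class shape and buffer handling are assumptions, and the actual patch linked in issue #17 may look different.

// Hypothetical defensive version of audio-processor.js; names are illustrative,
// not the project's actual code.
class AudioStreamProcessor extends AudioWorkletProcessor {
  process(inputs) {
    const channel = inputs[0] && inputs[0][0];
    // Some browser/OS combinations deliver render quanta with no channel data;
    // bailing out here keeps the .set call below from ever touching an
    // undefined buffer.
    if (!channel || channel.length === 0) {
      return true; // keep the worklet alive and wait for real audio
    }
    // Copy the frame before posting it, since the engine reuses the underlying
    // input buffer between render quanta.
    const frame = new Float32Array(channel.length);
    frame.set(channel);
    this.port.postMessage(frame, [frame.buffer]);
    return true;
  }
}
registerProcessor('audio-stream-processor', AudioStreamProcessor);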
Also, since it was mentioned in the other thread, here is my server output in case it's helpful:
==========
== CUDA ==
==========
CUDA Version 12.2.2
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
done loading
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
INFO:root:[LLM] loaded: True
|██████████████--------------------| 67.76% [103/152 00:00<00:00]
INFO:websockets.server:connection open
|████████████████████████| 100.00% [152/152 00:01<00:00]
INFO:websockets.server:connection open
downloading ONNX model...
loading session
loading onnx model
reset states
INFO:root:New client connected
The output looks good and the progress bar you see is the warmup stage of WhisperSpeech. Let me know if the workaround that I posted works for you. We are currently setting up a Windows machine to replicate the issue.
Here's a video with that workaround. It seems to be working, except I don't hear any text-to-speech (probably not surprising considering the bug I worked around). But still... progress!
https://github.com/collabora/WhisperFusion/assets/703106/f8ac38dd-19c8-4f47-9b10-5df7587de39d
I can see in the console that the connection to port 8888 failed, which is the websocket port that we use to send the audio. Is that port forwarded from the Docker container? Also, it takes some time until the service starts; there is a warmup phase, so depending on the hardware it can take 30 seconds or more until the system is actually running.
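If it's unclear whether the container's port 8888 is reachable from the Windows side, here is a quick hypothetical check you can paste into the browser's devtools console (it assumes the demo connects to the same host on port 8888, as described above; the demo itself may build the URL differently):

// Assumed host/port taken from this thread.
const ws = new WebSocket(`ws://${window.location.hostname}:8888`);
ws.onopen = () => console.log('port 8888 reachable');
ws.onerror = () => console.log('port 8888 unreachable; check the docker -p 8888:8888 mapping and WSL forwarding');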
Thanks! I went ahead and tried again, restarting everything. This time it did work. The audio was a bit more delayed than in your video, but it did all work. So presumably in my last test I either just needed to wait for the warmup, or the 8888 port forwarding somehow didn't work.
What GPU do you use?
I'm on a 4090 and so was expecting it to be pretty snappy. As far as I can tell the GPU is being utilized, but it's definitely not as responsive as in your video.
We are going to add latency outputs for each step to the demo tomorrow, so we can compare more easily. The video was recorded on a 4090, so you should see very similar results.
Nice! Happy to help test those latency numbers when they're in.
I am using the same GPU on Windows. Did you use this command to run the Docker container?
docker run --gpus all --shm-size 64G -p 6006:6006 -p 8888:8888 -it ghcr.io/collabora/whisperfusion-3090:latest
Did you get everything running eventually?