WhisperFusion
Uncaught TypeError in audio-processor.js
Getting this error. I'm trying to view the page on Windows while running WhisperFusion in WSL. Everything else seems OK.
Uncaught TypeError: Cannot read properties of undefined (reading 'set')
    at AudioStreamProcessor.process (audio-processor.js:24:23)
What browser are you using? Any chance you can try with Chrome?
Thanks for the quick response. Yes, I'm using Chrome currently.
If it helps, I have the same issue. Win11 Pro; OS: Win32; Browser: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36
For a quick fix you can apply, see https://github.com/collabora/WhisperFusion/issues/17#issuecomment-1918170375
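For context, that error means line 24 ends up calling .set() on or with a channel buffer that was never delivered for that render quantum. Below is a hypothetical sketch of the kind of guard that avoids it; the class shape and buffer handling are assumptions, and the actual patch linked in issue #17 may look different.

// Hypothetical defensive version of audio-processor.js; names are illustrative,
// not the project's actual code.
class AudioStreamProcessor extends AudioWorkletProcessor {
  process(inputs) {
    const channel = inputs[0] && inputs[0][0];
    // Some browser/OS combinations deliver render quanta with no channel data;
    // bailing out here keeps the .set call below from ever touching an
    // undefined buffer.
    if (!channel || channel.length === 0) {
      return true; // keep the worklet alive and wait for real audio
    }
    // Copy the frame before posting it, since the engine reuses the underlying
    // input buffer between render quanta.
    const frame = new Float32Array(channel.length);
    frame.set(channel);
    this.port.postMessage(frame, [frame.buffer]);
    return true;
  }
}
registerProcessor('audio-stream-processor', AudioStreamProcessor);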
Also, since it was mentioned in the other thread, here is my server output in case it's helpful:
==========
== CUDA ==
==========
CUDA Version 12.2.2
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
done loading
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
INFO:root:[LLM] loaded: True
|██████████████--------------------| 67.76% [103/152 00:00<00:00]
INFO:websockets.server:connection open
|████████████████████████| 100.00% [152/152 00:01<00:00]
INFO:websockets.server:connection open
downloading ONNX model...
loading session
loading onnx model
reset states
INFO:root:New client connected
The output looks good and the progress bar you see is the warmup stage of WhisperSpeech. Let me know if the workaround that I posted works for you. We are currently setting up a Windows machine to replicate the issue.
Here's a video with that workaround. It seems to be working, except I don't hear any text-to-speech (probably not surprising considering the bug I worked around). But still... progress!
https://github.com/collabora/WhisperFusion/assets/703106/f8ac38dd-19c8-4f47-9b10-5df7587de39d
I can see in the console that the connection to port 8888 failed, which is the websocket port that we use to send the audio. Is that port forwarded from the Docker container? Also, it takes some time until the service starts; there is a warmup phase, so depending on the hardware it can take 30 seconds or more until the system is actually running.
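If it's unclear whether the container's port 8888 is reachable from the Windows side, here is a quick hypothetical check you can paste into the browser's devtools console (it assumes the demo connects to the same host on port 8888, as described above; the demo itself may build the URL differently):

// Assumed host/port taken from this thread.
const ws = new WebSocket(`ws://${window.location.hostname}:8888`);
ws.onopen = () => console.log('port 8888 reachable');
ws.onerror = () => console.log('port 8888 unreachable; check the docker -p 8888:8888 mapping and WSL forwarding');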
Thanks! I went ahead and tried again, restarting everything. This time it did work. The audio was a bit more delayed than in your video, but it did all work. So presumably in my last test I either just needed to wait for the warmup, or the 8888 port forwarding somehow didn't work.
What GPU do you use?
I'm on a 4090 and so was expecting it to be pretty snappy. As far as I can tell the GPU is being utilized, but it's definitely not as responsive as in your video.
We are going to add latency outputs for each step to the demo tomorrow, so we can compare more easily. The video was recorded on a 4090, so you should see very similar results.
Nice! Happy to help test those latency numbers when they're in.
I am using the same GPU on Windows. Did you use this command to run the Docker container?
docker run --gpus all --shm-size 64G -p 6006:6006 -p 8888:8888 -it ghcr.io/collabora/whisperfusion-3090:latest
Did you get everything running eventually?