text-generation-webui
whisper_stt "Error"
Describe the bug
After enabling both the silero_tts and whisper_stt extensions in the "Interface mode" tab, then applying and restarting the interface, whisper_stt shows an "Error" message when I try to use the microphone to record a prompt. No user input is displayed, and the assistant immediately returns a random voice response.
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
- Enable both silero_tts and whisper_stt.
- Record a prompt.
Screenshot
Logs
Starting the web UI...
Warning: --cai-chat is deprecated. Use --chat instead.
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: D:\Tools\oobabooga-windows\installer_files\env\bin\cudart64_110.dll
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cuda117.dll...
Loading anon8231489123_vicuna-13b-GPTQ-4bit-128g...
Found the following quantized model: models\anon8231489123_vicuna-13b-GPTQ-4bit-128g\vicuna-13b-4bit-128g.safetensors
Loading model ...
D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\safetensors\torch.py:99: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
with safe_open(filename, framework="pt", device=device) as f:
D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\torch\_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\torch\storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = cls(wrap_storage=untyped_storage)
Done.
Loaded the model in 4.10 seconds.
Loading the extension "gallery"... Ok.
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Closing server running on port: 7860
Loading the extension "gallery"... Ok.
Loading the extension "silero_tts"...
Using Silero TTS cached checkpoint found at C:\Users\anahum/.cache\torch\hub
Ok.
Loading the extension "whisper_stt"... Ok.
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\pydub\utils.py:198: RuntimeWarning: Couldn't find ffprobe or avprobe - defaulting to ffprobe, but may not work
warn("Couldn't find ffprobe or avprobe - defaulting to ffprobe, but may not work", RuntimeWarning)
Traceback (most recent call last):
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\processing_utils.py", line 138, in audio_from_file
audio = AudioSegment.from_file(filename)
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\pydub\audio_segment.py", line 728, in from_file
info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\pydub\utils.py", line 274, in mediainfo_json
res = Popen(command, stdin=stdin_parameter, stdout=PIPE, stderr=PIPE)
File "D:\Tools\oobabooga-windows\installer_files\env\lib\subprocess.py", line 971, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "D:\Tools\oobabooga-windows\installer_files\env\lib\subprocess.py", line 1440, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\routes.py", line 393, in run_predict
output = await app.get_blocks().process_api(
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 1106, in process_api
inputs = self.preprocess_data(fn_index, inputs, state)
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 995, in preprocess_data
processed_input.append(block.preprocess(inputs[i]))
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\components.py", line 2306, in preprocess
sample_rate, data = processing_utils.audio_from_file(
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\processing_utils.py", line 148, in audio_from_file
raise RuntimeError(msg) from e
RuntimeError: Cannot load audio from file: `ffprobe` not found. Please install `ffmpeg` in your system to use non-WAV audio file formats and make sure `ffprobe` is in your PATH.
Output generated in 8.13 seconds (7.13 tokens/s, 58 tokens, context 69, seed 1632075903)
Traceback (most recent call last):
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\processing_utils.py", line 138, in audio_from_file
audio = AudioSegment.from_file(filename)
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\pydub\audio_segment.py", line 728, in from_file
info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\pydub\utils.py", line 274, in mediainfo_json
res = Popen(command, stdin=stdin_parameter, stdout=PIPE, stderr=PIPE)
File "D:\Tools\oobabooga-windows\installer_files\env\lib\subprocess.py", line 971, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "D:\Tools\oobabooga-windows\installer_files\env\lib\subprocess.py", line 1440, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\routes.py", line 393, in run_predict
output = await app.get_blocks().process_api(
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 1106, in process_api
inputs = self.preprocess_data(fn_index, inputs, state)
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 995, in preprocess_data
processed_input.append(block.preprocess(inputs[i]))
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\components.py", line 2306, in preprocess
sample_rate, data = processing_utils.audio_from_file(
File "D:\Tools\oobabooga-windows\installer_files\env\lib\site-packages\gradio\processing_utils.py", line 148, in audio_from_file
raise RuntimeError(msg) from e
RuntimeError: Cannot load audio from file: `ffprobe` not found. Please install `ffmpeg` in your system to use non-WAV audio file formats and make sure `ffprobe` is in your PATH.
Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
handle: <Handle _ProactorBasePipeTransport._call_connection_lost(None)>
Traceback (most recent call last):
File "D:\Tools\oobabooga-windows\installer_files\env\lib\asyncio\events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "D:\Tools\oobabooga-windows\installer_files\env\lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
Output generated in 3.27 seconds (7.96 tokens/s, 26 tokens, context 128, seed 1574481083)
System Info
NVIDIA RTX 3090
I have the same issue with this extension.
I'm also running into this (Windows 11, RTX 4090, one-click installer). I've tried a few things, like installing ffmpeg at the system level and adding that to PATH, as well as adding a bin folder to the Oobabooga directory and adding that to PATH, but I still get the error with Whisper (regardless of whether Silero is activated). Funnily enough, I'm seeing the exact same error in a Stable Diffusion Automatic1111 extension (SadTalker) due to the same non-WAV audio ffmpeg dependency. I wonder if it's a Gradio thing, as I saw the same error appear in this issue: https://github.com/gradio-app/gradio/issues/3429
same issue here guys! :(
The same issue. How can we make it work?
Ok, I found a solution: https://phoenixnap.com/kb/ffmpeg-windows I did this and it works for me.
I had to restart my computer after installing ffmpeg as described above; after that it worked well.
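A quick way to confirm that the install (and restart) actually took effect is to ask Python whether it can resolve the two executables. This is only a minimal sketch: `find_missing_tools` is a hypothetical helper name, not part of the web UI, and it should be run from the same environment the web UI uses (for example after opening cmd_windows.bat) so that it sees the same PATH.

```python
import shutil


def find_missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names in `tools` that cannot be resolved on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are both visible to this Python process.")
```

If `ffprobe` is still reported as missing after installing ffmpeg, the process was probably started before PATH was updated, which would be consistent with the restart being needed.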
Could you guys tell me where you cloned the FFmpeg repository to, if it matters? Also, do I have to set it as a system variable at all? I'm going to mess around for the time being and find out.
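On the system variable question: what matters is that the directory containing `ffprobe` is on the PATH of the process that runs the web UI, however it gets there. As a rough sketch of that idea, the hypothetical helper below prepends a directory to PATH for the current Python process only. The directory used in the usage note is an assumption for illustration; the thread does not say where ffmpeg was unpacked.

```python
import os


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH in `env` (default: os.environ)."""
    env = os.environ if env is None else env
    # Split the existing PATH and drop empty entries.
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    # Avoid adding the same directory twice.
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]
```

For example, `prepend_to_path(r"C:\ffmpeg\bin")` (a hypothetical location) would affect only this process and the child processes it starts; setting PATH as a user or system variable, as the linked guide describes, makes the change permanent.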
> Ok, I found a solution: https://phoenixnap.com/kb/ffmpeg-windows I did this and it works for me.
I did that and the previous error disappeared but now i get another when i try to record with whisper.
Traceback (most recent call last):
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\gradio\routes.py", line 427, in run_predict
output = await app.get_blocks().process_api(
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 1323, in process_api
result = await self.call_function(
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 1051, in call_function
prediction = await anyio.to_thread.run_sync(
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "H:\oobabooga_windows\text-generation-webui\extensions\whisper_stt\script.py", line 48, in auto_transcribe
transcription = do_stt(audio, whipser_model, whipser_language)
File "H:\oobabooga_windows\text-generation-webui\extensions\whisper_stt\script.py", line 36, in do_stt
transcription = r.recognize_whisper(audio_data, language=whipser_language, model=whipser_model)
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\speech_recognition\__init__.py", line 1479, in recognize_whisper
import whisper
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\whisper\__init__.py", line 13, in <module>
from .model import ModelDimensions, Whisper
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\whisper\model.py", line 13, in <module>
from .transcribe import transcribe as transcribe_function
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\whisper\transcribe.py", line 20, in <module>
from .timing import add_word_timestamps
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\whisper\timing.py", line 7, in <module>
import numba
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\numba\__init__.py", line 55, in <module>
_ensure_critical_deps()
File "H:\oobabooga_windows\installer_files\env\lib\site-packages\numba\__init__.py", line 42, in _ensure_critical_deps
raise ImportError("Numba needs NumPy 1.24 or less")
ImportError: Numba needs NumPy 1.24 or less
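The ImportError above means the installed NumPy is newer than the installed Numba accepts. A small sketch for checking this from inside the same environment follows; `numpy_within_limit` is a hypothetical helper, and the `(1, 24)` limit is taken only from the error message in this traceback, so other Numba versions may expect a different bound.

```python
import re


def numpy_within_limit(version, limit=(1, 24)):
    """True if `version` (e.g. "1.24.3") is at or below major.minor `limit`."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # Compare (major, minor) tuples; the patch level is ignored.
    return (int(match.group(1)), int(match.group(2))) <= limit


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("NumPy is not installed in this environment.")
    else:
        ok = numpy_within_limit(numpy.__version__)
        print(f"NumPy {numpy.__version__}:", "ok" if ok else "newer than 1.24")
```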
I asked the "TheBloke_Octocoder-GPTQ" model for help with the error zombie-dude posted above. It said miniconda doesn't have all the required files. I ran cmd_windows.bat from the oobabooga folder, then pasted the model's suggestion, `conda install -c conda-forge librosa`, and that works for me. (I haven't run update_windows.bat to see if it breaks again.)
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
This is still an issue for some. This post provides the only viable solution to this issue, as the oobabooga install instructions do not provide this insight and some of the ffmpeg install methods seem to have issues.
This post provided the link to the correct instructions.
`conda install ffmpeg` solved this issue for me on Windows.