Whisper-WebUI
Support for Intel IPEX (Intel ARC)
Hi,
is there any way to run Whisper with the WebUI and support for Intel IPEX (Intel Arc B580)?
I've found https://github.com/openai/whisper/discussions/921:

```python
import whisper
import intel_extension_for_pytorch as ipex

MODEL = 'small'
model = whisper.load_model(MODEL, device="xpu")
```
and found https://github.com/intel/ipex-llm/tree/main/python/llm/example/GPU/HuggingFace/Multimodal/whisper
/dev/dri is connected, but the CPU is used. Any chance to implement this? Tried it on Windows too; the CPU is used there as well.
Hi. I wish I could implement this as soon as possible. Referring to #463, I will try to implement it asap when I have time.
I hope just passing the device keyword `xpu` to torch will work with the latest torch.
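A minimal sketch of that idea, assuming a recent torch build with XPU support (the `pick_device` helper is illustrative, not the project's actual code):

```python
import torch

def pick_device() -> str:
    # Prefer Intel XPU when the torch build and driver expose it,
    # then CUDA, otherwise fall back to CPU.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"

device = pick_device()
# Any torch module or tensor can be moved the same way, e.g. a Whisper model.
model = torch.nn.Linear(4, 4).to(device)
x = torch.randn(1, 4, device=device)
print(device, model(x).shape)
```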
Thank you. This would be very nice. I would be able to test with a B580 on Windows itself, and with WSL2 and Docker (Ubuntu) with an IPEX setup.
You can try my fork, which should support the Arc B series, either via torch+XPU or IPEX.
#509 is merged, so hopefully the Intel device will work now.
But before installing, you need to edit the `--extra-index-url` in the requirements.txt for the Intel device binaries:
https://github.com/jhj0517/Whisper-WebUI/blob/d630facc110ec95ac90b87f2a31aa1c330975db9/requirements.txt#L1-L10
Use:

```
--extra-index-url https://download.pytorch.org/whl/xpu
```
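After reinstalling from the edited requirements.txt, a quick sanity check that the XPU build of torch is actually in use could look like this (illustrative only; assumes torch >= 2.4, where the `torch.xpu` API exists):

```python
import torch

# The XPU wheels typically report a "+xpu" local version suffix.
print(torch.__version__)
# True only when both the XPU torch build and the Intel GPU driver are present.
print(torch.xpu.is_available())
print(torch.xpu.device_count() if torch.xpu.is_available() else 0)
```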
Can I get a confirmation?
Sure. Thank you very much. That was fast.
At first, all looked good. The install went fine; torch XPU downloaded and installed successfully. Upon loading a file for transcription (German), model large-v2, faster-whisper, Windows 11 24H2:
Use "faster-whisper" implementation Device "xpu" is detected
- Running on local URL: http://0.0.0.0:7870
To create a public link, set share=True in launch().
vocabulary.txt: 100%|███████████████████████████████████████████████████████████████| 460k/460k [00:00<00:00, 2.28MB/s]
C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\huggingface_hub\file_download.py:142: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\Admin\Whisper-WebUI\models\Whisper\faster-whisper\models--Systran--faster-whisper-large-v2. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by settingthe HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
config.json: 100%|████████████████████████████████████████████████████████████████████████| 2.80k/2.80k [00:00<?, ?B/s]
tokenizer.json: 100%|█████████████████████████████████████████████████████████████| 2.20M/2.20M [00:00<00:00, 5.37MB/s]
model.bin: 100%|██████████████████████████████████████████████████████████████████| 3.09G/3.09G [01:15<00:00, 41.0MB/s]
Traceback (most recent call last):
File "C:\Users\Admin\Whisper-WebUI\modules\whisper\base_transcription_pipeline.py", line 273, in transcribe_fileMB/s]
transcribed_segments, time_for_task = self.run(
File "C:\Users\Admin\Whisper-WebUI\modules\whisper\base_transcription_pipeline.py", line 172, in run
result, elapsed_time_transcription = self.transcribe(
File "C:\Users\Admin\Whisper-WebUI\modules\whisper\faster_whisper_inference.py", line 72, in transcribe
self.update_model(params.model_size, params.compute_type, progress)
File "C:\Users\Admin\Whisper-WebUI\modules\whisper\faster_whisper_inference.py", line 159, in update_model
self.model = faster_whisper.WhisperModel(
File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\faster_whisper\transcribe.py", line 647, in init
self.model = ctranslate2.models.Whisper(
ValueError: unsupported device xpu
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\queueing.py", line 625, in process_events response = await route_utils.call_process_api( File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\route_utils.py", line 322, in call_process_api output = await app.get_blocks().process_api( File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\blocks.py", line 2103, in process_api result = await self.call_function( File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\blocks.py", line 1650, in call_function prediction = await anyio.to_thread.run_sync( # type: ignore File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync return await get_async_backend().run_sync_in_worker_thread( File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\anyio_backends_asyncio.py", line 2461, in run_sync_in_worker_thread return await future File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\anyio_backends_asyncio.py", line 962, in run result = context.run(func, *args) File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\utils.py", line 890, in wrapper response = f(*args, **kwargs) File "C:\Users\Admin\Whisper-WebUI\modules\whisper\base_transcription_pipeline.py", line 318, in transcribe_file raise RuntimeError(f"Error transcribing file: {e}") from e RuntimeError: Error transcribing file: unsupported device xpu
The `faster-whisper` implementation uses `ctranslate2`, and I don't know whether `ctranslate2` supports Intel XPU or not.
I couldn't find anything about XPU support in the `ctranslate2` documentation.
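For context, `faster-whisper` hands the device string straight to `ctranslate2`, which (as of this writing) only accepts "cpu", "cuda", or "auto", hence the ValueError above for "xpu". A rough sketch of a CPU fallback that avoids the crash (illustrative, not the WebUI's code; "audio.mp3" is a placeholder):

```python
from faster_whisper import WhisperModel

device = "xpu"  # what the WebUI detected
if device not in ("cpu", "cuda", "auto"):
    # ctranslate2 has no XPU backend, so faster-whisper has to run on CPU here.
    device = "cpu"

model = WhisperModel("large-v2", device=device, compute_type="int8")
segments, info = model.transcribe("audio.mp3", language="de")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```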
I forced XPU users to use the `insanely_fast_whisper` implementation in #513.
Afaik the `insanely_fast_whisper` (same as `transformers`) implementation supports the XPU device, so hopefully it will work.
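Roughly what that `transformers`-based path looks like, assuming an XPU-enabled torch build (model name and audio file are placeholders; a sketch, not the WebUI's exact code):

```python
import torch
from transformers import pipeline

device = "xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cpu"

# The transformers ASR pipeline accepts any torch device string,
# so "xpu" works wherever the XPU backend is available.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v2",
    torch_dtype=torch.float16 if device == "xpu" else torch.float32,
    device=device,
)

result = asr("audio.mp3", return_timestamps=True)
print(result["text"])
```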
Thank you! That's it. `insanely_fast_whisper` works. All good now.
Any idea if this could work, or be made to work, with integrated Xe graphics? LLMs can already run on those via IPEX. This is the error I get:
```
[whisper-webui] | /Whisper-WebUI/venv/lib/python3.11/site-packages/torch/xpu/__init__.py:60: UserWarning: Failed to initialize XPU devices. The driver may not be installed, installed incorrectly, or incompatible with the current setup. Please refer to the guideline (https://github.com/pytorch/pytorch?tab=readme-ov-file#intel-gpu-support) for proper installation and configuration. (Triggered internally at /pytorch/c10/xpu/XPUFunctions.cpp:109.)
[whisper-webui] |   return torch._C._xpu_getDeviceCount()
[whisper-webui] | Use "faster-whisper" implementation
[whisper-webui] | Device "auto" is detected
[whisper-webui] | * Running on local URL: http://0.0.0.0:7860
```
Also, trying insanely_fast_whisper (not sure if it's needed for XPU):

```
app.py: error: argument --whisper_type: invalid choice: 'insanely-fast-whisper' (choose from 'whisper', 'faster-whisper', 'insanely_fast_whisper')
```
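Judging by that error message, the argument expects underscores rather than hyphens, so `--whisper_type insanely_fast_whisper` should get past the parser.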