
Support for Intel IPEX (Intel ARC)

DediCATeD88 opened this issue 9 months ago · 8 comments

Hi,

is there any way to run Whisper with the WebUI and support for Intel IPEX (Intel Arc B580)?

I've found https://github.com/openai/whisper/discussions/921:

```python
import whisper
import intel_extension_for_pytorch as ipex  # importing IPEX registers the XPU device with torch

MODEL = 'small'
model = whisper.load_model(MODEL, device="xpu")
```

and also found https://github.com/intel/ipex-llm/tree/main/python/llm/example/GPU/HuggingFace/Multimodal/whisper

/dev/dri is connected, but the CPU is used. Any chance to implement this? Tried on Windows too; the CPU is used there as well.

DediCATeD88 avatar Feb 28 '25 09:02 DediCATeD88

Hi. I'd like to implement this as soon as possible. Referring to #463, I will try to implement it ASAP when I have time.

I hope just passing the device keyword xpu to torch will work with the latest torch.
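
Something along these lines, for instance (a minimal sketch with a hypothetical `pick_device` helper, assuming a recent torch build that ships the XPU backend; this is not the actual Whisper-WebUI code):

```python
import torch

def pick_device() -> str:
    # Prefer the Intel GPU when torch's XPU backend is present and usable;
    # otherwise fall back to the CPU.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    return "cpu"

print(pick_device())  # "xpu" on a working Arc setup, "cpu" otherwise
```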

jhj0517 avatar Feb 28 '25 10:02 jhj0517

> Hi. I'd like to implement this as soon as possible. Referring to #463, I will try to implement it ASAP when I have time.
>
> I hope just passing the device keyword xpu to torch will work with the latest torch.

Thank you. This would be very nice. I would be able to test with a B580 on Windows itself, and with WSL2 and Docker (Ubuntu) with an IPEX setup.

DediCATeD88 avatar Feb 28 '25 10:02 DediCATeD88

You can try my fork, which should support the Arc B series, either via torch+XPU or IPEX.

DDXDB avatar Mar 07 '25 04:03 DDXDB

#509 is merged, so hopefully the Intel device will work now.

But before installing, you need to edit the --extra-index-url in requirements.txt to get the Intel device binaries:

https://github.com/jhj0517/Whisper-WebUI/blob/d630facc110ec95ac90b87f2a31aa1c330975db9/requirements.txt#L1-L10

Use:

```
--extra-index-url https://download.pytorch.org/whl/xpu
```
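
The top of requirements.txt would then look roughly like this (the package lines below the index URL are illustrative, not the exact pins in the repo):

```
--extra-index-url https://download.pytorch.org/whl/xpu
torch
torchaudio
```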

Can I get a confirmation?

jhj0517 avatar Mar 07 '25 17:03 jhj0517

Sure. Thank you very much. That was fast.

At first all looked good. The install went fine; torch XPU downloaded and installed successfully. Upon loading a file for transcription (German), model large-v2, faster-whisper, Windows 11 24H2:

Use "faster-whisper" implementation Device "xpu" is detected

  • Running on local URL: http://0.0.0.0:7870

To create a public link, set share=True in launch(). vocabulary.txt: 100%|███████████████████████████████████████████████████████████████| 460k/460k [00:00<00:00, 2.28MB/s] C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\huggingface_hub\file_download.py:142: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\Admin\Whisper-WebUI\models\Whisper\faster-whisper\models--Systran--faster-whisper-large-v2. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by settingthe HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations. To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development warnings.warn(message) config.json: 100%|████████████████████████████████████████████████████████████████████████| 2.80k/2.80k [00:00<?, ?B/s] tokenizer.json: 100%|█████████████████████████████████████████████████████████████| 2.20M/2.20M [00:00<00:00, 5.37MB/s] model.bin: 100%|██████████████████████████████████████████████████████████████████| 3.09G/3.09G [01:15<00:00, 41.0MB/s] Traceback (most recent call last): File "C:\Users\Admin\Whisper-WebUI\modules\whisper\base_transcription_pipeline.py", line 273, in transcribe_fileMB/s] transcribed_segments, time_for_task = self.run( File "C:\Users\Admin\Whisper-WebUI\modules\whisper\base_transcription_pipeline.py", line 172, in run result, elapsed_time_transcription = self.transcribe( File "C:\Users\Admin\Whisper-WebUI\modules\whisper\faster_whisper_inference.py", line 72, in transcribe self.update_model(params.model_size, params.compute_type, progress) File "C:\Users\Admin\Whisper-WebUI\modules\whisper\faster_whisper_inference.py", line 159, in update_model self.model = faster_whisper.WhisperModel( File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\faster_whisper\transcribe.py", line 647, in init self.model = ctranslate2.models.Whisper( ValueError: unsupported device xpu

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\queueing.py", line 625, in process_events response = await route_utils.call_process_api( File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\route_utils.py", line 322, in call_process_api output = await app.get_blocks().process_api( File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\blocks.py", line 2103, in process_api result = await self.call_function( File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\blocks.py", line 1650, in call_function prediction = await anyio.to_thread.run_sync( # type: ignore File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync return await get_async_backend().run_sync_in_worker_thread( File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\anyio_backends_asyncio.py", line 2461, in run_sync_in_worker_thread return await future File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\anyio_backends_asyncio.py", line 962, in run result = context.run(func, *args) File "C:\Users\Admin\Whisper-WebUI\venv\lib\site-packages\gradio\utils.py", line 890, in wrapper response = f(*args, **kwargs) File "C:\Users\Admin\Whisper-WebUI\modules\whisper\base_transcription_pipeline.py", line 318, in transcribe_file raise RuntimeError(f"Error transcribing file: {e}") from e RuntimeError: Error transcribing file: unsupported device xpu


DediCATeD88 avatar Mar 07 '25 19:03 DediCATeD88

faster-whisper uses ctranslate2, and I don't know whether ctranslate2 supports Intel XPU. I couldn't find anything about XPU support in the ctranslate2 documentation.

So I forced XPU users to use the insanely_fast_whisper implementation in #513.

Afaik the insanely_fast_whisper implementation (same as transformers) supports the XPU device, so hopefully it will work.
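
For reference, this is roughly what that path boils down to (a minimal sketch of a transformers ASR pipeline on XPU; the model name, audio file, and dtype are illustrative, and this is not the exact Whisper-WebUI code):

```python
import torch
from transformers import pipeline

# Place the Whisper pipeline on the Intel GPU via the "xpu" device string.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v2",
    device="xpu",
    torch_dtype=torch.float16,
)

result = asr("audio.wav", return_timestamps=True)
print(result["text"])
```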

jhj0517 avatar Mar 08 '25 10:03 jhj0517

> faster-whisper uses ctranslate2, and I don't know whether ctranslate2 supports Intel XPU. I couldn't find anything about XPU support in the ctranslate2 documentation.
>
> So I forced XPU users to use the insanely_fast_whisper implementation in #513.
>
> Afaik the insanely_fast_whisper implementation (same as transformers) supports the XPU device, so hopefully it will work.

Thank you! That's it: insanely_fast_whisper works. All good now.

DediCATeD88 avatar Mar 08 '25 10:03 DediCATeD88

Any idea if this could work, or could be made to work, with integrated Xe graphics? LLMs can already run on those via the IPEX stuff. This is the error I get:

```
[whisper-webui] | /Whisper-WebUI/venv/lib/python3.11/site-packages/torch/xpu/__init__.py:60: UserWarning: Failed to initialize XPU devices. The driver may not be installed, installed incorrectly, or incompatible with the current setup. Please refer to the guideline (https://github.com/pytorch/pytorch?tab=readme-ov-file#intel-gpu-support) for proper installation and configuration. (Triggered internally at /pytorch/c10/xpu/XPUFunctions.cpp:109.)
[whisper-webui] |   return torch._C._xpu_getDeviceCount()
[whisper-webui] | Use "faster-whisper" implementation
[whisper-webui] | Device "auto" is detected
[whisper-webui] | * Running on local URL: http://0.0.0.0:7860
```

Also, trying insanely_fast_whisper (not sure if it's needed for XPU):

```
app.py: error: argument --whisper_type: invalid choice: 'insanely-fast-whisper' (choose from 'whisper', 'faster-whisper', 'insanely_fast_whisper')
```
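
(Going by the choices listed in that error, the underscore spelling is presumably what the flag expects:)

```
python app.py --whisper_type insanely_fast_whisper
```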

dinccey avatar Jun 11 '25 11:06 dinccey