
Enable Intel XPU

Open Desmond0804 opened this issue 10 months ago • 2 comments

Enable Intel XPU to accelerate TTS inference, since PyTorch 2.6 already supports Intel XPU: https://pytorch.org/docs/stable/notes/get_start_xpu.html.

Software Prerequisite: https://pytorch.org/docs/stable/notes/get_start_xpu.html#software-prerequisite
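Once the prerequisites are installed, XPU availability can be checked at runtime before passing the device string to MeloTTS. A minimal sketch (the `pick_device` helper is hypothetical, not part of MeloTTS; it assumes PyTorch >= 2.6, which exposes `torch.xpu`):

```python
def pick_device():
    # Prefer Intel XPU, then CUDA, then fall back to CPU.
    # torch.xpu exists in PyTorch >= 2.6; hasattr guards older builds.
    try:
        import torch
        if hasattr(torch, "xpu") and torch.xpu.is_available():
            return "xpu"
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        # PyTorch not installed at all.
        pass
    return "cpu"

print(pick_device())
```

The returned string can be used directly, e.g. `TTS(language='EN', device=pick_device())`.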

Below is my environment:

  • CPU: Intel Core Ultra 7 165H
  • RAM: 64GB
  • OS: Ubuntu 22.04.5 LTS (kernel: 6.8.0-52-generic)
  • Python version: 3.11.11

Below is my quick test to compare TTS time taken for CPU and XPU:

| Language | CPU (s) | XPU (s) |
| -------- | ------- | ------- |
| EN       | 14.0701 | 7.2301  |
| ZH       | 19.3216 | 13.8332 |
| ES       | 11.7642 | 4.4456  |
| FR       | 17.0136 | 6.8164  |
| JP       | 28.5677 | 7.3959  |
| KR       | 29.7932 | 10.8559 |
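From the timings above, the XPU speedup over CPU can be computed directly (values copied from the table; roughly 1.4x for ZH up to about 3.9x for JP):

```python
# CPU vs. XPU synthesis times (seconds) as reported in the table above.
timings = {
    "EN": (14.0701, 7.2301),
    "ZH": (19.3216, 13.8332),
    "ES": (11.7642, 4.4456),
    "FR": (17.0136, 6.8164),
    "JP": (28.5677, 7.3959),
    "KR": (29.7932, 10.8559),
}

# Speedup = CPU time / XPU time for each language.
speedups = {lang: cpu / xpu for lang, (cpu, xpu) in timings.items()}
for lang, s in speedups.items():
    print(f"{lang}: {s:.2f}x")
```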

Desmond0804 avatar Feb 20 '25 06:02 Desmond0804

I copied the PR to my local code and it works. Computation runs on the XPU (the iGPU of an Intel Ultra 7 155H). By the way, overall GPU load is around 30%. Are there any optimizations to shorten the loading time and improve RTF?

HarryBXie avatar Feb 26 '25 10:02 HarryBXie

Sorry for the late reply; I have been a bit busy recently and will look at the optimization part later. Can I ask how you measured the loading time? Previously I ran my quick test with code like the following:

```python
from melo.api import TTS
import time

speed = 1.0
device = 'xpu'

text = '''Did you ever hear a folk tale about a giant turtle?'''
language = 'EN'
model = TTS(language=language, device=device)
speaker_ids = model.hps.data.spk2id
output_path = f'{language}.wav'

# Time only the synthesis call, not model construction.
start = time.time()
model.tts_to_file(text, speaker_ids['EN-Default'], output_path, speed=speed)
end = time.time()
print(f"Time taken: {end-start}")
```
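To separate model-loading time from inference time (the loading time asked about above), each phase can be timed independently. A generic sketch with a hypothetical `phase_timer` helper; the `time.sleep` calls are stand-ins for the real `TTS(...)` and `tts_to_file(...)` calls:

```python
import time
from contextlib import contextmanager

@contextmanager
def phase_timer(label, results):
    # Record the wall-clock duration of the wrapped block under `label`.
    start = time.perf_counter()
    yield
    results[label] = time.perf_counter() - start

results = {}
with phase_timer("load", results):
    time.sleep(0.01)  # stand-in for: model = TTS(language=language, device=device)
with phase_timer("infer", results):
    time.sleep(0.01)  # stand-in for: model.tts_to_file(text, ..., speed=speed)
print(results)
```

`time.perf_counter` is preferred over `time.time` for interval measurement because it is monotonic and has higher resolution.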

Desmond0804 avatar Mar 05 '25 06:03 Desmond0804