Enable Intel XPU
Enable Intel XPU to accelerate TTS inference, since PyTorch 2.6 already supports Intel XPU: https://pytorch.org/docs/stable/notes/get_start_xpu.html.
Software prerequisites: https://pytorch.org/docs/stable/notes/get_start_xpu.html#software-prerequisite
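Before running on XPU, it can help to confirm that the installed PyTorch build actually exposes the device. A minimal sketch (the `pick_device` helper is illustrative, not part of MeloTTS), assuming PyTorch 2.6+ where `torch.xpu.is_available()` is available:

```python
import importlib.util

def pick_device():
    """Return 'xpu' when a PyTorch build with XPU support is available, else 'cpu'."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed at all
        return "cpu"
    import torch
    # torch.xpu.is_available() exists from PyTorch 2.6 onward
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    return "cpu"

print(pick_device())
```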
Below is my environment:
- CPU: Intel Core Ultra 7 165H
- RAM: 64GB
- OS: Ubuntu 22.04.5 LTS (kernel: 6.8.0-52-generic)
- Python version: 3.11.11
Below is my quick test comparing the TTS time taken on CPU and XPU:
| Language | CPU (s) | XPU (s) |
|---|---|---|
| EN | 14.0701 | 7.2301 |
| ZH | 19.3216 | 13.8332 |
| ES | 11.7642 | 4.4456 |
| FR | 17.0136 | 6.8164 |
| JP | 28.5677 | 7.3959 |
| KR | 29.7932 | 10.8559 |
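For reference, the per-language speedup implied by the table above can be computed directly from those numbers (the CPU/XPU ratio is the speedup factor):

```python
# Benchmark times in seconds, copied from the table above
cpu = {"EN": 14.0701, "ZH": 19.3216, "ES": 11.7642,
       "FR": 17.0136, "JP": 28.5677, "KR": 29.7932}
xpu = {"EN": 7.2301, "ZH": 13.8332, "ES": 4.4456,
       "FR": 6.8164, "JP": 7.3959, "KR": 10.8559}

# Speedup factor: how many times faster XPU is than CPU per language
speedup = {lang: cpu[lang] / xpu[lang] for lang in cpu}
for lang, factor in speedup.items():
    print(f"{lang}: {factor:.2f}x")
```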
I copied the PR to my local code, and it works. Computation runs on the XPU (the iGPU of the Intel Core Ultra 7 155H). By the way, overall GPU utilization is around 30%. Are there any optimizations to shorten the loading time and improve the RTF?
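On RTF: the real-time factor is usually computed as synthesis time divided by the duration of the generated audio, so RTF < 1 means faster than real time. A minimal standard-library sketch (`audio_duration_seconds` and `real_time_factor` are illustrative helpers, not MeloTTS APIs):

```python
import wave

def audio_duration_seconds(path):
    """Duration of a WAV file, derived from frame count and sample rate."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()

def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / length of audio produced."""
    return synthesis_seconds / audio_seconds
```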
Sorry for the late reply; I have been a bit busy recently and will look at the optimization part later. Can I ask how you measured the loading time? Previously I did my quick test using code like the below:
```python
from melo.api import TTS
import time

speed = 1.0
device = 'xpu'
text = '''Did you ever hear a folk tale about a giant turtle?'''
language = 'EN'

# Note: model loading happens here and is NOT included in the timing below
model = TTS(language=language, device=device)
speaker_ids = model.hps.data.spk2id
output_path = f'{language}.wav'

# Time only the synthesis step
start = time.time()
model.tts_to_file(text, speaker_ids['EN-Default'], output_path, speed=speed)
end = time.time()
print(f"Time taken: {end-start}")
```
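To separate model-loading time from synthesis time (the loading-time question above), each phase could be wrapped in a small timer. A minimal standard-library sketch (the `timed` helper and its labels are illustrative; one would wrap `TTS(...)` and `tts_to_file(...)` with it):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    """Print the wall-clock time spent inside the `with` block."""
    start = time.perf_counter()
    yield
    print(f"{label}: {time.perf_counter() - start:.4f}s")

# Usage sketch:
#   with timed("model load"):
#       model = TTS(language=language, device=device)
#   with timed("synthesis"):
#       model.tts_to_file(text, speaker_id, output_path, speed=speed)
```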