ipex-llm
Bark model on Intel GPU takes 60 seconds
Hello, I am attempting to do text-to-speech with Bark on an Intel Arc A770, but it takes around 60 seconds to generate audio. Is that normal? Is there a way to make it faster, like a few seconds? I am following this example: https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/bark
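For context, my script follows the linked example and loads Bark roughly like this (a trimmed sketch, not the exact repo code; the `suno/bark-small` model id and the prompt are assumptions on my part):

```python
import torch
from transformers import AutoProcessor, BarkModel
from ipex_llm import optimize_model

# Load Bark and let ipex-llm convert the weights to a low-bit format
# (matches the "Converting the current model to sym_int4 format" log line)
processor = AutoProcessor.from_pretrained("suno/bark-small")  # model id is an assumption
model = BarkModel.from_pretrained("suno/bark-small")
model = optimize_model(model)   # sym_int4 by default
model = model.to("xpu")         # run on the Intel Arc A770

text = ("IPEX-LLM is a library for running large language model "
        "on Intel XPU with very low latency.")
inputs = processor(text, return_tensors="pt").to("xpu")

with torch.inference_mode():
    audio_array = model.generate(**inputs)
audio_array = audio_array.cpu().numpy().squeeze()
```

This is the console output I get: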
```
(phytia2) C:\phytia\Phytia>python ./synthesize_speech.py --text "IPEX-LLM is a library for running large language model on Intel XPU with very low latency."
C:\Users\SlyRebula\miniconda3\envs\phytia2\Lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: 'Could not find module 'C:\Users\SlyRebula\miniconda3\envs\Phytia2\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.' If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
2024-07-31 13:47:18,476 - INFO - intel_extension_for_pytorch auto imported
C:\Users\SlyRebula\miniconda3\envs\phytia2\Lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
C:\Users\SlyRebula\miniconda3\envs\phytia2\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
2024-07-31 13:47:22,731 - INFO - Converting the current model to sym_int4 format......
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:10000 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:10000 for open-end generation.
Inference time: 54.660537242889404 s
```
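One thing I am not sure about: does that 54 seconds include one-time warm-up (the low-bit conversion and first-run kernel compilation on XPU)? A sketch of how I would time a second run to check, reusing `model` and `inputs` from the snippet above (the warm-up assumption is mine, not from the example):

```python
import time
import torch
# intel_extension_for_pytorch is auto-imported by ipex-llm (see the log),
# which provides torch.xpu.synchronize()

with torch.inference_mode():
    model.generate(**inputs)            # warm-up run, discard the output
    torch.xpu.synchronize()             # wait for queued XPU work to finish
    start = time.time()
    audio_array = model.generate(**inputs)
    torch.xpu.synchronize()
print(f"Inference time: {time.time() - start} s")
```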