Your Hugging Face demo is much faster than my local inference
Self Checks
- [X] This template is only for bug reports. For questions, please visit Discussions.
- [X] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem. English 中文 日本語 Portuguese (Brazil)
- [X] I have searched for existing issues, including closed ones. Search issues
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
- [X] Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Source)
Environment Details
Windows + CUDA
Steps to Reproduce
With equivalent hardware (a 4090), your HF demo is 10 times faster than my local setup. I use the acceleration described in your guide and get about 16 it/s. Are you doing something specific in your HF demo, such as pretraining for the speakers?
✔️ Expected Behavior
I expected equivalent timing.
❌ Actual Behavior
Inference takes about 10 times longer.
Compile enabled or not?
Probably not. When I try to run with the compile flag: python tools/run_webui.py --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --compile
I get this error:
2024-12-14 16:58:02.996 | INFO | tools.vqgan.inference:load_model:43 - Loaded model: <All keys matched successfully>
2024-12-14 16:58:02.996 | INFO | main::71 - Decoder model loaded, warming up...
2024-12-14 16:58:03.008 | INFO | tools.llama.generate:generate_long:789 - Encoded text: Hello world.
2024-12-14 16:58:03.008 | INFO | tools.llama.generate:generate_long:807 - Generating sentence 1/1 of sample 1/1
C:\holobot\fish\fish-speech\fish_speech\models\text2semantic\llama.py:790: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
y = F.scaled_dot_product_attention(
0%| | 0/1023 [00:00<?, ?it/s]C:\Users\olivi\anaconda3\envs\fish-speech\lib\contextlib.py:103: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
self.gen = func(*args, **kwds)C:\Users\olivi\anaconda3\envs\fish-speech\lib\site-packages\triton\backends\nvidia\bin\ptxas.exe
C:\Users\olivi\anaconda3\envs\fish-speech\lib\site-packages\triton\backends\nvidia\bin\ptxas.exe
------------------------cc
C:\Program Files\LLVM\bin\clang.EXE
In file included from C:\Users\olivi\AppData\Local\Temp\tmproqjsx49\main.c:10:
In file included from C:\Users\olivi\anaconda3\envs\fish-speech\Include\Python.h:118:
C:\Users\olivi\anaconda3\envs\fish-speech\Include/cpython/pytime.h:120:59: warning: declaration of 'struct timeval' will
not be visible outside of this function [-Wvisibility]
120 | PyAPI_FUNC(int) _PyTime_FromTimeval(_PyTime_t *tp, struct timeval *tv);
| ^
C:\Users\olivi\anaconda3\envs\fish-speech\Include/cpython/pytime.h:127:12: warning: declaration of 'struct timeval' will
not be visible outside of this function [-Wvisibility]
127 | struct timeval *tv,
| ^
C:\Users\olivi\anaconda3\envs\fish-speech\Include/cpython/pytime.h:132:12: warning: declaration of 'struct timeval' will
not be visible outside of this function [-Wvisibility]
132 | struct timeval *tv,
| ^
C:\Users\olivi\AppData\Local\Temp\tmproqjsx49\main.c:21:3: warning: 'strcat' is deprecated: This function or variable
may be unsafe. Consider using strcat_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online
help for details. [-Wdeprecated-declarations]
21 | strcat(err, prefix);
| ^
C:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\ucrt\string.h:91:5: note: 'strcat' has been explicitly
marked deprecated here
91 | __DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1(
| ^
C:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\ucrt\corecrt.h:835:5: note: expanded from macro
'__DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1'
835 | __DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1_EX(_ReturnType, _ReturnPolicy, _DeclSpec, _FuncName, _FuncName##_s, ...
| ^
C:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\ucrt\corecrt.h:1894:17: note: expanded from macro
'__DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1_EX'
1894 | _CRT_INSECURE_DEPRECATE(_SecureFuncName) _DeclSpec _ReturnType __cdecl _FuncName(_SalAttributeDs...
| ^
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.41.34120\include\vcruntime.h:356:55: note:
expanded from macro '_CRT_INSECURE_DEPRECATE'
356 | #define _CRT_INSECURE_DEPRECATE(_Replacement) _CRT_DEPRECATE_TEXT(
| ^
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.41.34120\include\vcruntime.h:346:47: note:
expanded from macro '_CRT_DEPRECATE_TEXT'
346 | #define _CRT_DEPRECATE_TEXT(_Text) __declspec(deprecated(_Text))
| ^
C:\Users\olivi\AppData\Local\Temp\tmproqjsx49\main.c:22:3: warning: 'strcat' is deprecated: This function or variable
may be unsafe. Consider using strcat_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online
help for details. [-Wdeprecated-declarations]
22 | strcat(err, str);
| ^
C:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\ucrt\string.h:91:5: note: 'strcat' has been explicitly
marked deprecated here
91 | __DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1(
| ^
C:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\ucrt\corecrt.h:835:5: note: expanded from macro
'__DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1'
835 | __DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1_EX(_ReturnType, _ReturnPolicy, _DeclSpec, _FuncName, _FuncName##_s, ...
| ^
C:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\ucrt\corecrt.h:1894:17: note: expanded from macro
'__DEFINE_CPP_OVERLOAD_STANDARD_FUNC_0_1_EX'
1894 | _CRT_INSECURE_DEPRECATE(_SecureFuncName) _DeclSpec _ReturnType __cdecl _FuncName(_SalAttributeDs...
| ^
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.41.34120\include\vcruntime.h:356:55: note:
expanded from macro '_CRT_INSECURE_DEPRECATE'
356 | #define _CRT_INSECURE_DEPRECATE(_Replacement) _CRT_DEPRECATE_TEXT(
| ^
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.41.34120\include\vcruntime.h:346:47: note:
expanded from macro '_CRT_DEPRECATE_TEXT'
346 | #define _CRT_DEPRECATE_TEXT(_Text) __declspec(deprecated(_Text))
| ^
5 warnings generated.
Creating library C:\Users\olivi\AppData\Local\Temp\tmproqjsx49\cuda_utils.cp310-win_amd64.lib and object C:\Users\olivi\AppData\Local\Temp\tmproqjsx49\cuda_utils.cp310-win_amd64.exp
C:\Users\olivi\anaconda3\envs\fish-speech\lib\site-packages\triton\backends\nvidia\bin\ptxas.exe C:\Users\olivi\anaconda3\envs\fish-speech\lib\site-packages\triton\backends\nvidia\bin\ptxas.exe
Switch to Linux and enable compile, and you will get much faster inference :)
My whole project is on Windows! Not my choice :) Maybe I could use Docker on Windows?
Yes, you can also use WSL (Windows Subsystem for Linux) to run this repo. Compile is available in WSL.
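For anyone scripting the launch, here is a minimal sketch of that advice: it adds `--compile` only when the script runs on Linux, which includes WSL. The function names `is_wsl` and `build_command` are hypothetical helpers, not part of this repo, and the "microsoft" check on the kernel release string is a common heuristic rather than an official API. The command itself is the one from the original report.

```python
import platform
import shlex

# Base command copied from the original report (without --compile).
BASE_COMMAND = [
    "python", "tools/run_webui.py",
    "--llama-checkpoint-path", "checkpoints/fish-speech-1.5",
    "--decoder-checkpoint-path",
    "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth",
]


def is_wsl(system: str, release: str) -> bool:
    """Heuristic (assumption): WSL kernels report 'microsoft' in their release string."""
    return system == "Linux" and "microsoft" in release.lower()


def build_command(system=None, release=None):
    """Return the launch command, adding --compile only on Linux (including WSL).

    Follows the advice in this thread; it does not probe whether compilation
    would actually succeed on the current machine.
    """
    system = system or platform.system()
    release = release or platform.release()
    command = list(BASE_COMMAND)
    if system == "Linux":
        command.append("--compile")
    return command


if __name__ == "__main__":
    note = " (WSL detected)" if is_wsl(platform.system(), platform.release()) else ""
    print(shlex.join(build_command()) + note)
```

Run from inside the WSL distribution, this prints the command with `--compile` appended; run from native Windows, it prints the command without the flag.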
I'll try one of these, thanks.
"Compile is available in WSL"
Is there documentation or an explanation? I can't find it in the repo.
Probably not. When I try to run with the compile flag: python tools/run_webui.py --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --compile
I get this error:
These were not errors, just logs.
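One quick way to check this on a pasted log is to count compiler diagnostics by severity. The sketch below assumes clang-style `warning:`, `note:`, and `error:` markers, as seen in the output above. The helper `count_diagnostics` is a hypothetical name, and the sample lines are shortened excerpts from the report.

```python
import re
from collections import Counter

# Matches clang-style severities such as "file.h:12:3: warning: ...".
# Python-side messages like "FutureWarning:" are deliberately not matched.
DIAGNOSTIC = re.compile(r"\b(fatal error|error|warning|note):")


def count_diagnostics(log_text: str) -> Counter:
    """Count the first compiler diagnostic marker found on each log line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DIAGNOSTIC.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "pytime.h:120:59: warning: declaration of 'struct timeval' will",
        "main.c:21:3: warning: 'strcat' is deprecated: This function or variable",
        "string.h:91:5: note: 'strcat' has been explicitly",
        "5 warnings generated.",
    ])
    counts = count_diagnostics(sample)
    print(dict(counts))
    print("contains errors:", counts["error"] + counts["fatal error"] > 0)
```

If the error count is zero, as it is for these excerpts, the compiler output alone does not indicate a failed build.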
Works like a charm in WSL running Ubuntu with your Linux instructions, thanks.