text-generation-webui
How to set the number of CPU threads to use?
I am running an LLM on the CPU and, predictably, it is slow, but it also uses only 15-20% of the CPU. What might be wrong?
Transformers/Accelerate does this in CPU mode too? Ouch.
Edit: Hey, this is a shot in the dark, but did you try https://github.com/zphang/transformers.git@68d640f7c368bcaaaecfc678f11908ebbd3d6176
That transformers fork used multiple cores for me during GPTQ and RWKV loading.
pip uninstall transformers
pip install git+https://github.com/zphang/transformers.git@3884da12ce327667d4df5101aef3533cc32be61f
To go back:
pip uninstall transformers
pip install git+https://github.com/huggingface/transformers
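Separately from swapping the transformers build, here is a minimal sketch of how the thread count is commonly set when the model runs through PyTorch on CPU. The helper name `configure_cpu_threads` is hypothetical, and nothing in this thread confirms that it fixes the low utilization described above. `torch.set_num_threads` and the `OMP_NUM_THREADS` / `MKL_NUM_THREADS` variables are standard, but treat their effect on your setup as an assumption to check.

```python
import os


def configure_cpu_threads(requested=None):
    """Hypothetical helper: choose a thread count and apply it to CPU backends."""
    available = os.cpu_count() or 1
    if requested is None:
        threads = available
    else:
        # Clamp the request to the range 1..available logical cores.
        threads = max(1, min(requested, available))

    # OpenMP and MKL read these variables when they are first loaded,
    # so call this before importing torch or numpy for them to take effect.
    os.environ["OMP_NUM_THREADS"] = str(threads)
    os.environ["MKL_NUM_THREADS"] = str(threads)

    try:
        import torch
    except ImportError:
        # PyTorch is not installed; only the environment variables were set.
        return threads

    # Sets the number of threads PyTorch uses for intra-op parallelism on CPU.
    torch.set_num_threads(threads)
    return threads


if __name__ == "__main__":
    print("Using", configure_cpu_threads(), "CPU threads")
```

Call it at the very top of the launch script, before any model code is imported. Using every logical core is not always fastest, so it may be worth trying the physical core count as well.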
Oh, thank you, I will try it.
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.