ChatGLM-6B
Question: Is there any difference between ChatGLM-6B-INT8 and load(ChatGLM-6B).quantize(8)?
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
Is there any difference between ChatGLM-6B-INT8 and load(ChatGLM-6B).quantize(8)?
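For reference, the two loading paths being compared look roughly like this (a sketch based on the usage shown in the ChatGLM-6B README; THUDM/chatglm-6b-int8 is the pre-quantized checkpoint published on Hugging Face):

```python
from transformers import AutoModel

# Option A: download the full FP16 checkpoint (~13 GB), then quantize
# the weights to INT8 at load time via ChatGLM's custom quantize() method.
model_a = (
    AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    .half()
    .quantize(8)
    .cuda()
)

# Option B: download the pre-quantized INT8 checkpoint directly;
# the download is much smaller, no quantization step needed at load time.
model_b = (
    AutoModel.from_pretrained("THUDM/chatglm-6b-int8", trust_remote_code=True)
    .half()
    .cuda()
)
```

As far as I can tell, the main practical difference is download size and the work done at load time: the INT8 checkpoint ships already-quantized weights, while quantize(8) fetches the full FP16 weights and quantizes them on the fly, so the resulting INT8 footprint in GPU memory should be comparable.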
Expected Behavior
No response
Steps To Reproduce
None
Environment
None
Anything else?
No response
What model were you trying to use?
Same problem for me: vicuna-13b-GPTQ-4bit-128g, Windows 10, 8700, 2080 Ti.
Which version of GPTQ are you using? oobabooga's or latest?
oobabooga's
@Zach9113
python -m pip install https://github.com/jllllll/GPTQ-for-LLaMa-Wheels/raw/main/quant_cuda-0.0.0-cp310-cp310-win_amd64.whl --force-reinstall
Also, make sure that you have the cuda branch of the GPTQ repo.
git clone https://github.com/oobabooga/GPTQ-for-LLaMa -b cuda
You should already have it, as oobabooga removed the other branch, but check to be sure.
There was an update to the GPTQ code in the webui recently. Make sure your webui is updated. The new code may require you to switch to latest GPTQ. I haven't updated yet myself, so I don't know.
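If you want to confirm the wheel actually landed in the webui's environment, one quick sanity check (run from the same window after the install; this is just a plain Python import test, nothing GPTQ-specific):

python -c "import quant_cuda; print(quant_cuda.__file__)"

If that prints a path instead of raising an ImportError, the extension is installed.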
Where do I run that command? In the repositories folder?
I ran that command and it didn't change anything:
File "C:\AI\oobabooga-windows\text-generation-webui\repositories\GPTQ-for-LLaMa\quant.py", line 426, in forward
quant_cuda.vecquant4matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)
NameError: name 'quant_cuda' is not defined
@Zach9113 It needs to be entered after opening the cmd.bat script. That will allow you to modify the virtual environment that the webui is installed with. I don't know if that command will fix the issue. As I said before, the GPTQ code in the webui was changed recently and I haven't had time to test anything.
Edit: I just re-installed and everything is working for me. Looking at the code in quant.py, I don't see why you would get that error. If quant_cuda is missing, then you would get a different error. If quant_cuda is loaded, then you shouldn't get that error at all. My knowledge of Python simply isn't good enough to know what the issue is.
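One hedged guess at the mechanism, for anyone following along: the "CUDA extension not installed." message suggests quant.py wraps the import in a try/except at module level, roughly like this sketch (paraphrased, not the exact file contents):

```python
# Sketch of the guarded-import pattern the log messages suggest.
# If the compiled extension failed to build, the import raises, only a
# warning is printed, and the name quant_cuda is never bound.
try:
    import quant_cuda
except ImportError:
    print('CUDA extension not installed.')

# Later, inside forward(), the first use of the unbound name raises
# exactly the reported error:
#   NameError: name 'quant_cuda' is not defined
```

That would reconcile both symptoms: the warning at startup, and the NameError only appearing once generation actually calls into the quantized matmul.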
@jllllll When I reinstalled everything there was an error with CUDA, and it said it was set to 0.0.0. I'm at work right now; when I get off I'm going to start from scratch.

C:\AI\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\command\easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running bdist_egg
running egg_info
creating quant_cuda.egg-info
writing quant_cuda.egg-info\PKG-INFO
writing dependency_links to quant_cuda.egg-info\dependency_links.txt
writing top-level names to quant_cuda.egg-info\top_level.txt
writing manifest file 'quant_cuda.egg-info\SOURCES.txt'
reading manifest file 'quant_cuda.egg-info\SOURCES.txt'
writing manifest file 'quant_cuda.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_ext
error: [WinError 2] The system cannot find the file specified

When I do get the webui running, it tells me: CUDA extension not installed.
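(For what it's worth, an error: [WinError 2] during running build_ext usually means the build step couldn't launch a compiler it expected to find on PATH, typically cl.exe from the Visual Studio Build Tools or CUDA's nvcc; that would also explain the fallback to the prebuilt wheel. Two quick checks from the same cmd window:)

where cl
nvcc --version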
Same for me now, after doing everything.
Same for me. During install I get this:
RuntimeError: Error compiling objects for extension
CUDA kernel compilation failed.
Attempting installation with wheel.
Collecting quant-cuda==0.0.0
Using cached https://github.com/jllllll/GPTQ-for-LLaMa-Wheels/raw/main/quant_cuda-0.0.0-cp310-cp310-win_amd64.whl (398 kB)
I confirmed that the repo branch being cloned is in fact: git clone https://github.com/oobabooga/GPTQ-for-LLaMa -b cuda
but when trying to start up, I too see the "CUDA extension not installed." message:
bin Z:\oobabooga\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cuda117.dll
Loading anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g...
CUDA extension not installed.
Found the following quantized model: models\anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g\gpt-x-alpaca-13b-native-4bit-128g-cuda.pt
Loading model ...
Done.
When trying to interact, I see similar errors:
File "Z:\oobabooga\text-generation-webui\repositories\GPTQ-for-LLaMa\quant.py", line 426, in forward
quant_cuda.vecquant4matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)
NameError: name 'quant_cuda' is not defined
Output generated in 0.31 seconds (0.00 tokens/s, 0 tokens, context 35, seed 1592413025)
Been trying to resolve this for weeks now across several versions of textGenUI and the one-click installer.
RTX 3090, system with 64 GB RAM.
Oh hell yes, the wheel install command above just fixed all the problems I have been having with oobabooga. Before this I had one model kind of working, but it was slow and didn't even seem to know it was an AI; it got really weird and told me it was impossible that it lived in a folder on a PC, lol. Not sure how I did that one.
This fixed it for me too. My generation speed is also a lot faster now with the quantized models. Just running that command in cmd_windows.bat was enough for me.
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.