text-generation-webui
Installing GPTQ-for-LLaMa and quant_cuda
Describe the bug
Installing quant_cuda always leads to a bug. I recently tried to do it inside an x64 Native Tools Command Prompt for VS 2019, but I got an error.
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
Run the install for GPTQ-for-LLaMa.
Screenshot
No response
Logs
C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer, pypa/build or
other standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
self.initialize_options()
C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` and ``easy_install``.
Instead, use pypa/build, pypa/installer, pypa/build or
other standards-based tools.
See https://github.com/pypa/setuptools/issues/917 for details.
********************************************************************************
!!
self.initialize_options()
running bdist_egg
running egg_info
writing quant_cuda.egg-info\PKG-INFO
writing dependency_links to quant_cuda.egg-info\dependency_links.txt
writing top-level names to quant_cuda.egg-info\top_level.txt
reading manifest file 'quant_cuda.egg-info\SOURCES.txt'
writing manifest file 'quant_cuda.egg-info\SOURCES.txt'
installing library code to build\bdist.win32\egg
running install_lib
running build_ext
Traceback (most recent call last):
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\text-generation-webui\repositories\GPTQ-for-LLaMa\setup_cuda.py", line 4, in <module>
setup(
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
return run_commands(dist)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
dist.run_commands()
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\dist.py", line 969, in run_commands
self.run_command(cmd)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\dist.py", line 1244, in run_command
super().run_command(command)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
cmd_obj.run()
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\command\install.py", line 80, in run
self.do_egg_install()
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\command\install.py", line 129, in do_egg_install
self.run_command('bdist_egg')
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\dist.py", line 1244, in run_command
super().run_command(command)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
cmd_obj.run()
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\command\bdist_egg.py", line 164, in run
cmd = self.call_command('install_lib', warn_dir=0)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\command\bdist_egg.py", line 150, in call_command
self.run_command(cmdname)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\dist.py", line 1244, in run_command
super().run_command(command)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
cmd_obj.run()
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\command\install_lib.py", line 11, in run
self.build()
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\command\install_lib.py", line 111, in build
self.run_command('build_ext')
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\dist.py", line 1244, in run_command
super().run_command(command)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
cmd_obj.run()
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
_build_ext.run(self)
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 345, in run
self.build_extensions()
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\torch\utils\cpp_extension.py", line 485, in build_extensions
compiler_name, compiler_version = self._check_abi()
File "C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\torch\utils\cpp_extension.py", line 875, in _check_abi
raise UserWarning(msg)
UserWarning: It seems that the VC environment is activated but DISTUTILS_USE_SDK is not set.This may lead to multiple activations of the VC env.Please set `DISTUTILS_USE_SDK=1` and try again.
Done!
Press any key to continue . . .
PS C:\Users\Nick\source\repos>
System Info
3060 and 32gb ram
Powershell:
$env:DISTUTILS_USE_SDK = 1
CMD:
SET DISTUTILS_USE_SDK=1
Run that command just before running python setup_cuda.py install
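If the build is driven from a Python script rather than the shell, the same variable can be set via `os.environ`. This is only a minimal sketch of what the shell commands above accomplish; setting it in the parent shell, as shown, works just as well:

```python
import os

# Equivalent of `SET DISTUTILS_USE_SDK=1`: tells torch's extension builder
# that the VC environment is already activated, so it should not try to
# activate it again. Must be set before the build starts.
os.environ["DISTUTILS_USE_SDK"] = "1"

print(os.environ["DISTUTILS_USE_SDK"])  # -> 1
```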
Just tried it, but I get this error:

"Successfully installed accelerate-0.18.0 datasets-2.10.1 safetensors-0.3.0 transformers-4.28.0
Processing c:\users\nick\desktop\5_10_webui\oobabooga_windows\oobabooga_windows\text-generation-webui\repositories\gptq-for-llama
Preparing metadata (setup.py) ... done
Building wheels for collected packages: quant-cuda
Building wheel for quant-cuda (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [4 lines of output]
running bdist_wheel
running build
running build_ext
error: [WinError 2] The system cannot find the file specified
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for quant-cuda
Running setup.py clean for quant-cuda
Failed to build quant-cuda
ERROR: Could not build wheels for quant-cuda, which is required to install pyproject.toml-based projects
Done! Press any key to continue . . ."

I tried it three different ways: 1) by redoing the one-click installer inside the tools, 2) by updating it inside the tools, 3) by updating it as administrator.
I recently saw a similar error and fixed it with this: conda install cuda -c nvidia/label/cuda-11.7.0
It doesn't help that pip isn't reporting what file it failed to find.
:( Still the same error after attempting that:

"Successfully installed accelerate-0.18.0 datasets-2.10.1 safetensors-0.3.0 transformers-4.28.0
Processing c:\users\nick\desktop\5_10_webui\oobabooga_windows\oobabooga_windows\text-generation-webui\repositories\gptq-for-llama
Preparing metadata (setup.py) ... done
Building wheels for collected packages: quant-cuda
Building wheel for quant-cuda (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [4 lines of output]
running bdist_wheel
running build
running build_ext
error: [WinError 2] The system cannot find the file specified
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for quant-cuda
Running setup.py clean for quant-cuda
Failed to build quant-cuda
ERROR: Could not build wheels for quant-cuda, which is required to install pyproject.toml-based projects
Done! Press any key to continue . . ."
If you're okay with using somebody else's build of quant_cuda, then you can open cmd_windows.bat and enter this command:
python -m pip install https://github.com/jllllll/GPTQ-for-LLaMa-Wheels/raw/main/quant_cuda-0.0.0-cp310-cp310-win_amd64.whl --force-reinstall
There are just too many things that can go wrong when compiling gptq. It is hard to know what is going on here without a more detailed error message. If you want to continue trying to compile gptq yourself, add -v to the setup command like so:
python -v setup_cuda.py install
That should hopefully provide more details.
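One aside about that prebuilt wheel: the cp310 tag in its filename means it only installs on CPython 3.10, which matters for anyone running a different interpreter. A small illustrative check (the `wheel_matches` helper is hypothetical, written only for this example):

```python
def wheel_matches(wheel_name: str, major: int, minor: int) -> bool:
    """True if the wheel's cpXY tag matches the given CPython version."""
    tag = f"cp{major}{minor}-cp{major}{minor}"
    return tag in wheel_name

wheel = "quant_cuda-0.0.0-cp310-cp310-win_amd64.whl"
print(wheel_matches(wheel, 3, 10))  # -> True
print(wheel_matches(wheel, 3, 12))  # -> False: pip would refuse this wheel
```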
import 'distutils.command.install_scripts' # <_frozen_importlib_external.SourceFileLoader object at 0x0000025B508BE0B0>
import 'setuptools.command.install_scripts' # <_frozen_importlib_external.SourceFileLoader object at 0x0000025B508BFAC0>
running bdist_egg
running egg_info
writing quant_cuda.egg-info\PKG-INFO
writing dependency_links to quant_cuda.egg-info\dependency_links.txt
writing top-level names to quant_cuda.egg-info\top_level.txt
reading manifest file 'quant_cuda.egg-info\SOURCES.txt'
writing manifest file 'quant_cuda.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_ext
error: [WinError 2] Le fichier spécifié est introuvable (the specified file could not be found)
# clear builtins._
These are the logs I've been getting in verbose mode. You can find the full file in the attachment.
I have the same problem.
When I used the verbose command it has showed me this error:
python: can't open file 'I:\\AI\\oobabooga\\setup_cuda.py': [Errno 2] No such file or directory
That's not the same issue. Your issue means that you haven't cd repositories/GPTQ-for-LLaMa
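A quick way to confirm the working directory before invoking the build; `can_build` here is a hypothetical helper written for illustration, not part of the repo:

```python
import tempfile
from pathlib import Path

def can_build(repo_dir) -> bool:
    """True only if setup_cuda.py exists in repo_dir, i.e. we are inside
    the GPTQ-for-LLaMa checkout rather than a parent directory."""
    return (Path(repo_dir) / "setup_cuda.py").is_file()

# Demonstrate with a throwaway directory standing in for the checkout:
with tempfile.TemporaryDirectory() as d:
    print(can_build(d))                             # -> False (wrong directory)
    (Path(d) / "setup_cuda.py").write_text("# stub")
    print(can_build(d))                             # -> True (correct directory)
```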
Also, when installing from your wheel:
File "D:\09._AI_projects\openassistant\textgen\lib\site-packages\transformers\models\llama\modeling_llama.py", line 196, in forward
query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
File "D:\09._AI_projects\openassistant\textgen\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\09._AI_projects\openassistant\text-generation-webui\repositories\GPTQ-for-LLaMa\quant.py", line 426, in forward
quant_cuda.vecquant4matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)
NameError: name 'quant_cuda' is not defined
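That NameError is consistent with an import guard swallowing a failed import of the compiled extension, leaving the name unbound until the first kernel call. A minimal, simplified reproduction (the module name here is a deliberately missing stand-in, not the project's actual code):

```python
try:
    import quant_cuda_stub as quant_cuda  # stands in for the compiled extension
except ImportError:
    pass  # a guard like this only warns, so the name is never bound

def call_kernel():
    # Any later use fails with NameError rather than a clearer ImportError.
    return quant_cuda.vecquant4matmul

try:
    call_kernel()
except NameError:
    print("quant_cuda never got bound: the wheel was not built or installed")
```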
I have the same problem. When I used the verbose command it has showed me this error:
python: can't open file 'I:\\AI\\oobabooga\\setup_cuda.py': [Errno 2] No such file or directory

That's not the same issue. Your issue means that you haven't cd repositories/GPTQ-for-LLaMa
Actually, it's there.
The thing is, I came to this issue from this one.
I encountered this problem when I ran "update_windows.bat".
...
Successfully installed accelerate-0.17.1 datasets-2.10.1 safetensors-0.3.0 transformers-4.30.0.dev0
Processing i:\ai\oobabooga\text-generation-webui\repositories\gptq-for-llama
Preparing metadata (setup.py) ... done
Building wheels for collected packages: quant-cuda
Building wheel for quant-cuda (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [7 lines of output]
running bdist_wheel
running build
running build_ext
I:\AI\oobabooga\installer_files\env\lib\site-packages\torch\utils\cpp_extension.py:359: UserWarning: Error checking compiler version for cl: [WinError 2] Sistem belirtilen dosyayı bulamıyor (the system cannot find the specified file)
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'quant_cuda' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for quant-cuda
Running setup.py clean for quant-cuda
Failed to build quant-cuda
Installing collected packages: quant-cuda
Running setup.py install for quant-cuda ... error
error: subprocess-exited-with-error
× Running setup.py install for quant-cuda did not run successfully.
│ exit code: 1
╰─> [9 lines of output]
running install
I:\AI\oobabooga\installer_files\env\lib\site-packages\setuptools\command\install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
running build_ext
I:\AI\oobabooga\installer_files\env\lib\site-packages\torch\utils\cpp_extension.py:359: UserWarning: Error checking compiler version for cl: [WinError 2] Sistem belirtilen dosyayı bulamıyor (the system cannot find the specified file)
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'quant_cuda' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> quant-cuda
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
ERROR: GPTQ CUDA kernel compilation failed.
Attempting installation with wheel.
Collecting quant-cuda==0.0.0
Using cached https://github.com/jllllll/GPTQ-for-LLaMa-Wheels/raw/main/quant_cuda-0.0.0-cp310-cp310-win_amd64.whl (398 kB)
Installing collected packages: quant-cuda
Successfully installed quant-cuda-0.0.0
Wheel installation success!
Continuing with install..
Done!
Press any key to continue . . .
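Both compiler errors in the log above ([WinError 2] during the compiler version check, then the "Microsoft Visual C++ 14.0 or greater is required" message) point at MSVC's cl.exe not being reachable from the build environment. A quick diagnostic sketch; the result depends on the machine, so no fixed output is shown:

```python
import shutil

# torch's extension builder invokes cl.exe on Windows; if it's not on PATH,
# the build fails exactly as in the log above.
cl_path = shutil.which("cl")
if cl_path is None:
    print("cl.exe not found - open an x64 Native Tools prompt or install Build Tools")
else:
    print(f"compiler found at: {cl_path}")
```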
This problem still persists when I try to update.
However, I remember how I got here: my original problem was that my Vicuna 13B GPTQ 4-bit 128g model was producing gibberish results, as in here. I found a solution for the gibberish results on this post.
Now I'm getting this kind of result when I try to generate text:
Traceback (most recent call last):
File "I:\AI\oobabooga\text-generation-webui\modules\callbacks.py", line 73, in gentask
ret = self.mfunc(callback=_callback, **self.kwargs)
File "I:\AI\oobabooga\text-generation-webui\modules\text_generation.py", line 259, in generate_with_callback
shared.model.generate(**kwargs)
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 1565, in generate
return self.sample(
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 2612, in sample
outputs = self(
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 688, in forward
outputs = self.model(
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 578, in forward
layer_outputs = decoder_layer(
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 293, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 197, in forward
query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
File "I:\AI\oobabooga\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "I:\AI\oobabooga\text-generation-webui\repositories\GPTQ-for-LLaMa\quant.py", line 279, in forward
quant_cuda.vecquant4matmul(x.float(), self.qweight, out, self.scales.float(), self.qzeros, self.g_idx)
TypeError: vecquant4matmul(): incompatible function arguments. The following argument types are supported:
1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: torch.Tensor, arg3: torch.Tensor, arg4: torch.Tensor, arg5: int) -> None
Invoked with: tensor([[ 0.0096, -0.0412, 0.2666, ..., -0.0141, 0.0032, 0.0063],
[ 0.0107, -0.0175, 0.0251, ..., 0.0078, -0.0395, -0.0337],
[ 0.0118, -0.0182, 0.0223, ..., 0.0210, 0.0105, 0.0080],
...,
[ 0.0118, -0.0182, 0.0223, ..., 0.0210, 0.0105, 0.0080],
[ 0.0219, -0.0059, 0.0252, ..., -0.0189, 0.0675, 0.0023],
[-0.0047, -0.0214, -0.0582, ..., 0.0385, -0.0210, 0.0089]],
device='cuda:0'), tensor([[-1398026309, 1248435898, 1968657271, ..., 1648788836,
1503146616, 1432982596],
[-1129530164, -1418999416, 1702123094, ..., 2016756323,
900172105, -2007726747],
[ -876888900, -1735723655, 1717986149, ..., -1236974524,
1117231658, -1988667224],
...,
[-1952601429, 444958643, -2041367257, ..., -1163093398,
571234629, -357917764],
[-1181108071, -1685570875, 1466185092, ..., -1394910790,
1094095909, -342209878],
[ 1449556888, -1113172008, -2025167501, ..., -2122341162,
1770412358, -362050885]], device='cuda:0', dtype=torch.int32), tensor([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], device='cuda:0'), tensor([[0.0144, 0.0081, 0.0102, ..., 0.0185, 0.0108, 0.0130],
[0.0068, 0.0054, 0.0059, ..., 0.0164, 0.0124, 0.0091],
[0.0116, 0.0083, 0.0120, ..., 0.0273, 0.0141, 0.0108],
...,
[0.0121, 0.0061, 0.0123, ..., 0.0220, 0.0115, 0.0122],
[0.0140, 0.0065, 0.0158, ..., 0.0162, 0.0131, 0.0128],
[0.0079, 0.0063, 0.0077, ..., 0.0153, 0.0110, 0.0111]],
device='cuda:0'), tensor([[-2003139190, 1011225701, -1557841773, ..., 1279628968,
-2103822205, 1447265365],
[ 2002999416, 1783731783, 1698191720, ..., 1970632597,
-2055768200, 1434806901],
[-2023401052, 677541209, -1808521078, ..., 1431726443,
1937258634, 1750898759],
...,
[ 1436066920, 1501980213, -1807276171, ..., 1985377879,
1446410390, 1737909111],
[ 1967336307, 1197582408, 1949653097, ..., -1839765626,
2004449145, 1986558054],
[ 2022204536, 1451832950, 1485342342, ..., 1989634663,
1734899351, -2055838345]], device='cuda:0', dtype=torch.int32), tensor([ 0, 0, 0, ..., 39, 39, 39], device='cuda:0', dtype=torch.int32)
Output generated in 0.25 seconds (0.00 tokens/s, 0 tokens, context 45, seed 283501094)
After two fresh installs and many trials, I've gotten lost here.
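For what it's worth, that TypeError reads like a version mismatch: the compiled kernel's sixth argument is a plain int (the old groupsize API), while the checked-out quant.py passes a g_idx tensor (the newer API). A simplified Python model of the binding mismatch; the real check happens in the compiled pybind layer, not in Python:

```python
def vecquant4matmul(x, qweight, out, scales, qzeros, groupsize):
    # Models the old kernel's signature: arg5 must be an int groupsize.
    if not isinstance(groupsize, int):
        raise TypeError("incompatible function arguments: arg5 must be int")

vecquant4matmul(0, 0, 0, 0, 0, 128)          # old-style call: accepted
try:
    vecquant4matmul(0, 0, 0, 0, 0, [0, 39])  # g_idx-style call: rejected
except TypeError:
    print("mismatch: rebuild quant_cuda from the same GPTQ-for-LLaMa revision")
```

If this is the cause, rebuilding quant_cuda from the same GPTQ-for-LLaMa commit as the quant.py in use (or installing a wheel built from it) should resolve it.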
Not going to quote the whole reply, but: you also need to install the Visual C++ 2019 Build Tools.
Thank you so much! The C++ Build Tools helped me build the bdist wheel 😀
I have the build tools; that's why I'm so confused as to why it's not working.
I literally fully reset my Windows and deleted everything, reinstalled VS Pro 2019 with the C++ build tools, added it to my environment variables, and it finally worked. I have no clue what the exact problem was, but I think my download of VS 2019 was corrupt, or I had multiple installs and that messed it up.
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.
Yup:
error: [WinError 2] The system cannot find the file specified
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for bluepy
Failed to build bluepy
ERROR: Could not build wheels for bluepy, which is required to install pyproject.toml-based projects

Same issue on Windows 10 with Python 3.12. Any fix?