text-generation-webui
no module named monkeypatch
Describe the bug
During training of 4-bit models I am getting "no module named monkeypatch". I tried installing it from PyPI and from the project's GitHub page, but the setup.py gives an error. How do I install monkeypatch?
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
python server.py --monkey-patch
Screenshot
No response
Logs
no module named monkeypatch
System Info
Dell G7
GPU: GTX 1060 Mobile, 6 GB
Same
Have you followed this?
https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md#using-loras-in-4-bit-mode
Open cmd_windows.bat, then run pip install monkeypatch. You have to install the model after you have updated and installed the 4-bit LoRA through a VS 2019 shell. (You will probably run into a building-wheel error; try these commands in the VS shell before you run the setup.py.)
Powershell:
$env:DISTUTILS_USE_SDK = 1
CMD:
SET DISTUTILS_USE_SDK=1
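For anyone driving the build from a Python script rather than a shell, the same flag can be set via `os.environ` before the build starts (a sketch; the variable only affects the current process and its children):

```python
import os

# Same effect as `SET DISTUTILS_USE_SDK=1` (CMD) or
# `$env:DISTUTILS_USE_SDK = 1` (PowerShell): tell distutils to use the
# MSVC toolchain already configured by the VS developer shell instead of
# trying to locate one itself. Must be set before distutils reads it.
os.environ["DISTUTILS_USE_SDK"] = "1"
```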
Successfully installed accelerate-0.18.0 datasets-2.10.1 safetensors-0.3.0 transformers-4.28.0
running install
C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or
other standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
        self.initialize_options()
C:\Users\Nick\Desktop\4-29-23\oobabooga_windows\installer_files\env\lib\site-packages\setuptools\_distutils\cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` and ``easy_install``.
Instead, use pypa/build, pypa/installer or
other standards-based tools.
See https://github.com/pypa/setuptools/issues/917 for details.
********************************************************************************
!!
        self.initialize_options()
running bdist_egg
running egg_info
writing quant_cuda.egg-info\PKG-INFO
writing dependency_links to quant_cuda.egg-info\dependency_links.txt
writing top-level names to quant_cuda.egg-info\top_level.txt
reading manifest file 'quant_cuda.egg-info\SOURCES.txt'
writing manifest file 'quant_cuda.egg-info\SOURCES.txt'
installing library code to build\bdist.win32\egg
running install_lib
running build_ext
error: [WinError 2] The system cannot find the file specified
Done! Press any key to continue . . .
I did everything and set SET DISTUTILS_USE_SDK=1, but I get this:
Make sure to follow the instructions here:
https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md#using-loras-in-4-bit-mode
Does installing monkeypatch break the normal LoRA training in 8-bit? I'm asking because I would love to experiment with this without breaking what works right now. Cheers! (This repo rocks!)
Make sure you update your locally installed files by typing into cmd:
git pull https://github.com/oobabooga/text-generation-webui/
Once that is done, go into the text-generation-webui folder and type cmd into the address bar to open cmd, then type:
pip install -r requirements.txt
That seems to have solved it for me as of May 21st.
I am also getting the "no module named monkeypatch" error. I even tried to pip install it through the x86 native command-prompt tools, but that still wouldn't work. I've been able to compile all the wheels required, so I don't think that is the issue. (I have updated the web UI and I have every wheel compiled and working, except this module.)
(C:\Users\PC\Desktop\oobabooga_windows\installer_files\env) C:\Users\PC\Desktop\oobabooga_windows>pip install monkeypatch --no-cache-dir
Collecting monkeypatch
Downloading monkeypatch-0.1rc3.zip (7.9 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [7 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "C:\Users\PC\Desktop\oobabooga_windows\installer_files\pip-install-wf1isevg\monkeypatch_6bfa9428e09a4590a233315bdfe4dc7d\setup.py", line 99
except ImportError, e:
^^^^^^^^^^^^^^
SyntaxError: multiple exception types must be parenthesized
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
(C:\Users\PC\Desktop\oobabooga_windows\installer_files\env) C:\Users\PC\Desktop\oobabooga_windows>
It's telling me the error is at line 90 in the setup.py. I downloaded the module and looked inside the file; at line 90 it says (lines 84-105):
    if use_distribute:
        from distribute_setup import use_setuptools
        use_setuptools(to_dir=dist)
        from setuptools import setup
    else:
        try:
            from setuptools import setup
        except ImportError:
            from distutils.core import setup
    if use_stdeb:
        import platform
        if 'debian' in platform.dist():
            try:
                import stdeb
            except ImportError, e:
                pass
    return setup

if __name__ == '__main__':
    main()
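The quoted handler is Python 2 code: `except ImportError, e:` was removed in Python 3, so pip's metadata step can only fail. A small demonstration of the incompatibility (the `stdeb` import here just mirrors the package's optional probe):

```python
# The Python 2 spelling of the handler, exactly as it appears in the
# package's setup.py. Under Python 3 this no longer even compiles,
# which is why `pip install monkeypatch` dies at metadata generation.
py2_handler = (
    "try:\n"
    "    import stdeb\n"
    "except ImportError, e:\n"  # Python 2 syntax
    "    pass\n"
)
try:
    compile(py2_handler, "<setup.py>", "exec")
    py2_syntax_rejected = False
except SyntaxError:
    py2_syntax_rejected = True  # Python 3 rejects `except X, e:`

# The Python 3 spelling (`except X as e:`) compiles fine:
py3_handler = py2_handler.replace("ImportError, e", "ImportError as e")
compile(py3_handler, "<setup.py>", "exec")
```

In other words, the monkeypatch package on PyPI is Python 2-only code and not installable as-is on Python 3, which matches the advice elsewhere in this thread to skip pip entirely and put the cloned repo on the path instead.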
Make sure to follow the instructions here for the gptq-for-llama lora monkey patch:
https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md#using-loras-with-gptq-for-llama
I followed the instructions exactly (like 15 times, because I thought something I did was wrong). I even wiped my entire PC and started from a blank Windows install with just the VS 2019 build tools. Everything but this works. It tells me no module named monkeypatch, and when I try to pip install it, I get that error at line 90 of the package's setup.py.
Follow the instructions below:
Create a repositories folder inside the text-generation-webui folder, then clone johnsmith0031/alpaca_lora_4bit into it:
cd text-generation-webui/repositories
git clone https://github.com/johnsmith0031/alpaca_lora_4bit
Now open server.py in any code editor and add this:
import sys
sys.path.append("path/to/cloned/repo")  # example: sys.path.append("text-generation-webui/repositories/alpaca_lora_4bit")
Install GPTQ-for-LLaMa with this command:
pip install git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit
Start the UI with the --monkey-patch flag:
python server.py --monkey-patch
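As a sketch of what that server.py edit is doing: the repo path below is hypothetical (point it at your actual clone), and `insert(0, ...)` is used here instead of `append` so the clone takes priority over anything in site-packages:

```python
import sys
from pathlib import Path

# Hypothetical location: replace with wherever alpaca_lora_4bit was cloned,
# e.g. text-generation-webui/repositories/alpaca_lora_4bit.
repo_dir = Path("repositories") / "alpaca_lora_4bit"
sys.path.insert(0, str(repo_dir))  # searched before site-packages

# With the repo on sys.path, `import monkeypatch` can resolve from the
# clone (assuming the clone actually provides that module) instead of
# failing with ModuleNotFoundError.
```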
I'm getting the same error as @NickWithBotronics, tried everything mentioned here and on the documentation.
ModuleNotFoundError: No module named 'monkeypatch'
I'm unable to install it through pip, exact error as already stated by Nick.
AutoGPTQ is going to add LoRA support soon. For now you need to install it from source using the commands here: https://github.com/PanQiWei/AutoGPTQ#install-from-source
After that, load a GPTQ model as usual (without --gptq-for-llama) and apply the LoRA. It only works for single LoRAs at the moment.
You're not required to install monkeypatch; just add the cloned repo to the system path.
I don't know what else to say, I've added the cloned repo to path, same error.
I also have the identical error on Linux when trying to LoRA-train with the monkey patch. I followed the documentation for 4-bit, cloning johnsmith0031 and sterlind, updated the web UI, and also blew away the existing gptq-for-llama repository and updated again (again according to the documentation). Still getting no module named monkeypatch, and exactly the same error when trying the pip install.
No problems with loading / inferencing models.
Linux Mint 21 RTX 3090
Update: cured by explicitly checking the "use gptq for llama" option when loading the model.
Trying to get the monkey patch installed so I can train a LoRA in 4-bit. I followed the instructions at https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md
Terminal output of install process and errors below:
ana@durga:~/oobabooga_linux/text-generation-webui/repositories$ git clone https://github.com/johnsmith0031/alpaca_lora_4bit
Cloning into 'alpaca_lora_4bit'...
remote: Enumerating objects: 1029, done.
remote: Counting objects: 100% (382/382), done.
remote: Compressing objects: 100% (125/125), done.
remote: Total 1029 (delta 301), reused 264 (delta 256), pack-reused 647
Receiving objects: 100% (1029/1029), 582.62 KiB | 4.38 MiB/s, done.
Resolving deltas: 100% (623/623), done.
ana@durga:~/oobabooga_linux/text-generation-webui/repositories$ pip install git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit
Defaulting to user installation because normal site-packages is not writeable
Collecting git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit
  Cloning https://github.com/sterlind/GPTQ-for-LLaMa.git (to revision lora_4bit) to /tmp/pip-req-build-t47y4u6m
  Running command git clone --filter=blob:none --quiet https://github.com/sterlind/GPTQ-for-LLaMa.git /tmp/pip-req-build-t47y4u6m
  Running command git checkout -b lora_4bit --track origin/lora_4bit
  Switched to a new branch 'lora_4bit'
  Branch 'lora_4bit' set up to track remote branch 'lora_4bit' from 'origin'.
  Resolved https://github.com/sterlind/GPTQ-for-LLaMa.git to commit 8bfa7a4a35e72ae853722dbfd2e4d88afc736536
  Preparing metadata (setup.py) ... done
Requirement already satisfied: torch in /home/ana/.local/lib/python3.10/site-packages (from gptq-llama==0.2.3) (2.0.0)
Requirement already satisfied: nvidia-cusparse-cu11==11.7.4.91 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (11.7.4.91)
Requirement already satisfied: sympy in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (1.11.1)
Requirement already satisfied: nvidia-cuda-nvrtc-cu11==11.7.99 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (11.7.99)
Requirement already satisfied: nvidia-cublas-cu11==11.10.3.66 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (11.10.3.66)
Requirement already satisfied: nvidia-cusolver-cu11==11.4.0.1 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (11.4.0.1)
Requirement already satisfied: typing-extensions in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (4.5.0)
Requirement already satisfied: networkx in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (3.1)
Requirement already satisfied: nvidia-cuda-cupti-cu11==11.7.101 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (11.7.101)
Requirement already satisfied: triton==2.0.0 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (2.0.0)
Requirement already satisfied: jinja2 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (3.1.2)
Requirement already satisfied: nvidia-cufft-cu11==10.9.0.58 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (10.9.0.58)
Requirement already satisfied: nvidia-cudnn-cu11==8.5.0.96 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (8.5.0.96)
Requirement already satisfied: nvidia-nccl-cu11==2.14.3 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (2.14.3)
Requirement already satisfied: nvidia-cuda-runtime-cu11==11.7.99 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (11.7.99)
Requirement already satisfied: nvidia-nvtx-cu11==11.7.91 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (11.7.91)
Requirement already satisfied: filelock in /usr/lib/python3/dist-packages (from torch->gptq-llama==0.2.3) (3.6.0)
Requirement already satisfied: nvidia-curand-cu11==10.2.10.91 in /home/ana/.local/lib/python3.10/site-packages (from torch->gptq-llama==0.2.3) (10.2.10.91)
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from nvidia-cublas-cu11==11.10.3.66->torch->gptq-llama==0.2.3) (59.6.0)
Requirement already satisfied: wheel in /usr/lib/python3/dist-packages (from nvidia-cublas-cu11==11.10.3.66->torch->gptq-llama==0.2.3) (0.37.1)
Requirement already satisfied: cmake in /home/ana/.local/lib/python3.10/site-packages (from triton==2.0.0->torch->gptq-llama==0.2.3) (3.26.3)
Requirement already satisfied: lit in /home/ana/.local/lib/python3.10/site-packages (from triton==2.0.0->torch->gptq-llama==0.2.3) (16.0.2)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/lib/python3/dist-packages (from jinja2->torch->gptq-llama==0.2.3) (2.0.1)
Requirement already satisfied: mpmath>=0.19 in /home/ana/.local/lib/python3.10/site-packages (from sympy->torch->gptq-llama==0.2.3) (1.3.0)
Building wheels for collected packages: gptq-llama
  Building wheel for gptq-llama (setup.py) ... error
  error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [111 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.10
creating build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/quant.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/init.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/llama.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/datautils.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/opt.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/modelutils.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/gptq.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/llama_inference.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/convert_llama_weights_to_hf.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/test_performance.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/llama_inference_offload.py -> build/lib.linux-x86_64-3.10/gptq_llama
package init file 'src/gptq_llama/quant_cuda/init.py' not found (or not a regular file)
running build_ext
/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:388: UserWarning: The detected CUDA version (11.5) has a minor version mismatch with the version that was used to compile PyTorch (11.7). Most likely this shouldn't be a problem.
warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
building 'gptq_llama.quant_cuda' extension
creating /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10
creating /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src
creating /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama
creating /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda
Emitting ninja build file /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ana/.local/lib/python3.10/site-packages/torch/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/TH -I/home/ana/.local/lib/python3.10/site-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-t47y4u6m/src/gptq_llama/quant_cuda/quant_cuda.cpp -o /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda.o -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="gcc"' '-DPYBIND11_STDLIB="libstdcpp"' '-DPYBIND11_BUILD_ABI="cxxabi1011"' -DTORCH_EXTENSION_NAME=quant_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17
[2/2] /usr/bin/nvcc -I/home/ana/.local/lib/python3.10/site-packages/torch/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/TH -I/home/ana/.local/lib/python3.10/site-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-t47y4u6m/src/gptq_llama/quant_cuda/quant_cuda_kernel.cu -o /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="gcc"' '-DPYBIND11_STDLIB="libstdcpp"' '-DPYBIND11_BUILD_ABI="cxxabi1011"' -DTORCH_EXTENSION_NAME=quant_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -std=c++17
FAILED: /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda_kernel.o
/usr/bin/nvcc -I/home/ana/.local/lib/python3.10/site-packages/torch/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/TH -I/home/ana/.local/lib/python3.10/site-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-t47y4u6m/src/gptq_llama/quant_cuda/quant_cuda_kernel.cu -o /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=quant_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -std=c++17
/home/ana/.local/lib/python3.10/site-packages/torch/include/c10/util/irange.h(54): warning #186-D: pointless comparison of unsigned integer with zero
detected during:
instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided,
/home/ana/.local/lib/python3.10/site-packages/torch/include/c10/util/irange.h(54): warning #186-D: pointless comparison of unsigned integer with zero
detected during:
instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator==(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=std::size_t, one_sided=true, <unnamed>=0]"
(61): here
instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator!=(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=std::size_t, one_sided=true, <unnamed>=0]"
/home/ana/.local/lib/python3.10/site-packages/torch/include/ATen/core/qualified_name.h(73): here
/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
435 | function(_Functor&& __f)
| ^
/usr/include/c++/11/bits/std_function.h:435:145: note: ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
530 | operator=(_Functor&& __f)
| ^
/usr/include/c++/11/bits/std_function.h:530:146: note: ‘_ArgTypes’
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
subprocess.run(
File "/usr/lib/python3.10/subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-req-build-t47y4u6m/setup.py", line 4, in <module>
setup(
File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/wheel/bdist_wheel.py", line 299, in run
self.run_command('build')
File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3.10/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
build_ext.build_extensions(self)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 202, in build_extension
_build_ext.build_extension(self, ext)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 529, in build_extension
objects = self.compiler.compile(sources,
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 658, in unix_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1574, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for gptq-llama
Running setup.py clean for gptq-llama
Failed to build gptq-llama
Installing collected packages: gptq-llama
  Running setup.py install for gptq-llama ... error
  error: subprocess-exited-with-error
× Running setup.py install for gptq-llama did not run successfully.
│ exit code: 1
╰─> [115 lines of output]
running install
/usr/lib/python3/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.10
creating build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/quant.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/init.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/llama.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/datautils.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/opt.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/modelutils.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/gptq.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/llama_inference.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/convert_llama_weights_to_hf.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/test_performance.py -> build/lib.linux-x86_64-3.10/gptq_llama
copying src/gptq_llama/llama_inference_offload.py -> build/lib.linux-x86_64-3.10/gptq_llama
package init file 'src/gptq_llama/quant_cuda/init.py' not found (or not a regular file)
running build_ext
/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py:388: UserWarning: The detected CUDA version (11.5) has a minor version mismatch with the version that was used to compile PyTorch (11.7). Most likely this shouldn't be a problem.
warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
building 'gptq_llama.quant_cuda' extension
creating /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10
creating /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src
creating /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama
creating /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda
Emitting ninja build file /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ana/.local/lib/python3.10/site-packages/torch/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/TH -I/home/ana/.local/lib/python3.10/site-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-t47y4u6m/src/gptq_llama/quant_cuda/quant_cuda.cpp -o /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda.o -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="gcc"' '-DPYBIND11_STDLIB="libstdcpp"' '-DPYBIND11_BUILD_ABI="cxxabi1011"' -DTORCH_EXTENSION_NAME=quant_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17
[2/2] /usr/bin/nvcc -I/home/ana/.local/lib/python3.10/site-packages/torch/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/TH -I/home/ana/.local/lib/python3.10/site-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-t47y4u6m/src/gptq_llama/quant_cuda/quant_cuda_kernel.cu -o /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="gcc"' '-DPYBIND11_STDLIB="libstdcpp"' '-DPYBIND11_BUILD_ABI="cxxabi1011"' -DTORCH_EXTENSION_NAME=quant_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -std=c++17
FAILED: /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda_kernel.o
/usr/bin/nvcc -I/home/ana/.local/lib/python3.10/site-packages/torch/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/ana/.local/lib/python3.10/site-packages/torch/include/TH -I/home/ana/.local/lib/python3.10/site-packages/torch/include/THC -I/usr/include/python3.10 -c -c /tmp/pip-req-build-t47y4u6m/src/gptq_llama/quant_cuda/quant_cuda_kernel.cu -o /tmp/pip-req-build-t47y4u6m/build/temp.linux-x86_64-3.10/src/gptq_llama/quant_cuda/quant_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=quant_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -std=c++17
/home/ana/.local/lib/python3.10/site-packages/torch/include/c10/util/irange.h(54): warning #186-D: pointless comparison of unsigned integer with zero
detected during:
instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided,
/home/ana/.local/lib/python3.10/site-packages/torch/include/c10/util/irange.h(54): warning #186-D: pointless comparison of unsigned integer with zero
detected during:
instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator==(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=std::size_t, one_sided=true, <unnamed>=0]"
(61): here
instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator!=(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=std::size_t, one_sided=true, <unnamed>=0]"
/home/ana/.local/lib/python3.10/site-packages/torch/include/ATen/core/qualified_name.h(73): here
/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
435 | function(_Functor&& __f)
| ^
/usr/include/c++/11/bits/std_function.h:435:145: note: ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
530 | operator=(_Functor&& __f)
| ^
/usr/include/c++/11/bits/std_function.h:530:146: note: ‘_ArgTypes’
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
subprocess.run(
File "/usr/lib/python3.10/subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-req-build-t47y4u6m/setup.py", line 4, in <module>
setup(
File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/install.py", line 68, in run
return orig.install.run(self)
File "/usr/lib/python3.10/distutils/command/install.py", line 619, in run
self.run_command('build')
File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3.10/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
build_ext.build_extensions(self)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 202, in build_extension
_build_ext.build_extension(self, ext)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 529, in build_extension
objects = self.compiler.compile(sources,
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 658, in unix_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1574, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "/home/ana/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> gptq-llama
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
End of terminal output
Sorry about the formatting; I just copied and pasted and did not mark it as code or strikethrough. Any help would be appreciated. Even better would be a monkey-patch install option somewhere (could be in the initial install or perhaps in the interface tab).
What is the current state? I have followed the official guide and it doesn't seem to work. https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md#using-loras-with-gptq-for-llama