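BELLE: How do I import the quant_cuda library correctly?

As the title says. My environment is Python 3.9 + PyTorch 1.9.0 + CUDA 11.2, and I am loading the 4-bit model. However, the quant_cuda.cpp I generated with setup_cuda.py does not seem to import properly. Is this file supposed to sit in the root of the gptq folder?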

Apr 03 '23 03:04 another1s

The dependencies are probably not fully installed. You can follow the README:

conda create --name gptq python=3.9 -y
conda activate gptq
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
# Or, if you're having trouble with conda, use pip with python3.9:
# pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117

pip install -r requirements.txt
python setup_cuda.py install

# Benchmark performance for FC2 layer of LLaMa-7B
CUDA_VISIBLE_DEVICES=0 python test_kernel.py

Apr 03 '23 04:04 mabaochang

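Thanks for the reply! Because of restrictions on my machine, I can basically only download packages from the Tsinghua mirror, so I can't follow the README commands verbatim. My current environment is Python 3.9 + PyTorch 2.0.0 + CUDA 11.2 with torchvision 0.13.0, and everything else listed in requirements.txt is installed. But running CUDA_VISIBLE_DEVICES=0 python test_kernel.py fails with the following error: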

quant_cuda.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor6deviceEv

This still looks like the C++ extension was not compiled properly, and inference later also reports "CUDA extension not installed". Does this mean it only works under CUDA 11.7?
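The mangled name _ZNK2at6Tensor6deviceEv demangles to at::Tensor::device() const, a libtorch symbol. An undefined libtorch symbol at import time usually suggests that the extension was built against a different PyTorch than the one now installed, rather than pointing at a particular CUDA release. A minimal diagnostic sketch, assuming the extension module is named quant_cuda as in this thread (the helper name diagnose is made up for illustration):

```python
import importlib


def diagnose(module_name="quant_cuda"):
    """Collect version info and try to import the compiled extension."""
    report = {"torch": None, "torch_cuda": None, "extension_ok": False, "error": None}
    try:
        # torch must be imported first: the extension links against libtorch.
        torch = importlib.import_module("torch")
        report["torch"] = torch.__version__
        report["torch_cuda"] = torch.version.cuda  # CUDA version torch was built with
    except (ImportError, OSError) as exc:
        report["error"] = f"torch import failed: {exc}"
        return report
    try:
        importlib.import_module(module_name)
        report["extension_ok"] = True
    except ImportError as exc:
        # "undefined symbol" errors from the .so surface here as ImportError.
        report["error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in diagnose().items():
        print(f"{key}: {value}")
```

If the reported versions differ from the ones present when the extension was built, deleting the old build output and rerunning python setup_cuda.py install in the same environment is the usual thing to try.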

Apr 03 '23 08:04 another1s

When I ran the provided Colab notebook on Colab, the last step also failed with:

[/content/BELLE/gptq/quant.py](https://localhost:8080/#) in forward(self, x)
    259             quant_cuda.vecquant3matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)
    260         elif self.bits == 4:
--> 261             quant_cuda.vecquant4matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)
    262         elif self.bits == 8:
    263             quant_cuda.vecquant8matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)

NameError: name 'quant_cuda' is not defined

However, test_kernel.py ran fine earlier, and I also get the same "CUDA extension not installed" message.
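These two messages are consistent with a guarded import: if quant.py wraps import quant_cuda in a try/except that only prints a warning, the name is never bound and the first kernel call raises NameError. Whether quant.py does exactly this is an assumption based on the output above. A sketch of a stricter loader that fails at import time and keeps the underlying error (require_extension is a hypothetical helper, not part of BELLE):

```python
import importlib


def require_extension(name="quant_cuda"):
    """Import a compiled extension or raise with the underlying cause.

    Call this after `import torch`, since the extension links against libtorch.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise RuntimeError(
            f"CUDA extension '{name}' could not be imported: {exc}. "
            "Build it with `python setup_cuda.py install` in this environment."
        ) from exc
```

In a notebook, a kernel that was started before the extension was installed may also need a restart before the module becomes importable; that is a guess, not something confirmed in this thread.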

Apr 06 '23 10:04 gfgkmn

I can't even get test_kernel to pass... It says CUDA 11.7 and gcc >= 6.0.0 are required for the compile to succeed.
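To see which toolchain the build would pick up, the versions can be read from gcc --version and nvcc --version. A small sketch; the thresholds to compare against (CUDA 11.7, gcc 6.0.0) are the ones quoted in the message above, and the helper names are illustrative:

```python
import re
import shutil
import subprocess


def parse_gcc_version(text):
    """Return (major, minor, patch) from the first line of `gcc --version`."""
    first_line = text.splitlines()[0] if text else ""
    matches = re.findall(r"(\d+)\.(\d+)(?:\.(\d+))?", first_line)
    if not matches:
        return None
    major, minor, patch = matches[-1]  # the last dotted number is the version
    return (int(major), int(minor), int(patch or 0))


def parse_nvcc_release(text):
    """Return (major, minor) from the 'release X.Y' field of `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def tool_output(tool):
    """Run `<tool> --version`, or return None if the tool is not on PATH."""
    if shutil.which(tool) is None:
        return None
    return subprocess.run([tool, "--version"], capture_output=True, text=True).stdout


if __name__ == "__main__":
    gcc, nvcc = tool_output("gcc"), tool_output("nvcc")
    print("gcc :", parse_gcc_version(gcc) if gcc else "not found")
    print("nvcc:", parse_nvcc_release(nvcc) if nvcc else "not found")
```

The returned tuples compare naturally, for example parse_gcc_version(...) >= (6, 0, 0).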

Apr 07 '23 02:04 another1s

I have the same problem. Is there a solution?

Apr 07 '23 04:04 Xulong-XD

When I run test_kernel.py twice, the second run no longer reports "CUDA extension not installed", but inference still fails with the same error.

Apr 10 '23 08:04 answerMA

How about trying this reimplementation of BELLE? It has fewer dependencies.

Apr 10 '23 15:04 Tongjilibo

root@81f107f7c720:/workspace/code/models/gptq# CUDA_VISIBLE_DEVICES=0 python3 bloom_inference.py /workspace/models/BELLE_BLOOM_GPTQ_4BIT --wbits 4 --groupsize 128 --load /workspace/models/BELLE_BLOOM_GPTQ_4BIT/bloom7b-2m-4bit-128g.pt --text "this is llama"
Loading model ...
Done.
Human:
介绍一下中国
Assistant:

 

中国是一个亚洲国家，位于亚洲东部沿海。它是世界上人口最多的国家，拥有着悠久的历史文化和丰富的自然风景资源。中国是世界上最富有文化和历史的国家之一，有着许多著名的文化和历史遗产，如长城、故宫、兵马俑等。中国也是世界上最大的制造业和出口国之一，拥有着许多世界级的公司和品牌。同时，中国也是世界上最大的消费市场之一，拥有着庞大的中产阶级和消费群体。</s>

-------------------------------

Human:
详细介绍一下李白
Assistant:

 李白是唐代著名诗人，被誉为“诗仙”。他出生于唐代初期，出生于一个书香门第，擅长诗歌创作。他的诗歌创作内容广泛，包括山水、田园、自然、人生等等。他善于描写自然景物和人类情感，风格豪放，情感深刻，影响了许多后来的诗人。他的代表作品有《将进酒》、《庐山谣》、《夜泊牛渚怀古》等。李白的诗歌风格豪放洒脱，气势磅礴，具有浪漫主义色彩，对后世影响深远。</s>

-------------------------------

Human:
李白有哪些有名的诗句
Assistant:

 

1. "天生我材必有用，千金散尽还复来。" 
2. "我欲乘风归去，又恐遭此中而来。"
3. "且将新火试新茶，文章只自写知名。"
4. "人生得意须尽欢，千金散尽还复来。" 
5. "此中有真意，欲辨已忘言。"</s>

-------------------------------

I finally got inference working. Here are my steps:

1. Clone the project: git clone https://github.com/LianjiaTech/BELLE.git
2. In the BELLE directory, install the dependencies: pip3 install -r requirements.txt -i https://mirror.baidu.com/pypi/simple
3. In the BELLE/models/gptq directory, install its dependencies: pip3 install -r requirements.txt -i https://mirror.baidu.com/pypi/simple
4. Now the key step: in the BELLE/models/gptq directory, install quant_cuda by running: python3 setup_cuda.py install
5. Verify the installation by running: CUDA_VISIBLE_DEVICES=0 python3 test_kernel.py
6. Download the quantized model. I used the one at https://huggingface.co/BelleGroup/BELLE_BLOOM_GPTQ_4BIT
7. Run inference: CUDA_VISIBLE_DEVICES=0 python3 bloom_inference.py /workspace/models/BELLE_BLOOM_GPTQ_4BIT --wbits 4 --groupsize 128 --load /workspace/models/BELLE_BLOOM_GPTQ_4BIT/bloom7b-2m-4bit-128g.pt --text "this is llama"

Step 4 is the critical one. I had forgotten to run it and wasted a lot of time. The notebook at https://github.com/LianjiaTech/BELLE/blob/main/models/notebook/BELLE_INFER_COLAB.ipynb also has detailed instructions.
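The install, build, verify and inference commands from this post can be collected into one script so that the quant_cuda build is not skipped by accident. A sketch under stated assumptions: the repository is already cloned and the model already downloaded, the paths are the ones used in this post, and build_steps and run_steps are hypothetical helper names. It defaults to a dry run that only prints the commands.

```python
import os
import subprocess
from pathlib import Path

MIRROR = "https://mirror.baidu.com/pypi/simple"  # mirror used in the post


def build_steps(repo_dir, model_dir):
    """Return (working_dir, argv) pairs for the install, build, test and inference commands."""
    repo, model = Path(repo_dir), Path(model_dir)
    gptq = repo / "models" / "gptq"
    pip = ["pip3", "install", "-r", "requirements.txt", "-i", MIRROR]
    return [
        (repo, pip),
        (gptq, pip),
        (gptq, ["python3", "setup_cuda.py", "install"]),  # builds quant_cuda
        (gptq, ["python3", "test_kernel.py"]),
        (gptq, ["python3", "bloom_inference.py", str(model),
                "--wbits", "4", "--groupsize", "128",
                "--load", str(model / "bloom7b-2m-4bit-128g.pt"),
                "--text", "this is llama"]),
    ]


def run_steps(steps, dry_run=True):
    """Print each step; execute it unless dry_run is set. Stops on first failure."""
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": "0"}
    for cwd, argv in steps:
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, env=env, check=True)


if __name__ == "__main__":
    run_steps(build_steps("BELLE", "/workspace/models/BELLE_BLOOM_GPTQ_4BIT"))
```

Passing dry_run=False executes the commands in order; check=True makes the script stop at the first command that fails, for example a failed compile.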

Jun 01 '23 06:06 ray-008
