I get an error at runtime, and I have already tried several CUDA versions

Open zyr-NULL opened this issue 1 year ago • 9 comments

I ran the following command:

`python llama_infer.py --test_path prompts.txt --prediction_path result.txt \
                  --load_model_path ../ChatFlow-7B/chatflow_7b.bin  \
                  --config_path config/llama_7b_config.json \
                  --spm_model_path ../ChatFlow-7B/tokenizer.model --seq_length 512`

With both CUDA 11.1 and CUDA 11.2 I get the following error:

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

bin /opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cuda112.so
/opt/conda/lib/python3.7/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib'), PosixPath('/usr/local/nvidia/lib64')}
  warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11.2/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 112
CUDA SETUP: Loading binary /opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cuda112.so...
normalizer.cc(51) LOG(INFO) precompiled_charsmap is empty. use identity normalization.

Is there a Docker image available?

zyr-NULL avatar May 25 '23 07:05 zyr-NULL

Which version of bitsandbytes are you using?

fengyh3 avatar May 25 '23 07:05 fengyh3

The log you posted does not actually look like an error.

fengyh3 avatar May 25 '23 07:05 fengyh3

This does not look like an error; bitsandbytes always prints this banner at startup. Did the script not continue running after it?

treya-lin avatar May 25 '23 07:05 treya-lin

> The log you posted does not actually look like an error.

But nothing runs after that point. My bitsandbytes version is 0.38.1.

zyr-NULL avatar May 25 '23 09:05 zyr-NULL

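> This does not look like an error; bitsandbytes always prints this banner at startup. Did the script not continue running after it?

That is exactly the problem: it does not continue. It hangs for a while and then exits.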

zyr-NULL avatar May 25 '23 09:05 zyr-NULL

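> > This does not look like an error; bitsandbytes always prints this banner at startup. Did the script not continue running after it?
>
> That is exactly the problem: it does not continue. It hangs for a while and then exits.

I ran it in my own environment. When everything works, the last line printed is

CUDA SETUP: Loading binary ...

and this line does not appear:

normalizer.cc(51) LOG(INFO) precompiled_charsmap is empty. use identity normalization.

That said, the pause you describe is normal. Once CUDA SETUP is done, the script carries on by itself and then stops. It prints nothing else and does not report that it has finished; afterwards you can check the result file.

In your case, after it stopped, the file given by --prediction_path was not generated either, is that right?

This is what my run looks like:

# Run inference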
python llama_infer.py --test_path data/test_prompt.txt --prediction_path output/test_result_new.txt --config_path config/llama_7b_config.json --spm_model_path ../models/llama_chinese/tokenizer.model --load_model_path ../models/llama_chinese/chatflow_7b.bin --seq_length 512 --world_size 1

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cuda113.so
/opt/conda/lib/python3.7/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib64'), PosixPath('/usr/local/nvidia/lib')}
  warn(msg)
/opt/conda/lib/python3.7/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: /usr/local/nvidia/lib:/usr/local/nvidia/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
/opt/conda/lib/python3.7/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/usr/local/cuda/lib64/libcudart.so'), PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0')}.. We'll flip a coin and try one of these, in order to fail forward.
Either way, this might cause trouble in the future:
If you get `CUDA error: invalid device function` errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
  warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 113
CUDA SETUP: Loading binary /opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cuda113.so...

# Once it finishes, the result file has been generated.
root@6fe0f24df893:/workspace/llama_inference# ls output/
test_result_new.txt

treya-lin avatar May 25 '23 09:05 treya-lin
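Because the script reportedly prints nothing when it finishes, a small wrapper can help tell a normal finish from a silent crash. The following is only a minimal sketch under stated assumptions: `run_and_check` is a hypothetical helper name, not part of the project, and it simply runs a command, records the exit code, and checks whether the prediction file exists.

```python
import subprocess
from pathlib import Path


def run_and_check(cmd, prediction_path):
    """Run cmd, then report its exit code and whether the output file exists.

    Hypothetical helper for illustration; it is not part of llama_inference.
    """
    result = subprocess.run(cmd)
    out = Path(prediction_path)
    exists = out.is_file()
    return {
        "returncode": result.returncode,
        # On POSIX, a negative return code -N means the child was killed by signal N.
        "killed_by_signal": -result.returncode if result.returncode < 0 else None,
        "output_exists": exists,
        "output_bytes": out.stat().st_size if exists else 0,
    }
```

To use it, pass the `llama_infer.py` command from the thread as a list of arguments (for example `["python", "llama_infer.py", "--test_path", "prompts.txt", ...]`) together with the value given to `--prediction_path`. A non-zero or negative return code with no output file suggests the process died rather than finished.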

+1, has this been solved?

wmm7777 avatar Jun 11 '23 08:06 wmm7777

bitsandbytes has some runtime requirements: "Requirements Python >=3.8. Linux distribution (Ubuntu, MacOS, etc.) + CUDA > 10.0." (see https://github.com/TimDettmers/bitsandbytes). The path in your log is /opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cuda112.so, so you are using Python 3.7, right?

You can start over in a fresh environment:

conda create --name LLM python=3.8
conda activate LLM
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch

conda downloads CUDA automatically, and this environment works for me. After that, install bitsandbytes. If pip install bitsandbytes fails, you can download the wheel file directly from https://pypi.org/project/bitsandbytes-cuda113/#files and pip install it. Feel free to ask if you run into more problems.

Submarinee avatar Jun 12 '23 09:06 Submarinee

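> Which version of bitsandbytes are you using?

The latest bitsandbytes release, 0.39, is fine. Note, however, that bitsandbytes has some runtime requirements: "Requirements Python >=3.8. Linux distribution (Ubuntu, MacOS, etc.) + CUDA > 10.0." You also need the GPU build of torch.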

Submarinee avatar Jun 12 '23 09:06 Submarinee
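The requirements mentioned in the comments above (Python >= 3.8 and a GPU build of torch) can be checked before running inference. This is a rough sketch, not an official bitsandbytes check: the helper names `python_ok` and `torch_cuda_status` are made up for illustration, and the 3.8 minimum is taken from the requirements quoted in this thread.

```python
import importlib
import sys

MIN_PYTHON = (3, 8)  # assumption: taken from the requirements quoted in the thread


def python_ok(version=None):
    """Return True if the (major, minor) version meets MIN_PYTHON."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


def torch_cuda_status():
    """Report whether torch is importable and whether it can see a GPU."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return {"installed": False, "cuda_available": False, "cuda_version": None}
    return {
        "installed": True,
        "cuda_available": bool(torch.cuda.is_available()),
        "cuda_version": torch.version.cuda,  # None for CPU-only builds
    }


if __name__ == "__main__":
    status = "OK" if python_ok() else "too old, need >= 3.8"
    print("Python", sys.version.split()[0], status)
    print("torch:", torch_cuda_status())
```

If `cuda_version` comes back as `None` or `cuda_available` is false, the installed torch is most likely a CPU-only build or cannot see the GPU, which would match the "GPU build of torch" caveat above.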