
When I run the finetune.sh script, I get: RuntimeError: Only Tensors of floating point and complex dtype can require gradients


===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
/home/ubuntu/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/lib64')}
  warn(msg)
/home/ubuntu/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
  warn(msg)
CUDA SETUP: Highest compute capability among GPUs detected: 7.0
CUDA SETUP: Detected CUDA version 117
/home/ubuntu/.local/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
  warn(msg)
CUDA SETUP: Loading binary /home/ubuntu/.local/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/home/ubuntu/.local/lib/python3.8/site-packages/bitsandbytes/cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
No compiled kernel found.
Compiling kernels : /home/ubuntu/.cache/huggingface/modules/transformers_modules/chatglm-6b-int8/quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 /home/ubuntu/.cache/huggingface/modules/transformers_modules/chatglm-6b-int8/quantization_kernels_parallel.c -shared -o /home/ubuntu/.cache/huggingface/modules/transformers_modules/chatglm-6b-int8/quantization_kernels_parallel.so
Load kernel : /home/ubuntu/.cache/huggingface/modules/transformers_modules/chatglm-6b-int8/quantization_kernels_parallel.so
Setting CPU quantization kernel threads to 5
Using quantization cache
Applying quantization to glm layers
Traceback (most recent call last):
  File "finetune.py", line 137, in <module>
    main()
  File "finetune.py", line 90, in main
    model = AutoModel.from_pretrained(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 479, in from_pretrained
    return model_class.from_pretrained(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2675, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/chatglm-6b-int8/modeling_chatglm.py", line 1061, in __init__
    self.quantize(self.config.quantization_bit, self.config.quantization_embeddings, use_quantization_cache=True, empty_init=True)
  File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/chatglm-6b-int8/modeling_chatglm.py", line 1439, in quantize
    self.transformer = quantize(self.transformer, bits, use_quantization_cache=use_quantization_cache, empty_init=empty_init, **kwargs)
  File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/chatglm-6b-int8/quantization.py", line 501, in quantize
    layer.attention.query_key_value = QuantizedLinearWithPara(
  File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/chatglm-6b-int8/quantization.py", line 374, in __init__
    self.weight = Parameter(self.weight.to(kwargs["device"]), requires_grad=False)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1632, in __setattr__
    self.register_parameter(name, value)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/accelerate/big_modeling.py", line 108, in register_empty_parameter
    module._parameters[name] = param_cls(module._parameters[name].to(device), **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/parameter.py", line 36, in __new__
    return torch.Tensor._make_subclass(cls, data, requires_grad)
RuntimeError: Only Tensors of floating point and complex dtype can require gradients
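Reading the traceback, the crash happens when accelerate's register_empty_parameter re-wraps the already-quantized int8 weight in a new torch.nn.Parameter and apparently does not forward the requires_grad=False that quantization.py passes explicitly. Parameter defaults to requires_grad=True, and PyTorch only allows gradients on floating point and complex tensors. A minimal sketch of that failure mode, independent of ChatGLM (the tensor below is made up; only the int8 dtype matters):

import torch
from torch.nn import Parameter

# Stand-in for a quantized weight: any integer dtype triggers the same restriction.
int8_weight = torch.zeros(4, 4, dtype=torch.int8)

# Explicitly disabling gradients is fine; this mirrors what quantization.py does.
ok = Parameter(int8_weight, requires_grad=False)

# Parameter defaults to requires_grad=True, which is only legal for floating
# point and complex dtypes, so this reproduces the error in the traceback.
try:
    bad = Parameter(int8_weight)
except RuntimeError as e:
    print(e)  # Only Tensors of floating point and complex dtype can require gradients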

Environment: Ubuntu
GPU: Tesla V100 SXM2 32GB (32 GB of GPU memory)

config.json:

{
  "_name_or_path": "THUDM/chatglm-6b-int8",
  "architectures": ["ChatGLMModel"],
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration"
  },
  "bos_token_id": 130004,
  "eos_token_id": 130005,
  "gmask_token_id": 130001,
  "hidden_size": 4096,
  "inner_hidden_size": 16384,
  "layernorm_epsilon": 1e-05,
  "mask_token_id": 130000,
  "max_sequence_length": 2048,
  "model_type": "chatglm",
  "num_attention_heads": 32,
  "num_layers": 28,
  "pad_token_id": 3,
  "position_encoding_2d": true,
  "quantization_bit": 0,
  "quantization_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.27.1",
  "use_cache": true,
  "vocab_size": 130528
}
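In case it helps with debugging, here is a small sketch for checking which quantization settings transformers actually reads from this config; quantization_bit and quantization_embeddings are the fields modeling_chatglm.py passes to self.quantize() in the traceback above. The model id THUDM/chatglm-6b-int8 is taken from the config; swap in a local path if the checkpoint is stored on disk, and note that trust_remote_code=True is needed because ChatGLM ships custom modeling code:

from transformers import AutoConfig

# Hypothetical call: replace the model id with a local path if the files are on disk.
config = AutoConfig.from_pretrained("THUDM/chatglm-6b-int8", trust_remote_code=True)

# quantization_bit and quantization_embeddings are what self.quantize() receives.
print(config.quantization_bit)         # 0 in the config.json pasted above
print(config.quantization_embeddings)  # false in the config.json pasted above
print(config.torch_dtype)              # float16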

xiaozhao1795 (Jun 29 '23)

You may try to reproduce the same training environment by installing the packages listed in requirements.txt with the following command:

pip install -r requirements.txt
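After installing, one way to double-check that the environment matches is to print the versions of the packages that appear in the traceback (torch, transformers, accelerate, bitsandbytes). This is just a convenience sketch; the authoritative pins are whatever requirements.txt specifies:

from importlib.metadata import version, PackageNotFoundError

# Package names taken from the traceback above; compare the output against requirements.txt.
for pkg in ("torch", "transformers", "accelerate", "bitsandbytes"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")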

oliverwang15 (Jul 11 '23)