ChatGLM2-6B
[BUG/Help] <NameError: name 'round_up' is not defined>
Is there an existing issue for this?
- [x] I have searched the existing issues
Current Behavior
Running ptuning training with bash train.sh fails with an error:
# bash train.sh
master_addr is only used for static rdzv_backend and when rdzv_endpoint is not specified.
07/20/2023 10:27:41 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
07/20/2023 10:27:41 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=False,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=no,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'fsdp_min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=16,
gradient_checkpointing=False,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=0.02,
length_column_name=length,
load_best_model_at_end=False,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=output/adgen-chatglm2-6b-pt-128-2e-2/runs/Jul20_10-27-41_dsw-69238-658d7665d8-4svjr,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=10,
logging_strategy=steps,
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=3000,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_hf,
optim_args=None,
output_dir=output/adgen-chatglm2-6b-pt-128-2e-2,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=1,
per_device_train_batch_size=1,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=['tensorboard', 'wandb'],
resume_from_checkpoint=None,
run_name=output/adgen-chatglm2-6b-pt-128-2e-2,
save_on_each_node=False,
save_safetensors=False,
save_steps=1000,
save_strategy=steps,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
sortish_sampler=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
xpu_backend=None,
)
07/20/2023 10:27:42 - WARNING - datasets.builder - Found cached dataset json (/root/.cache/huggingface/datasets/json/default-38ff07e29e92a00b/0.0.0/fe5dd6ea2639a6df622901539cb550cf8797e5a6b2dd7af1cf934bed8e233e6e)
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 155.69it/s]
[INFO|configuration_utils.py:710] 2023-07-20 10:27:42,585 >> loading configuration file /mnt/workspace/chatglm2-6b/config.json
[INFO|configuration_utils.py:710] 2023-07-20 10:27:42,588 >> loading configuration file /mnt/workspace/chatglm2-6b/config.json
[INFO|configuration_utils.py:768] 2023-07-20 10:27:42,589 >> Model config ChatGLMConfig {
"_name_or_path": "/mnt/workspace/chatglm2-6b",
"add_bias_linear": false,
"add_qkv_bias": true,
"apply_query_key_layer_scaling": true,
"apply_residual_connection_post_layernorm": false,
"architectures": [
"ChatGLMModel"
],
"attention_dropout": 0.0,
"attention_softmax_in_fp32": true,
"auto_map": {
"AutoConfig": "configuration_chatglm.ChatGLMConfig",
"AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration"
},
"bias_dropout_fusion": true,
"eos_token_id": 2,
"ffn_hidden_size": 13696,
"fp32_residual_connection": false,
"hidden_dropout": 0.0,
"hidden_size": 4096,
"kv_channels": 128,
"layernorm_epsilon": 1e-05,
"model_type": "chatglm",
"multi_query_attention": true,
"multi_query_group_num": 2,
"num_attention_heads": 32,
"num_layers": 28,
"original_rope": true,
"pad_token_id": 0,
"padded_vocab_size": 65024,
"post_layer_norm": true,
"pre_seq_len": null,
"prefix_projection": false,
"quantization_bit": 0,
"rmsnorm": true,
"seq_length": 32768,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.31.0",
"use_cache": true,
"vocab_size": 65024
}
[INFO|tokenization_utils_base.py:1837] 2023-07-20 10:27:42,592 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:1837] 2023-07-20 10:27:42,592 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:1837] 2023-07-20 10:27:42,592 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:1837] 2023-07-20 10:27:42,592 >> loading file tokenizer_config.json
[INFO|modeling_utils.py:2600] 2023-07-20 10:27:42,772 >> loading weights file /mnt/workspace/chatglm2-6b/pytorch_model.bin.index.json
[INFO|configuration_utils.py:599] 2023-07-20 10:27:42,773 >> Generate config GenerationConfig {
"_from_model_config": true,
"eos_token_id": 2,
"pad_token_id": 0,
"transformers_version": "4.31.0"
}
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [01:09<00:00, 9.90s/it]
[INFO|modeling_utils.py:3329] 2023-07-20 10:28:52,223 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.
[WARNING|modeling_utils.py:3331] 2023-07-20 10:28:52,223 >> Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at /mnt/workspace/chatglm2-6b and are newly initialized: ['transformer.prefix_encoder.embedding.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[INFO|modeling_utils.py:2949] 2023-07-20 10:28:52,225 >> Generation config file not found, using a generation config created from the model config.
Quantized to 4 bit
07/20/2023 10:28:52 - WARNING - transformers_modules.chatglm2-6b.quantization - Failed to load cpm_kernels:CUDA Runtime Error: CUDA driver version is insufficient for CUDA runtime version
Traceback (most recent call last):
File "/mnt/workspace/ChatGLM2-6B/ptuning/main.py", line 411, in <module>
main()
File "/mnt/workspace/ChatGLM2-6B/ptuning/main.py", line 127, in main
model = model.quantize(model_args.quantization_bit)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 1191, in quantize
self.transformer.encoder = quantize(self.transformer.encoder, bits, empty_init=empty_init, device=device,
File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/quantization.py", line 155, in quantize
layer.self_attention.query_key_value = QuantizedLinear(
File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/quantization.py", line 139, in __init__
self.weight = compress_int4_weight(self.weight)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/quantization.py", line 76, in compress_int4_weight
blockDim = (min(round_up(m, 32), 1024), 1, 1)
NameError: name 'round_up' is not defined
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 321) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/local/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
main.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-07-20_10:28:57
host : dsw-69238-658d7665d8-4svjr
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 321)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
Expected Behavior
No response
Steps To Reproduce
bash train.sh
Environment
- OS:Ubuntu 22.04.1 LTS
- Python:3.10.6
- Transformers:4.31.0
- PyTorch:2.0.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :True
Anything else?
No response
I ran into the same problem.
Solved. Either remove --quantization_bit 4, or run pip install cpm_kernels.
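A quick way to check the second option before re-running training is to try the import directly. This is a minimal sketch, not part of the repository: the helper name can_import is made up for illustration, and a successful import of cpm_kernels does not guarantee that the quantization kernels themselves will load (the warning in the log above mentions a CUDA driver/runtime mismatch, which may only surface later).

```python
import importlib


def can_import(module_name):
    """Try to import module_name and return (ok, detail).

    Hypothetical helper for illustration only. Any exception raised
    during import is caught, because an optional GPU package can fail
    with errors other than ImportError (for example a CUDA runtime error).
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # deliberately broad, see docstring
        return False, f"{type(exc).__name__}: {exc}"
    return True, "imported"


# Report whether the package named in this thread can be imported here.
ok, detail = can_import("cpm_kernels")
print(f"cpm_kernels importable: {ok} ({detail})")
```

If this reports a failure, removing --quantization_bit 4 from train.sh is the other option mentioned above.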
What is the cause of this? Installing cpm_kernels still did not work for me, but removing the quantization option --quantization_bit 4 fixed it.
Environment: Ubuntu, PyTorch 2.0.1, NVIDIA T4
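One plausible reading of the log above, offered as an assumption rather than a confirmed diagnosis: the warning "Failed to load cpm_kernels" appears just before the traceback, which suggests the kernel import is wrapped in a try/except that only logs the failure. If round_up is one of the names brought in by that import, it is never defined, and the first call raises NameError. The sketch below reproduces that pattern with a made-up module name (hypothetical_kernel_pkg); it is not the actual quantization.py code.

```python
import logging

logger = logging.getLogger(__name__)

try:
    # hypothetical_kernel_pkg is a placeholder for an optional kernel
    # package; it does not exist, so this import fails.
    from hypothetical_kernel_pkg import round_up
except Exception as exc:
    # The failure is only logged, so execution continues without round_up.
    logger.warning("Failed to load kernels: %s", exc)


def block_dim(m):
    # Mirrors the shape of the failing line in the traceback.
    return (min(round_up(m, 32), 1024), 1, 1)


try:
    block_dim(100)
except NameError as err:
    message = str(err)
    print(message)
```

Under this reading, the NameError is a symptom, and the warning printed earlier in the log is the error worth investigating.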
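I hit the same problem. At first my GPU driver was too old. After updating the driver, I got this error. Removing --quantization_bit 4 did not help. After updating and installing cpm_kernels, this error went away, but a PyTorch runtime exception was thrown instead. I had no choice but to upgrade: Ubuntu to 22.04, PyTorch to 2.1, with Python 3.11. After that there were no more errors. My guess is that the Ubuntu GPU driver and the PyTorch version need to be compatible, but I am not sure that is right.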
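Because the reports in this thread differ by environment, it may help to post exact package versions alongside the error. The following is a small sketch using only the standard library; the function name installed_versions is made up for illustration, and the distribution names listed are the ones mentioned in this thread.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None.

    Hypothetical helper for illustration only.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages mentioned in this thread.
report = installed_versions(["torch", "transformers", "cpm_kernels"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```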