Firefly
Error when the training parameter 'gradient_checkpointing' is set to false
Training parameters:
"output_dir": "./output",
"model_name_or_path": "Baichuan-13B-Base",
"train_file": "data/voc_train.jsonl",
"num_train_epochs": 500,
"per_device_train_batch_size": 16,
"gradient_accumulation_steps": 2,
"learning_rate": 1e-4,
"max_seq_length": 1200,
"logging_steps": 300,
"save_steps": 500,
"save_total_limit": 1,
"lr_scheduler_type": "constant_with_warmup",
"warmup_steps": 3000,
"lora_rank": 64,
"lora_alpha": 16,
"lora_dropout": 0.05,
"gradient_checkpointing": false,
"disable_tqdm": false,
"optim": "paged_adamw_32bit",
"seed": 42,
"bf16": true,
"report_to": "tensorboard",
"dataloader_num_workers": 5,
"save_strategy": "steps",
"weight_decay": 0,
"max_grad_norm": 0.3,
"remove_unused_columns": false
When training the Baichuan model with gradient_checkpointing set to false, the following error is raised:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
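The fix I found says the variable needs to be converted to Variable format. Details: https://blog.csdn.net/weixin_41990278/article/details/90311313
After changing gradient_checkpointing to true, training runs normally.
How do I convert the variable to Variable format?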
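On the Variable question: since PyTorch 0.4, Variable has been merged into Tensor, so there is no separate format to convert to. A tensor joins the autograd graph once requires_grad is set on it. The sketch below is a toy reproduction in plain PyTorch, not Firefly's training code. The helper name try_backward is made up for illustration. It only shows that this RuntimeError appears when nothing feeding the loss requires grad. It does not establish why that happens in this run with gradient_checkpointing set to false.

```python
import torch


def try_backward(inputs_require_grad: bool):
    """Hypothetical helper: run backward on a toy loss.

    Returns None on success, or the RuntimeError message on failure.
    """
    x = torch.ones(3)
    if inputs_require_grad:
        # Modern replacement for wrapping a tensor in Variable(..., requires_grad=True)
        x.requires_grad_(True)
    loss = (x * 2.0).sum()
    try:
        loss.backward()
    except RuntimeError as err:
        return str(err)
    return None


# Nothing requires grad, so the loss has no grad_fn and backward fails
print(try_backward(False))
# The input requires grad, so backward succeeds
print(try_backward(True))
```

If the same thing is happening here, a useful check would be whether any model parameter has requires_grad == True after the LoRA adapters are attached, and whether the loss tensor itself has a grad_fn.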