[NPU] GLM-4-9B-Chat PPO error
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
[2024-06-07 10:17:14,980] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
- `llamafactory` version: 0.7.2.dev0
- Platform: Linux-5.10.0-198.0.0.111.oe2203sp3.aarch64-aarch64-with-glibc2.34
- Python version: 3.10.14
- PyTorch version: 2.2.0 (NPU)
- Transformers version: 4.41.2
- Datasets version: 2.19.2
- Accelerate version: 0.30.1
- PEFT version: 0.11.1
- TRL version: 0.9.3
- NPU type: Ascend910B2
- CANN version: 8.0.RC2.alpha001
- DeepSpeed version: 0.13.2
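A quick sanity check of this environment, assuming `torch_npu` is importable as the "PyTorch version: 2.2.0 (NPU)" entry above implies:

```python
# Minimal NPU environment check (assumption: torch_npu is installed and registers
# the "npu" device type with PyTorch on import).
import torch
import torch_npu  # noqa: F401

print(torch.__version__)          # expect 2.2.0
print(torch.npu.is_available())   # expect True on Ascend910B2
print(torch.npu.device_count())   # expect 8, matching ranks 0-7 in the log below
```

The log below shows all eight ranks coming up on npu:0 through npu:7, so device detection itself is not the problem here.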
### Reproduction
```bash
llamafactory-cli train \
--stage ppo \
--do_train True \
--model_name_or_path ZhipuAI/glm-4-9b-chat \
--preprocessing_num_workers 16 \
--finetuning_type lora \
--template glm4 \
--flash_attn auto \
--dataset_dir data \
--dataset disc-law-sft-triplet \
--cutoff_len 8192 \
--learning_rate 5e-05 \
--num_train_epochs 3.0 \
--max_samples 100000 \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 8 \
--lr_scheduler_type cosine \
--max_grad_norm 1.0 \
--logging_steps 5 \
--save_steps 100 \
--warmup_steps 0 \
--optim adamw_torch \
--packing False \
--report_to none \
--output_dir saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-44-37 \
--bf16 True \
--plot_loss True \
--ddp_timeout 180000000 \
--include_num_input_tokens_seen True \
--adapter_name_or_path saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 \
--lora_rank 8 \
--lora_alpha 16 \
--lora_dropout 0 \
--lora_target all \
--reward_model saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 \
--reward_model_type lora \
--ppo_score_norm True \
--top_k 0 \
--top_p 0.9
```
### Expected behavior
_No response_
### Others
```
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING]
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] *****************************************
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] *****************************************
[2024-06-07 10:11:03,623] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,661] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,705] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,818] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,836] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,905] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,955] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,991] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 0, device: npu:0, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
2024-06-07 10:11:17,434 - modelscope - INFO - PyTorch version 2.2.0 Found.
2024-06-07 10:11:17,436 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-06-07 10:11:17,490 - modelscope - INFO - Loading done! Current index file version is 1.14.0, with md5 ceb78a2ac746b5506819a47dbbf0e37c and a total number of 976 components indexed
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 7, device: npu:7, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 4, device: npu:4, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 6, device: npu:6, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 2, device: npu:2, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 1, device: npu:1, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:18 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:18 - INFO - llamafactory.hparams.parser - Process rank: 5, device: npu:5, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,235 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,235 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,236 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,236 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,236 >> loading file tokenizer.json
06/07/2024 10:11:18 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:18 - INFO - llamafactory.hparams.parser - Process rank: 3, device: npu:3, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
[WARNING|logging.py:314] 2024-06-07 10:11:19,288 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:19 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
06/07/2024 10:11:19 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:19 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:19 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:22 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
Running tokenizer on dataset (num_proc=16): 100%|█████████████████████████████████████████████████████████████████| 16000/16000 [00:38<00:00, 416.91 examples/s]
input_ids:
[151331, 151333, 151336, 198, 100698, 103309, 101138, 3837, 113094, 110590, 105177, 99312, 8994, 98379, 106170, 117921, 3837, 98546, 20, 98334, 21, 98424, 99146, 98385, 99082, 117225, 3837, 108592, 98696, 105181, 103757, 117537, 98380, 99043, 100451, 102337, 103273, 106156, 118828, 98798, 105181, 101376, 98314, 117055, 98550, 109534, 3837, 98459, 101247, 105079, 98634, 123900, 98324, 117537, 98595, 101676, 111602, 99916, 98760, 101642, 98335, 3837, 108592, 98696, 105181, 98453, 105529, 109290, 98396, 98381, 103941, 98798, 105181, 99195, 118894, 3837, 103078, 98711, 109534, 105079, 98322, 107801, 98993, 114731, 100129, 101242, 3837, 98547, 110664, 99999, 105181, 109487, 98365, 3837, 108592, 98696, 105181, 98701, 107801, 98993, 114731, 103941, 98798, 105181, 98314, 99527, 113995, 3837, 99704, 124187, 116767, 101806, 98583, 109695, 98829, 110960, 99416, 121952, 109055, 112246, 117442, 101242, 3837, 117442, 101242, 100048, 98875, 121424, 99054, 99893, 98649, 105862, 98433, 112998, 99108, 120250, 106318, 100035, 1773, 98365, 98379, 118828, 98798, 105181, 105420, 3837, 101113, 99131, 100588, 98634, 100059, 98493, 108592, 98696, 105181, 98607, 103278, 98344, 98817, 1773, 98379, 103171, 3837, 109534, 108634, 99532, 102492, 20, 11, 124206, 13, 24, 98575, 3837, 109055, 108634, 99532, 102492, 16, 11, 19, 101474, 13, 102486, 98575, 3837, 117442, 101242, 108634, 99532, 102492, 17, 11, 24, 99951, 13, 99082, 98575, 3837, 99054, 99893, 98649, 106508, 99108, 120250, 108634, 99532, 102492, 24, 11, 102114, 21, 98575, 3837, 111086, 101832, 99532, 106234, 102492, 98729, 11, 101135, 17, 13, 21, 98575, 1773, 101409, 100867, 3837, 108592, 98696, 105181, 98319, 119626, 98322, 100297, 98479, 110416, 3837, 118828, 98798, 105181, 5373, 100547, 105181, 5373, 104464, 105181, 110065, 3837, 110664, 99999, 105181, 98314, 98697, 98856, 3837, 100059, 111413, 99565, 98990, 3837, 116550, 99304, 3837, 103171, 102622, 98560, 3837, 108592, 98696, 105181, 98314, 127251, 98381, 102070, 98539, 98404, 102243, 105483, 3837, 106144, 102919, 1773, 151337]
inputs:
[gMASK] <sop> <|user|>
基于下列案件,推测可能的判决结果。
经审理查明,2015年6月21日15时许,被告人白某某在大东区小河沿公交车站乘坐被害人张某某驾驶的133路公交车,当车辆行驶至沈阳市大东区东陵西路26号附近时,被告人白某某因未能下车而与司机张某某发生争执,并在该公交车行驶中用手拉拽档杆,被证人韩某某拉开后,被告人白某某又用手拉拽司机张某某的右胳膊,导致该车失控撞向右侧马路边停放的轿车和一个路灯杆,路灯杆折断后将福锅记炖品店的牌匾砸坏。后经被害人张某某报警,公安人员赶至现场将被告人白某某传唤到案。经鉴定,公交车受损价值人民币5,189.9元,轿车受损价值人民币1,449.57元,路灯杆受损价值人民币2,927.15元,福锅记饭店牌匾受损价值人民币9,776元,本案损失价值共计人民币19,342.6元。上述事实,被告人白某某在庭审中亦无异议,被害人张某某、朱某某、詹某某陈述,证人韩某某的证言,现场勘察笔录,视听资料,鉴定结论书,被告人白某某的供述与辩解等证据证实,足以认定。 <|assistant|>
[INFO|configuration_utils.py:731] 2024-06-07 10:12:08,107 >> loading configuration file /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat/config.json
[INFO|configuration_utils.py:731] 2024-06-07 10:12:08,110 >> loading configuration file /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat/config.json
[INFO|configuration_utils.py:796] 2024-06-07 10:12:08,111 >> Model config ChatGLMConfig {
"_name_or_path": "/root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat",
"add_bias_linear": false,
"add_qkv_bias": true,
"apply_query_key_layer_scaling": true,
"apply_residual_connection_post_layernorm": false,
"architectures": [
"ChatGLMModel"
],
"attention_dropout": 0.0,
"attention_softmax_in_fp32": true,
"auto_map": {
"AutoConfig": "configuration_chatglm.ChatGLMConfig",
"AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
"AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification"
},
"bias_dropout_fusion": true,
"classifier_dropout": null,
"eos_token_id": [
151329,
151336,
151338
],
"ffn_hidden_size": 13696,
"fp32_residual_connection": false,
"hidden_dropout": 0.0,
"hidden_size": 4096,
"kv_channels": 128,
"layernorm_epsilon": 1.5625e-07,
"model_type": "chatglm",
"multi_query_attention": true,
"multi_query_group_num": 2,
"num_attention_heads": 32,
"num_hidden_layers": 40,
"num_layers": 40,
"original_rope": true,
"pad_token_id": 151329,
"padded_vocab_size": 151552,
"post_layer_norm": true,
"rmsnorm": true,
"rope_ratio": 500,
"seq_length": 131072,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"vocab_size": 151552
}
[INFO|modeling_utils.py:3471] 2024-06-07 10:12:08,159 >> loading weights file /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat/model.safetensors.index.json
[INFO|modeling_utils.py:1519] 2024-06-07 10:12:08,160 >> Instantiating ChatGLMForConditionalGeneration model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:962] 2024-06-07 10:12:08,162 >> Generate config GenerationConfig {
"eos_token_id": [
151329,
151336,
151338
],
"pad_token_id": 151329
}
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:06<00:00, 1.45it/s]
[INFO|modeling_utils.py:4280] 2024-06-07 10:12:15,224 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.
[INFO|modeling_utils.py:4288] 2024-06-07 10:12:15,224 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training.
[INFO|modeling_utils.py:3797] 2024-06-07 10:12:15,231 >> Generation config file not found, using a generation config created from the model config.
06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:15 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:15 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards: 60%|██████████████████████████████████████████████████████████▏ | 6/10 [00:04<00:02, 1.35it/s]06/07/2024 10:12:15 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:15 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
Loading checkpoint shards: 70%|███████████████████████████████████████████████████████████████████▉ | 7/10 [00:05<00:02, 1.39it/s]06/07/2024 10:12:16 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:06<00:00, 1.51it/s]
06/07/2024 10:12:17 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:17 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:17 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:17 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00, 1.42it/s]
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00, 1.36it/s]
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00, 1.35it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00, 1.34it/s]
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards: 90%|███████████████████████████████████████████████████████████████████████████████████████▎ | 9/10 [00:07<00:00, 1.19it/s]06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:18 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:08<00:00, 1.19it/s]
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:19 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:19 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:20 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:20 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
Loading checkpoint shards: 90%|███████████████████████████████████████████████████████████████████████████████████████▎ | 9/10 [00:09<00:01, 1.02s/it]06/07/2024 10:12:20 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:20 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:20 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:20 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:20 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:21 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:10<00:00, 1.05s/it]
06/07/2024 10:12:21 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:21 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:21 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:21 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/07/2024 10:12:22 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:22 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:22 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:22 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:23 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - ***** Running training *****
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Num examples = 16000
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Num Epochs = 3.0
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Instantaneous batch size per device = 1
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Total train batch size (w. parallel, buffer, distributed & accumulation) = 64
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Gradient Accumulation steps = 8
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Num optimization epochs per batch = 4
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Total training steps = 750
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Number of trainable parameters = 21180417
0%| | 0/750 [00:00<?, ?it/s]/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
scores_processed = torch.where(scores != scores, 0.0, scores)
/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
scores_processed = torch.where(scores != scores, 0.0, scores)
/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
scores_processed = torch.where(scores != scores, 0.0, scores)
/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
scores_processed = torch.where(scores != scores, 0.0, scores)
/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
scores_processed = torch.where(scores != scores, 0.0, scores)
/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
scores_processed = torch.where(scores != scores, 0.0, scores)
/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
scores_processed = torch.where(scores != scores, 0.0, scores)
/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
scores_processed = torch.where(scores != scores, 0.0, scores)
[rank1]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank0]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank7]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank2]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank6]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank3]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank4]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank5]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
0%| | 0/750 [00:14<?, ?it/s]
Traceback (most recent call last):
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
launch()
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
run_exp()
File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
rewards.append(values[i, end_index].float().detach().cpu()) # use fp32 type
IndexError: index 213 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
launch()
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
run_exp()
File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
rewards.append(values[i, end_index].float().detach().cpu()) # use fp32 type
IndexError: index 67 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
launch()
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
run_exp()
File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
rewards.append(values[i, end_index].float().detach().cpu()) # use fp32 type
IndexError: index 379 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
launch()
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
run_exp()
File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
Traceback (most recent call last):
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
launch()
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
return func(*args, **kwargs)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
run_exp()
File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
rewards.append(values[i, end_index].float().detach().cpu()) # use fp32 type
IndexError: index 390 is out of bounds for dimension 1 with size 1
ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
rewards.append(values[i, end_index].float().detach().cpu()) # use fp32 type
IndexError: index 408 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
launch()
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
run_exp()
File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
rewards.append(values[i, end_index].float().detach().cpu()) # use fp32 type
IndexError: index 499 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
launch()
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
run_exp()
File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
rewards.append(values[i, end_index].float().detach().cpu()) # use fp32 type
IndexError: index 501 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
launch()
File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
run_exp()
File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
rewards.append(values[i, end_index].float().detach().cpu()) # use fp32 type
IndexError: index 488 is out of bounds for dimension 1 with size 1
[2024-06-07 10:12:46,085] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227860 closing signal SIGTERM
[2024-06-07 10:12:46,085] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227861 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227862 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227863 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227864 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227865 closing signal SIGTERM
[2024-06-07 10:12:46,451] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 2227858) of binary: /data/anaconda3/envs/llama_factory/bin/python
Traceback (most recent call last):
File "/data/anaconda3/envs/llama_factory/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
return f(*args, **kwargs)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/run.py", line 812, in main
run(args)
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/run.py", line 803, in run
elastic_launch(
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 135, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/data/LLaMA-Factory/src/llamafactory/launcher.py FAILED
------------------------------------------------------------
Failures:
[1]:
time : 2024-06-07_10:12:46
host : localhost.localdomain
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 2227859)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2024-06-07_10:12:46
host : localhost.localdomain
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 2227858)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
```
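For whoever triages this: all eight ranks fail at the same place, `values[i, end_index]` in `get_rewards` (src/llamafactory/train/ppo/trainer.py, line 387), and the IndexError always complains about "dimension 1 with size 1", i.e. the value tensor coming back from the reward model does not have the expected (batch_size, seq_len) layout. Below is a minimal sketch of that shape mismatch and a defensive transpose; the shapes and the guard are hypothetical, this is not the actual LLaMA-Factory code.

```python
import torch

# Sketch of the failing pattern (hypothetical shapes, not LLaMA-Factory's code):
# get_rewards reads the value at the last response token and expects `values`
# to be (batch_size, seq_len), but the traceback shows dimension 1 has size 1,
# as if the tensor were laid out as (seq_len, batch_size) instead.
batch_size, seq_len = 1, 512
values = torch.randn(seq_len, batch_size)   # transposed relative to expectation

i, end_index = 0, 213                       # last non-pad position of sample i
# values[i, end_index]  ->  IndexError: index 213 is out of bounds for dimension 1 with size 1

# Defensive guard (assumption): flip the layout when it looks transposed.
if values.size(0) == seq_len and values.size(1) == batch_size:
    values = values.transpose(0, 1)

reward = values[i, end_index].float().detach().cpu()   # now indexes the sequence dim
print(reward)                                           # a scalar value tensor
```

Whether a transpose is the right fix depends on what the GLM-4 value head actually returns on the NPU backend; the sketch only illustrates why indices like 213, 67 or 379 overflow a dimension of size 1.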