bytes-lost comments

Results: 5 comments of bytes-lost

Implemented LoRA fine-tuning for the baichuan-7B model

@hiyouga Did you not run into this error? ``` ./aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [124,0,0], thread: [51,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [124,0,0], thread: [52,0,0] Assertion...

Implemented LoRA fine-tuning for the baichuan-7B model

> @hiyouga ``` [INFO|trainer.py:622] 2023-06-15 17:12:03,926 >> Using cuda_amp half precision backend [INFO|trainer.py:1779] 2023-06-15 17:12:03,933 >> ***** Running training ***** [INFO|trainer.py:1780] 2023-06-15 17:12:03,934 >> Num examples = 48,329 [INFO|trainer.py:1781] 2023-06-15...

Implemented LoRA fine-tuning for the baichuan-7B model

@hiyouga I added one line here in train_sft.py, but I still get the same error:

```
model, tokenizer = load_pretrained(model_args, finetuning_args, training_args.do_train, stage="sft")
tokenizer.pad_token_id = 0  # specify pad_token_id
dataset = preprocess_data(dataset, tokenizer, data_args, training_args, stage="sft")
```
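
One way to narrow down an `index out of bounds` assertion like the one reported above is to check, before training, whether any token id in the preprocessed data falls outside the embedding table. What follows is only a minimal sketch in plain Python: `find_out_of_range_ids` and the sample values are hypothetical and not part of the repository, and an out-of-range id is just one possible cause of this assertion.

```python
from typing import Iterable, List, Tuple


def find_out_of_range_ids(
    sequences: Iterable[List[int]], vocab_size: int
) -> List[Tuple[int, int, int]]:
    """Return (sequence_index, position, token_id) for ids outside [0, vocab_size)."""
    bad = []
    for seq_idx, seq in enumerate(sequences):
        for pos, token_id in enumerate(seq):
            if token_id < 0 or token_id >= vocab_size:
                bad.append((seq_idx, pos, token_id))
    return bad


# Hypothetical data: a vocabulary of 10 tokens and one id (12) that exceeds it.
sample = [[1, 2, 3], [4, 12, 0]]
print(find_out_of_range_ids(sample, vocab_size=10))
```

It is meant for input ids rather than labels, since labels commonly use -100 as an ignore value and would be flagged as negative.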

Implemented LoRA fine-tuning for the baichuan-7B model

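> @bytes-lost It looks like something went wrong in torch's checkpointing process, which may be related to your local torch and CUDA environment. I have tested this several times on my side without any problem.

OK, I will recreate the environment and test again. Does torch 2.0.1 work?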

Question about the loss during the Chinese pre-training stage

> @Haijunlv How good is the generation quality of your final pre-trained model, given a loss of 1.95?