Fengyh
Modified `tencentpretrain/utils/constants.py` line 4 to: `with open("models/llama_special_tokens_map.json", mode="r", encoding="utf-8") as f:`
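For context, a minimal sketch of what that modified line does, assuming the JSON file is a simple special-tokens map (the sample keys and values below are illustrative, not taken from the repository):

```python
import json
import os
import tempfile

# Hypothetical minimal special-tokens map, mirroring the shape that
# models/llama_special_tokens_map.json is assumed to have.
sample_map = {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>"}

path = os.path.join(tempfile.mkdtemp(), "llama_special_tokens_map.json")
with open(path, mode="w", encoding="utf-8") as f:
    json.dump(sample_map, f, ensure_ascii=False)

# The modified line opens the map with an explicit UTF-8 encoding,
# so non-ASCII token strings load correctly regardless of the OS locale.
with open(path, mode="r", encoding="utf-8") as f:
    special_tokens_map = json.load(f)

print(special_tokens_map["bos_token"])
```

Passing `encoding="utf-8"` explicitly avoids platform-dependent default encodings (e.g. GBK on Chinese-locale Windows), which would otherwise break loading.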
You can refer to: https://github.com/Tencent/TencentPretrain/blob/main/scripts/convert_tencentpretrain_to_llama.py
This looks like an environment issue. Which GPU are you using, which GPU driver version, and which bitsandbytes version? Please provide more information.
You can refer to: https://github.com/fengyh3/llama_inference
Which model are you using? Could you provide more details?
@LymanLiuChina Fixed: https://github.com/fengyh3/llama_inference/tree/main
Have you tried setting "gradient_accumulation_steps" greater than 1 in the DeepSpeed config?
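For reference, a minimal sketch of the relevant DeepSpeed config fields (the values below are illustrative, not a recommendation for this issue):

```json
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 8,
  "train_batch_size": 32
}
```

With gradient accumulation, DeepSpeed requires `train_batch_size = train_micro_batch_size_per_gpu × gradient_accumulation_steps × number of GPUs` (here 4 × 8 × 1 = 32); a larger `gradient_accumulation_steps` keeps the per-step GPU memory footprint small while preserving the effective batch size.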
You can refer to this project for inference: https://github.com/fengyh3/llama_inference
Please refer to: https://github.com/Tencent/TencentPretrain/blob/main/models/llama/7b_config.json
Can you share more details about your pretraining, for example the bash command you ran? It seems your model config has some problems.