Fengyh

22 comments by Fengyh

Modified line 4 of tencentpretrain/utils/constants.py to: with open("models/llama_special_tokens_map.json", mode="r", encoding="utf-8") as f:
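For context, a minimal sketch of how that modified line typically sits in the surrounding code; only the open(...) call is quoted from the change above, while the json import, the json.load call, and the special_tokens_map name are assumptions:

```python
import json

# Load the LLaMA special tokens map referenced by the modified line.
# Only the open(...) line is from the change; the rest is illustrative.
with open("models/llama_special_tokens_map.json", mode="r", encoding="utf-8") as f:
    special_tokens_map = json.load(f)
```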

You can refer to: https://github.com/Tencent/TencentPretrain/blob/main/scripts/convert_tencentpretrain_to_llama.py

This looks like an environment issue. Which GPU are you using, what is your GPU driver version, and which version of bitsandbytes do you have installed? Please provide more information.
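In case it helps collect that information, here is a small Python sketch (it assumes torch and bitsandbytes are installed and a CUDA device is visible; the driver version itself is easiest to read from nvidia-smi):

```python
import torch
import bitsandbytes

print(torch.cuda.get_device_name(0))  # GPU model
print(torch.version.cuda)             # CUDA version PyTorch was built against
print(bitsandbytes.__version__)       # installed bitsandbytes version
```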

You can refer to: https://github.com/fengyh3/llama_inference

Which model are you using? Can you provide more details?

@LymanLiuChina Fixed. https://github.com/fengyh3/llama_inference/tree/main

Have you tried setting "gradient_accumulation_steps" to a value greater than 1 in your DeepSpeed config?
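As a rough illustration (placeholder batch sizes, not a recommendation), a DeepSpeed config with gradient accumulation enabled would contain something like the following, written here as a Python dict:

```python
# Effective batch = micro batch per GPU * gradient_accumulation_steps * number of GPUs.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "train_batch_size": 32,  # assumes a single GPU: 4 * 8 * 1
}
```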

可以参考一下这个推理哈~ https://github.com/fengyh3/llama_inference

Please refer to: https://github.com/Tencent/TencentPretrain/blob/main/models/llama/7b_config.json
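For orientation, the standard LLaMA-7B dimensions that config encodes are roughly the following; the key names below are illustrative only, so check the linked 7b_config.json for the exact schema TencentPretrain uses:

```python
# Widely published LLaMA-7B architecture sizes (illustrative key names).
llama_7b_dims = {
    "hidden_size": 4096,
    "layers_num": 32,
    "heads_num": 32,
    "feedforward_size": 11008,
    "max_seq_length": 2048,
}
```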

Can you share more details about your pretraining setup, for example the bash command you ran? It looks like there is a problem with your model config.