xtuner
llama3-8b: RuntimeError: CUDA error: device-side assert triggered when training with an extended vocabulary
I extended the vocabulary used to train llama3 and updated the corresponding tokenizer and model sections of the config, but training fails with RuntimeError: CUDA error: device-side assert triggered. See the screenshot for the full error.

tokenizer = dict(
    type=AutoTokenizer.from_pretrained,
    pretrained_model_name_or_path=new_tokens,
    trust_remote_code=True,
    padding_side='right')

model = dict(
    type=SupervisedFinetune,
    use_varlen_attn=use_varlen_attn,
    llm=dict(
        type=AutoModelForCausalLM.from_pretrained,
        pretrained_model_name_or_path=pretrained_model_name_or_path,
        trust_remote_code=True,
        torch_dtype=torch.float16),
    tokenizer=tokenizer)
The log also shows that the model's embedding was resized (Resized token embeddings from 128256 to 129282.).
How can I solve this problem?
Solved: the error only occurs when training is launched with ZeRO-3; ZeRO-2 works fine. Could the maintainers fix this?
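
For anyone hitting the same assert: a device-side assert during an embedding lookup is commonly (though not always) caused by a token id that falls outside the embedding table. Whether that is what happens under ZeRO-3 here is an assumption and is not confirmed in this thread. The sketch below is a plain-Python check for such ids; the helper name find_out_of_range_ids and the sample ids are made up for illustration, and only the two vocabulary sizes come from the log above.

```python
from typing import List, Sequence, Tuple


def find_out_of_range_ids(
    batches: Sequence[Sequence[int]], num_embeddings: int
) -> List[Tuple[int, int, int]]:
    """Return (batch_index, position, token_id) for every id outside [0, num_embeddings)."""
    bad = []
    for batch_index, ids in enumerate(batches):
        for position, token_id in enumerate(ids):
            if token_id < 0 or token_id >= num_embeddings:
                bad.append((batch_index, position, token_id))
    return bad


# Sizes taken from the log line: 128256 rows before the resize, 129282 after.
OLD_VOCAB, NEW_VOCAB = 128256, 129282

# Hypothetical token ids; the second sequence uses ids from the added vocabulary range.
sample = [[1, 5, 128255], [7, 128300, 129281]]

# Against the resized table every id is valid.
print(find_out_of_range_ids(sample, NEW_VOCAB))
# Against the original table the newly added ids would be out of range.
print(find_out_of_range_ids(sample, OLD_VOCAB))
```

In a real run, the sequences would be the input_ids of a few batches from the dataloader, and num_embeddings would be the number of rows the model actually uses at lookup time. Under ZeRO-3 the weights are partitioned across ranks, so the shape visible locally may not reflect the full table, which makes this comparison worth doing explicitly.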