Generation hangs when the output exceeds the length limit

Open EdmunddzzZ opened this issue 6 months ago • 0 comments

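I am calling the model through the Transformers library with max_new_tokens set to 20000. Once the generated length reaches 4096, this warning appears: This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (4096). Depending on the model, you may observe exceptions, performance degradation, or nothing at all. However, generation does not stop: the model keeps occupying the GPU and the process appears to hang.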
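A possible workaround, sketched here under assumptions rather than confirmed as a fix, is to clamp the generation budget so that the prompt plus the new tokens stays within the 4096-token limit named in the warning. The helper `safe_max_new_tokens` is a hypothetical name introduced for illustration, and the commented usage assumes the model's config exposes `max_position_embeddings`, which not every model does.

```python
def safe_max_new_tokens(prompt_len: int, requested: int, context_limit: int = 4096) -> int:
    """Clamp the requested number of new tokens so prompt + output fits the context limit.

    context_limit defaults to 4096, the value reported in the warning above.
    """
    remaining = context_limit - prompt_len
    if remaining <= 0:
        raise ValueError(
            f"Prompt length {prompt_len} already fills the context limit {context_limit}"
        )
    return min(requested, remaining)


# Hypothetical usage with Transformers (assumes `model` and tokenized `inputs` exist):
#
#   prompt_len = inputs["input_ids"].shape[1]
#   limit = model.config.max_position_embeddings  # assumption: attribute is present
#   budget = safe_max_new_tokens(prompt_len, 20000, limit)
#   outputs = model.generate(**inputs, max_new_tokens=budget)

print(safe_max_new_tokens(prompt_len=100, requested=20000))
```

With a 100-token prompt, the requested 20000 tokens would be reduced to 3996. This only avoids running past the limit; it does not explain why generation hangs instead of stopping.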

Jul 06 '25 13:07 EdmunddzzZ