EdmunddzzZ
Ovis2.5 is awesome! But in [modeling_ovis2_5.py](https://huggingface.co/AIDC-AI/Ovis2.5-2B/blob/main/modeling_ovis2_5.py), line 246 calls ` attn_output = flash_attn_varlen_func(queries, keys, values, cu_seqlens, cu_seqlens, max_seqlen, max_seqlen).reshape(seq_length, -1) ` I made several tries and couldn't find a good way...
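The post is cut off, but if the difficulty is getting past that line on a setup without flash-attn installed, one possible workaround is to unpack the varlen layout and fall back to PyTorch's `scaled_dot_product_attention`. A minimal sketch only: `sdpa_varlen_fallback` is a hypothetical helper, and it assumes `queries`/`keys`/`values` are packed as `(total_tokens, num_heads, head_dim)` with `cu_seqlens` holding cumulative sequence boundaries, as `flash_attn_varlen_func` expects.

```python
import torch
import torch.nn.functional as F

def sdpa_varlen_fallback(queries, keys, values, cu_seqlens):
    # Hypothetical replacement for flash_attn_varlen_func: iterate over the
    # packed sequences delimited by cu_seqlens and run SDPA on each one.
    outputs = []
    for start, end in zip(cu_seqlens[:-1].tolist(), cu_seqlens[1:].tolist()):
        # (seq_i, heads, dim) -> (1, heads, seq_i, dim) for SDPA
        q = queries[start:end].transpose(0, 1).unsqueeze(0)
        k = keys[start:end].transpose(0, 1).unsqueeze(0)
        v = values[start:end].transpose(0, 1).unsqueeze(0)
        out = F.scaled_dot_product_attention(q, k, v)  # non-causal, like the original call
        # back to the packed layout (seq_i, heads, dim)
        outputs.append(out.squeeze(0).transpose(0, 1))
    return torch.cat(outputs, dim=0)

# then, at line 246 (hypothetical drop-in):
# attn_output = sdpa_varlen_fallback(queries, keys, values, cu_seqlens).reshape(seq_length, -1)
```

This trades the fused varlen kernel for a Python loop, so it will be slower, but it avoids the hard dependency on flash-attn.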
When calling the model with the Transformers library and setting max_new_tokens to 20000, the following warning appears once the generated length reaches 4096: This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (4096). Depending on the model, you may observe exceptions, performance degradation, or...
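As far as I understand, that warning comes from `generate()` comparing the requested length against the model config's `max_position_embeddings`; it does not stop generation, but quality may degrade beyond that point. A minimal sketch of capping the request to the declared context window, where `model` and `inputs` are placeholders and the config attribute path may differ for Ovis2.5 (the limit could sit on a nested LLM sub-config):

```python
# Sketch only: keep prompt length + new tokens within the declared context window.
max_ctx = getattr(model.config, "max_position_embeddings", 4096)  # assumed source of the 4096
prompt_len = inputs["input_ids"].shape[-1]
remaining = max(0, max_ctx - prompt_len)

outputs = model.generate(
    **inputs,
    max_new_tokens=min(20000, remaining),  # avoid exceeding the predefined maximum length
)
```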