Fovercon

Results: 1 issue by Fovercon

I'm working on 32k long-text SFT for Qwen2 72B. When I set **seq_parallel_world_size** to a value greater than one and **use_varlen_attn** to true, an error occurs. After checking, the error...