Yuchen Zeng
I also came across the same issue :/
Actually, I figured it out. This issue is caused by an update to the `peft` package. The results should be reasonable if you downgrade `peft` from 0.7.1 to 0.6.2.
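For reference, a minimal sketch to confirm the environment is on the older release mentioned above (assuming a pip-based install; the version string is just the one from this comment):

```python
# Downgrade first, e.g.:  pip install peft==0.6.2
# Then verify the installed version before rerunning the experiment.
import peft

assert peft.__version__ == "0.6.2", f"unexpected peft version: {peft.__version__}"
```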
Same here! Did you figure it out?
Same here!
Thanks! Another quick question: is there anywhere I can directly use plain flash linear attention with Triton, without adding the forget gate and the chunkwise form?
Thanks so much for your quick response!
In this case, there are only the chunk, fused_chunk, and recurrent modes, right? In the figure below (from the GLA paper), there is a green line that does not use the chunkwise parallel form at all...
After decreasing the `headdim`, I also encountered another issue. Here is the error message I received:
```
File /data/yzeng58/anaconda3/envs/mamba2/lib/python3.10/site-packages/mamba_ssm/ops/triton/ssd_combined.py:761, in MambaSplitConv1dScanCombinedFn.forward(ctx, zxbcdt, conv1d_weight, conv1d_bias, dt_bias, A, D, chunk_size, initial_states, seq_idx,...
```