Yuchen Zeng
I also came across the same issue :/
Actually, I figured it out. This issue is caused by an update to the `peft` package. The results should be reasonable if you downgrade `peft` from 0.7.1 to 0.6.2.
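For reference, a minimal sketch to confirm the environment is on the older release mentioned above (assuming a pip-based install; the version string is just the one from this comment):

```python
# Downgrade first, e.g.:  pip install peft==0.6.2
# Then verify the installed version before rerunning the experiment.
import peft

assert peft.__version__ == "0.6.2", f"unexpected peft version: {peft.__version__}"
```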
Same here! Did you figure it out?
Same here!
Thanks! Another quick question: is there anywhere I can directly use plain flash linear attention with Triton, without adding the forget gate and the chunkwise form?
Thanks so much for your quick response!
In this case, there are only the chunk, fused_chunk, and recurrent modes, right? In the figure below (from the GLA paper), there is a green line that does not use the chunkwise parallel form at all...
After decreasing the `headdim`, I also encountered another issue. Here is the error message I received:
```
File /data/yzeng58/anaconda3/envs/mamba2/lib/python3.10/site-packages/mamba_ssm/ops/triton/ssd_combined.py:761, in MambaSplitConv1dScanCombinedFn.forward(ctx, zxbcdt, conv1d_weight, conv1d_bias, dt_bias, A, D, chunk_size, initial_states, seq_idx,...
```