mamba
Replace the attention module
Can I use a Mamba network as a drop-in replacement for `nn.MultiheadAttention`? If so, how should the existing call `self_attn(q, k, value=src, attn_mask=src_mask, key_padding_mask=src_key_padding_mask)` be modified?
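One way to do this is a thin shim that exposes `nn.MultiheadAttention`'s call signature but mixes tokens with a Mamba-style sequence model internally, so the surrounding `TransformerEncoderLayer`-style code does not need to change. The sketch below is a hypothetical wiring, not the mamba library's API: in real use the `mixer` argument would be `mamba_ssm.Mamba(d_model=embed_dim)`; here a `nn.GRU` stands in so the example runs without `mamba_ssm` installed. Note the semantic differences: Mamba scans one sequence, so `query`/`key` are ignored and only `value` (the layer input `src`) is mixed; `attn_mask` is ignored because the scan is already causal; and padded positions are simply zeroed.

```python
import torch
import torch.nn as nn

class MambaAttnShim(nn.Module):
    """Shim matching nn.MultiheadAttention's call signature, delegating
    to a sequence mixer. Hypothetical sketch: in practice pass
    mixer=mamba_ssm.Mamba(d_model=embed_dim); a GRU stands in here."""

    def __init__(self, embed_dim, mixer=None, batch_first=False):
        super().__init__()
        self.batch_first = batch_first
        # Real use: from mamba_ssm import Mamba; mixer = Mamba(d_model=embed_dim)
        self.mixer = mixer if mixer is not None else nn.GRU(
            embed_dim, embed_dim, batch_first=True)

    def forward(self, query, key, value, attn_mask=None,
                key_padding_mask=None, need_weights=False):
        # Mamba mixes ONE sequence: query/key are ignored, only `value`
        # (the layer input src) is used. attn_mask is dropped because the
        # scan is already causal.
        x = value if self.batch_first else value.transpose(0, 1)  # -> (B, L, D)
        if key_padding_mask is not None:
            # Zero out padded positions (True = padded, as in MultiheadAttention).
            x = x.masked_fill(key_padding_mask.unsqueeze(-1), 0.0)
        out = self.mixer(x)
        if isinstance(out, tuple):      # GRU returns (output, h_n); Mamba returns a tensor
            out = out[0]
        if not self.batch_first:
            out = out.transpose(0, 1)   # back to (L, B, D)
        return out, None                # mimic MultiheadAttention's (output, attn_weights)

# Demo with MultiheadAttention's default (L, B, D) layout:
src = torch.randn(10, 2, 16)
shim = MambaAttnShim(16)
out, _ = shim(src, src, value=src)
print(out.shape)  # torch.Size([10, 2, 16])
```

With this in place, replacing `self.self_attn = nn.MultiheadAttention(d_model, nhead)` with `self.self_attn = MambaAttnShim(d_model)` lets the original call `self.self_attn(q, k, value=src, attn_mask=src_mask, key_padding_mask=src_key_padding_mask)[0]` run unchanged, though the mask arguments become no-ops as described above.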