mamba
Replace mamba1 with mamba2 and training becomes very slow!
@torch.compile(options={"triton.cudagraphs": True}, fullgraph=True) generates an error. Is there any other way?
If you use a large model, the Triton overhead will be negligible.
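To make that point concrete, here is a rough back-of-the-envelope sketch. It assumes the overhead is a fixed per-call cost that does not grow with model size; the function name and all the numbers are hypothetical placeholders for illustration, not measurements of mamba or Triton.

```python
def overhead_fraction(launch_overhead_us: float, kernel_time_us: float) -> float:
    """Share of one call's wall time spent on fixed launch overhead."""
    return launch_overhead_us / (launch_overhead_us + kernel_time_us)


# Hypothetical values chosen only to show the trend, not real timings.
LAUNCH_OVERHEAD_US = 20.0      # assumed fixed cost per kernel call
SMALL_MODEL_KERNEL_US = 30.0   # assumed compute time per call, small model
LARGE_MODEL_KERNEL_US = 3000.0 # assumed compute time per call, large model

small = overhead_fraction(LAUNCH_OVERHEAD_US, SMALL_MODEL_KERNEL_US)
large = overhead_fraction(LAUNCH_OVERHEAD_US, LARGE_MODEL_KERNEL_US)

print(f"small model: {small:.1%} of each call is overhead")
print(f"large model: {large:.1%} of each call is overhead")
```

With a fixed cost per call, the overhead share shrinks as the compute per call grows, so profiling a tiny model can make the slowdown look worse than it would be at scale.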
I ran into some problems when I chose to use mamba2 instead of mamba.
Does it mean that I should change the MambaConfig?