mamba
Question about support for sequence parallelism
Hi,
I recently learned about the selective SSM architecture, and it is awesome! I have a question, though. The Transformer architecture supports sequence parallelism; does Mamba, as a potential alternative to the Transformer, support it as well?
In general, yes. Which flavor of sequence parallelism are you referring to? The one in Megatron-LM?
Thanks for your timely response! Yes, I am referring to the one in Megatron-LM. Does Mamba have built-in support for this kind of sequence parallelism, or do we need to implement it manually?
Nothing is built-in, but it'll be implemented in the future.
Got it. Thanks!