flash-linear-attention
[RFC] Implement model-specific 4d parallelism
Proposal
- We want to add `apply_tp` and `apply_cp` functions for each model, since their layer definitions can vary (see the sketch below).
Also see comments in https://github.com/fla-org/flame/issues/4
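
For illustration, here is a minimal sketch of what a per-model `apply_tp` hook could look like on top of PyTorch's DTensor tensor-parallel API (`parallelize_module` with `ColwiseParallel`/`RowwiseParallel`). All submodule paths (`model.model.layers`, `attn.q_proj`, `mlp.up_proj`, ...) are hypothetical placeholders; the concrete plan is exactly what varies per model and is why each model needs its own function:

```python
import torch.nn as nn
from torch.distributed.device_mesh import DeviceMesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)


def apply_tp(model: nn.Module, tp_mesh: DeviceMesh) -> nn.Module:
    """Shard one hypothetical model over the TP mesh.

    NOTE: the submodule paths below are placeholders; each model
    would ship its own plan, which is the point of this RFC.
    """
    for block in model.model.layers:  # hypothetical layer container
        plan = {
            # input projections are sharded column-wise ...
            "attn.q_proj": ColwiseParallel(),
            "attn.k_proj": ColwiseParallel(),
            "attn.v_proj": ColwiseParallel(),
            # ... and output projections row-wise, so each block
            # needs only a single all-reduce on the way out
            "attn.o_proj": RowwiseParallel(),
            "mlp.gate_proj": ColwiseParallel(),
            "mlp.up_proj": ColwiseParallel(),
            "mlp.down_proj": RowwiseParallel(),
        }
        parallelize_module(block, tp_mesh, plan)
    return model
```

An `apply_cp` counterpart would analogously declare how each model shards the sequence dimension for context parallelism; those hooks depend even more heavily on each model's attention or recurrence implementation.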