
[RFC] Implement model-specific 4d parallelism

Open · yzhangcs opened this issue 9 months ago · 1 comment

Proposal

  • We want to add `apply_tp` and `apply_cp` functions for each model, since layer definitions vary from model to model (see the sketch below).

Also see comments in https://github.com/fla-org/flame/issues/4
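
As a rough illustration, here is a minimal sketch of what a model-specific `apply_tp` could look like, built on PyTorch's `DeviceMesh` / `parallelize_module` tensor-parallel API. The function name `apply_tp` comes from the proposal above; the `model.layers` attribute and the submodule names (`attn.q_proj`, `mlp.up_proj`, ...) are assumptions for illustration and will differ between FLA models, which is exactly why per-model functions are needed:

```python
import torch.nn as nn
from torch.distributed.device_mesh import DeviceMesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)


def apply_tp(model: nn.Module, tp_mesh: DeviceMesh) -> nn.Module:
    """Shard one model's linear layers across the tensor-parallel mesh.

    The per-module plan below is illustrative: the actual submodule
    names differ between FLA models, so each model would ship its own
    ``apply_tp`` with a plan matching its layer definitions.
    """
    for block in model.layers:  # assumed attribute; depends on the model
        layer_plan = {
            # shard projections column-wise on the way up and row-wise
            # on the way down, so each sub-layer needs one all-reduce
            "attn.q_proj": ColwiseParallel(),
            "attn.k_proj": ColwiseParallel(),
            "attn.v_proj": ColwiseParallel(),
            "attn.o_proj": RowwiseParallel(),
            "mlp.gate_proj": ColwiseParallel(),
            "mlp.up_proj": ColwiseParallel(),
            "mlp.down_proj": RowwiseParallel(),
        }
        parallelize_module(block, tp_mesh, layer_plan)
    return model
```

An `apply_cp` counterpart would follow the same shape: a per-model hook that knows how to split the sequence dimension for that model's attention variant.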

yzhangcs · Jan 28 '25