
Slimming down torchchat: Replace replace_attention_with_custom_sdpa_attention() with ET's implementation

Open Jack-Khuu opened this issue 6 months ago • 0 comments

🚀 The feature, motivation and pitch

First surfaced in https://github.com/pytorch/torchchat/pull/1057: the replace_attention_with_custom_sdpa_attention function, which torchchat uses when exporting models, can be replaced with the equivalent API provided by ExecuTorch in https://github.com/pytorch/executorch/blob/main/examples/models/llama2/source_transformation/sdpa.py

Task: Swap the torchchat implementation for ExecuTorch's, then delete the now-defunct code from torchchat.
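
For anyone picking this up, a source transformation of this kind usually boils down to a recursive module swap. The sketch below illustrates only that pattern and is not the torchchat or ExecuTorch code: `Node`, `Attention`, `CustomSDPAAttention`, `Block`, `Model`, and `replace_children` are all hypothetical names, and `Node` is a minimal stand-in so the example runs without PyTorch. The helper relies only on `named_children()` and `setattr`, which `torch.nn.Module` also provides.

```python
class Node:
    """Minimal stand-in for torch.nn.Module (assumption, for illustration only)."""

    def named_children(self):
        # Materialize the list so attributes can be reassigned while looping.
        return [(name, value) for name, value in vars(self).items()
                if isinstance(value, Node)]


class Attention(Node):
    """Hypothetical original attention block."""

    def __init__(self, n_heads):
        self.n_heads = n_heads


class CustomSDPAAttention(Node):
    """Hypothetical replacement that reuses the original block's settings."""

    def __init__(self, original):
        self.n_heads = original.n_heads


class Block(Node):
    def __init__(self, n_heads):
        self.attention = Attention(n_heads)


class Model(Node):
    def __init__(self):
        self.layer0 = Block(4)
        self.layer1 = Block(8)


def replace_children(module, target_type, make_replacement):
    """Recursively swap every child of target_type; return the number of swaps."""
    swapped = 0
    for name, child in module.named_children():
        if isinstance(child, target_type):
            setattr(module, name, make_replacement(child))
            swapped += 1
        else:
            swapped += replace_children(child, target_type, make_replacement)
    return swapped


model = Model()
num_swapped = replace_children(model, Attention, CustomSDPAAttention)
```

If the ExecuTorch helper follows the same shape, the swap on the torchchat side would mostly mean changing the import at the export call site and deleting the local copy, but that should be confirmed against the linked sdpa.py.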

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

Jack-Khuu · Aug 23 '24 23:08