torchchat
Slimming down torchchat: Replace replace_attention_with_custom_sdpa_attention() with ET's implementation
🚀 The feature, motivation and pitch
First surfaced in https://github.com/pytorch/torchchat/pull/1057: the replace_attention_with_custom_sdpa_attention() function, used when exporting models in torchchat, can be replaced with the equivalent API provided in ExecuTorch: https://github.com/pytorch/executorch/blob/main/examples/models/llama2/source_transformation/sdpa.py
Task: Replace the torchchat implementation with ExecuTorch's, then delete the now-defunct code from torchchat.
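One way to stage the swap is a small resolver that prefers the ExecuTorch function when it is importable and otherwise returns nothing, so the export path can keep using the existing torchchat function until the old code is deleted. This is only a sketch: `resolve_sdpa_replacer` and `ET_SDPA_CANDIDATES` are hypothetical names, and the ExecuTorch function name `replace_sdpa_with_custom_op` is an assumption that should be checked against the linked sdpa.py.

```python
import importlib
from typing import Callable, Iterable, Optional, Tuple

# ASSUMPTION: module path derived from the file linked above; the function name
# replace_sdpa_with_custom_op is a guess and must be verified in that file.
ET_SDPA_CANDIDATES = [
    (
        "executorch.examples.models.llama2.source_transformation.sdpa",
        "replace_sdpa_with_custom_op",
    ),
]


def resolve_sdpa_replacer(
    candidates: Iterable[Tuple[str, str]],
) -> Optional[Callable]:
    """Return the first importable callable among (module_path, attr) pairs.

    Returns None when no candidate can be imported, so the caller can fall
    back to its existing implementation.
    """
    for module_path, attr in candidates:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            # Package not installed or module moved; try the next candidate.
            continue
        fn = getattr(module, attr, None)
        if callable(fn):
            return fn
    return None
```

At the export call site, `resolve_sdpa_replacer(ET_SDPA_CANDIDATES)` would be tried first, and the torchchat function used only when it returns `None`. Once the ExecuTorch path is confirmed, the fallback and the torchchat function can be removed together.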
Alternatives
No response
Additional context
No response
RFC (Optional)
No response