Support pure pytorch implementation for memory_efficient_attention
🚀 Feature
I found that the memory_efficient_attention op does not have a pure PyTorch implementation (i.e. one that works without device-specific ops, external libraries, or Cython). The current implementations fail to dispatch on the CPU device.
In the spirit of torch 2.0, we should have a Python implementation for most ops.
Something similar to: https://github.com/lucidrains/memory-efficient-attention-pytorch
A pure PyTorch implementation would be useful for testing, benchmarking, and generic device support.
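For reference, here is a minimal sketch of what such a fallback could look like, using the chunked online-softmax approach from the linked repo so the full attention matrix is never materialized. The function name `chunked_attention` and the chunk size are illustrative, not part of the xformers API:

```python
import torch

def chunked_attention(q, k, v, chunk_size=1024):
    """Memory-efficient attention in pure PyTorch (illustrative sketch).

    Processes keys/values in chunks of `chunk_size`, keeping running
    log-sum-exp statistics so the full (Lq x Lk) score matrix is never
    materialized. Shapes: q (B, Lq, D), k and v (B, Lk, D).
    """
    scale = q.shape[-1] ** -0.5
    out = torch.zeros_like(q)
    # Running log-sum-exp of scores seen so far, per query position.
    lse = torch.full(q.shape[:-1], float("-inf"),
                     dtype=q.dtype, device=q.device)
    for start in range(0, k.shape[1], chunk_size):
        k_c = k[:, start:start + chunk_size]
        v_c = v[:, start:start + chunk_size]
        scores = torch.einsum("bqd,bkd->bqk", q, k_c) * scale
        chunk_lse = torch.logsumexp(scores, dim=-1)
        new_lse = torch.logaddexp(lse, chunk_lse)
        # Rescale the previous accumulator, then add this chunk's
        # softmax-weighted values.
        out = out * torch.exp(lse - new_lse).unsqueeze(-1) + torch.einsum(
            "bqk,bkd->bqd", torch.exp(scores - new_lse.unsqueeze(-1)), v_c
        )
        lse = new_lse
    return out
```

This runs anywhere PyTorch does (including CPU), and for small inputs it matches `softmax(q @ k.T * scale) @ v` up to floating-point error, which is exactly what makes it handy as a testing and benchmarking reference.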