xformers error: NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(8, 1024, 1, 64) (torch.float32)
     key         : shape=(8, 1024, 1, 64) (torch.float32)
     value       : shape=(8, 1024, 1, 64) (torch.float32)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    device=cpu (supported: {'cuda'})
`flshattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
`tritonflashattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 64
Hello, thank you very much for your work. After installing xformers, I get the error above. My server has an A800 GPU; I tried every version from 0.0.16 up to the latest, but none of them fixed it. Could you help me look into this problem, or tell me exactly which versions your environment uses? I have searched all over the Internet.
I also found that the error does not occur during training; it only appears during inference.
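For reference, the traceback itself lists why each backend was rejected: the inputs are CPU tensors in float32, while the fused kernels require CUDA (and flash attention additionally requires fp16/bf16). A minimal sketch, assuming a CUDA build of xformers is installed, that reproduces the rejection and the supported path:

```python
import torch
import xformers.ops as xops

# Same shapes as the traceback: [batch, seq_len, heads, head_dim],
# float32 on CPU -- every backend rejects this combination.
q = torch.randn(8, 1024, 1, 64)
k = torch.randn(8, 1024, 1, 64)
v = torch.randn(8, 1024, 1, 64)

# out = xops.memory_efficient_attention(q, k, v)  # raises NotImplementedError as above

# Supported path: CUDA device plus half precision.
q, k, v = (t.cuda().half() for t in (q, k, v))
out = xops.memory_efficient_attention(q, k, v)  # dispatches to cutlassF/flshattF
print(out.shape)  # torch.Size([8, 1024, 1, 64])
```

So if inference runs the attention call on CPU (or in float32) while training runs it on GPU in half precision, that would explain why only inference fails.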
Same issue here, but during training. How can I turn off xformers?
Uninstalling xformers works around the error, but then flash attention is not used.
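If the project wraps its model in a diffusers pipeline (an assumption; the repo is not named in this thread), xformers can be toggled at runtime instead of uninstalled. A minimal sketch; the checkpoint name is a placeholder:

```python
import torch
from diffusers import DiffusionPipeline

# "runwayml/stable-diffusion-v1-5" is a placeholder checkpoint,
# not the model from this thread.
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

if torch.cuda.is_available():
    # The xformers kernels only accept CUDA tensors, so enable them on GPU only.
    pipe.to("cuda")
    pipe.enable_xformers_memory_efficient_attention()
else:
    # Fall back to PyTorch's default attention without uninstalling xformers.
    pipe.disable_xformers_memory_efficient_attention()
```

This keeps flash attention available on GPU while avoiding the CPU code path that triggers the error.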
thanks!