Driss Guessous
I tried setting

```python
# out = F.scaled_dot_product_attention(q, k, v)
out = q
```

and I am still getting the error on last night's nightly, so I think that this...
Btw, I recently pinpointed an issue for efficient attention: #138772
I found it to only occur for efficient attention, but it is worth trying on your end to see if this setting has an effect: https://github.com/pytorch/pytorch/blob/0a38c0ec89ab393be91619393063ea30908a5d55/torch/_inductor/config.py#L398
We do emit warning information for SDPA explaining why kernels can't be chosen; I'm not sure the current CP implementation is passing that up. If you wanted to look at...
@pytorchbot merge