Driss Guessous

125 comments by Driss Guessous

Can you highlight the coverage and performance differences between this and the AOTriton-based version?

It seems as though the CK backend is universally better than the AOTriton version. Can we just fully replace the AOTriton implementation with CK so that we don't have to...

cc @eellison, who I know was investigating ways to codegen this in Inductor, but I think there is a valid argument for a top-level grouped_gemm func in PyTorch

@pytorchbot revert "See https://github.com/pytorch/pytorch/issues/135126 for more details"

@pytorchbot revert -m "See https://github.com/pytorch/pytorch/issues/135126 for more details"

@pytorchbot revert -m "See https://github.com/pytorch/pytorch/issues/135126 for more details" -c weird

TBH I am hesitant to land this in core since the blast radius is quite high and the usage is not generic for all users, e.g. int8 seems to only...

Let's first land the cpu/gpu variant in torchao. I think that it will have better discoverability since the TorchAO project is targeted at quantization and kernel-specific performance optimizations. As well we...

Hey, sorry, I thought I responded. Is the expectation for this op that there will be no user-side code changes and that this will be a pure pattern...