dan_the_3rd
Unfortunately I didn't manage to repro (Linux, Python 3.10, torch installed via pip for cu124). Not sure what the difference in setup is there...
Hi, just wanted to follow up on this. This is blocking the build of some components of xFormers on Windows - is there a way to do a workaround in the...
Whoops, that's a typo. We removed conda binaries for 3.9, and added 3.11 instead
cc @bottler @sgrigory maybe this initialization can be done lazily?
As a workaround, and if you don't need any triton kernel from xformers, you can try setting this env variable: `XFORMERS_FORCE_DISABLE_TRITON=1`
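Since this variable is checked at import time, it has to be set before xFormers is first imported. A minimal sketch (the `xformers` import itself is left commented out, since whether it succeeds depends on your installation):

```python
import os

# Workaround: disable xFormers' Triton kernels (only safe if you don't need
# any Triton kernel from xFormers). Must be set BEFORE importing xformers.
os.environ["XFORMERS_FORCE_DISABLE_TRITON"] = "1"

# import xformers.ops  # the import now skips Triton initialization

print(os.environ["XFORMERS_FORCE_DISABLE_TRITON"])
```

Setting it in the shell before launching Python (`XFORMERS_FORCE_DISABLE_TRITON=1 python train.py`) works equally well.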
Hi, We don't have documentation for this - as we consider these backends internal details that we would rather not expose publicly (because they can change). But as of today:...
What GPU are you using? Flash-Decoding is supported on `xformers.ops.fmha.flash.FwOp` and [`split_k`](https://github.com/facebookresearch/xformers/blob/main/xformers/ops/fmha/triton_splitk.py), and both require an A100 or newer GPU
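A quick way to sanity-check this without importing xFormers or torch: the "A100 or newer" requirement corresponds (on my reading) to CUDA compute capability sm_80 and up. The helper name and the `>= 8` threshold below are my own illustration, not an xFormers API; you can get your GPU's compute capability from `nvidia-smi`.

```python
# Hedged sketch: map a CUDA compute capability string (e.g. "8.0" for A100,
# "7.5" for T4) to whether the Flash-Decoding backends should be usable.
def supports_flash_decoding(compute_capability: str) -> bool:
    major, _, _ = compute_capability.partition(".")
    return int(major) >= 8  # A100 is sm_80; older architectures are excluded

print(supports_flash_decoding("8.0"))  # A100 -> True
print(supports_flash_decoding("7.5"))  # T4   -> False
```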
They are indeed the same algorithm mathematically (the same mathematical operations), but the way work is parallelized and scheduled is a bit different. Plus the implementation details matter a...