dan_the_3rd

Results 235 comments of dan_the_3rd

This is something that will eventually come into xFormers, but not in the very short term. Also I'm curious if you have any data to share regarding numerics for fp8...

Hi, We don't have builds for AMD at the moment. Might come in the future, but can't promise anything (or any ETA) at this point.

Hi, We're following closely what's happening. The current implementation has some bugs which we reported, but when it's ready it will be integrated :) https://github.com/Dao-AILab/flash-attention/issues/1052

Indeed :) We're working on this; hopefully we can land it in xFormers within the next week as an experimental feature

This is taking a bit more time than expected. Hopefully we will have it by next week, but we're not sure.

Hi, Thanks for reporting this bug. Which attention bias are you using? (`type(attn_bias)`) There are 2 solutions to unblock you immediately: (1) Set the correct cuda device for each rank...

> maybe I can set the default cuda device

Yes, this would most likely solve your issue
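For reference, here is a minimal sketch of solution (1), pinning each distributed rank to its own GPU before any attention calls. The helper name `pin_rank_to_device` and the use of the `LOCAL_RANK` environment variable are illustrative assumptions (torchrun sets `LOCAL_RANK`, but your launcher may differ); the key call is `torch.cuda.set_device`, which sets the default CUDA device for the process.

```python
# Hypothetical sketch: set the correct CUDA device for each rank.
# `local_rank` would normally come from your launcher (e.g. torchrun's
# LOCAL_RANK env var); falls back to CPU when no GPU is present.
import os

import torch


def pin_rank_to_device(local_rank: int) -> torch.device:
    """Select and set the default CUDA device for this rank."""
    if torch.cuda.is_available():
        device = torch.device(f"cuda:{local_rank}")
        torch.cuda.set_device(device)  # default device for this process
    else:
        device = torch.device("cpu")
    return device


if __name__ == "__main__":
    rank = int(os.environ.get("LOCAL_RANK", "0"))
    print(pin_rank_to_device(rank))
```

Calling this once per process, before creating tensors or running attention, avoids kernels being launched on the wrong device.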

Hi, What GPU do you have? How did you build xFormers? And what is your benchmark for measuring speed? Some of the components from xFormers have been integrated in PyTorch,...

So for the Triton kernels, we haven't changed them in a while, and they are available as part of PyTorch now. You will get exactly the same speed/result with PyTorch's...

Updating this - we added support for Flash3 by default in xFormers. This is not yet supported in PyTorch's `scaled_dot_product_attention`, so we expect xFormers to be quite a bit faster on H100s,...
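Since both backends compute the same attention, a quick numerical sanity check is easy to sketch. The snippet below is an illustrative, CPU-friendly comparison of PyTorch's fused `scaled_dot_product_attention` against a naive reference; it does not exercise the Flash3 path itself (that dispatch happens inside xFormers on supported GPUs), and `naive_attention` is a hypothetical helper, not part of either library.

```python
# Sketch: verify a fused attention kernel against a naive reference.
# On H100s, xFormers' memory_efficient_attention can dispatch to Flash3;
# this sketch only checks numerics with PyTorch's built-in fused SDPA.
import math

import torch
import torch.nn.functional as F


def naive_attention(q, k, v):
    # (batch, heads, seq, dim): softmax(q @ k^T / sqrt(dim)) @ v
    scale = 1.0 / math.sqrt(q.shape[-1])
    attn = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    return attn @ v


torch.manual_seed(0)
q, k, v = (torch.randn(2, 4, 16, 8) for _ in range(3))
fused = F.scaled_dot_product_attention(q, k, v)
assert torch.allclose(fused, naive_attention(q, k, v), atol=1e-5)
```

The same check works with `xformers.ops.memory_efficient_attention(q, k, v)` swapped in for the fused call, modulo that kernel's expected tensor layout.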