Results: 2 issues of zzlol63
On Windows, PyTorch is not pre-compiled with FlashAttention support for `torch.nn.functional.scaled_dot_product_attention`, whereas the Linux builds are, which creates a performance gap between the two platforms. This...
### Describe your use-case.

The latest version of Diffusers supports configuring or selecting a specific attention backend, such as FlashAttention-2 or FlashAttention-3 (which supports the backward pass). OneTrainer could potentially...
enhancement