ru-dalle
Sparse attention support
Currently, the inference code materializes the full attention matrix and then masks it. Sparse attention implementations (e.g. block-sparse kernels written in Triton) are more efficient. Does the pre-training code support sparse attention, and will it ever be released?
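For context, here is a minimal NumPy sketch (not the actual ru-dalle code) of the "build the full matrix, then mask it" pattern the question refers to: the entire (seq, seq) score matrix is allocated even though masked positions contribute nothing, which is the cost a sparse kernel avoids.

```python
import numpy as np

def dense_masked_attention(q, k, v, mask):
    """Dense attention that materializes the full (seq, seq) score
    matrix and only afterwards zeroes out masked positions."""
    scores = q @ k.T / np.sqrt(q.shape[-1])        # full seq x seq matrix
    scores = np.where(mask, scores, -np.inf)        # mask after the fact
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Tiny causal example: position i may only attend to positions <= i.
seq, dim = 4, 8
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((seq, dim)) for _ in range(3))
causal = np.tril(np.ones((seq, seq), dtype=bool))
out = dense_masked_attention(q, k, v, causal)
print(out.shape)  # (4, 8)
```

A sparse implementation would instead skip the masked blocks entirely, never allocating or computing them, which is why block-sparse kernels (such as those written in Triton) save both memory and FLOPs on masks like this.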