Maximilian Heil

Results: 1 comment by Maximilian Heil

I still run into this issue, even with the tokenizer using pad_to_multiple_of=8. I could only work around it by switching from flash-attn to eager attention.
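For anyone who wants to try the same workaround, here is a rough sketch of what it might look like with Hugging Face transformers. This is an assumption about the setup, not the exact code from the original report: the model class (`AutoModelForCausalLM`), the helper names, and the `model_name` argument are all placeholders for illustration. It relies on the `attn_implementation` argument of `from_pretrained` and the `pad_to_multiple_of` argument of the tokenizer call.

```python
# Sketch of the workaround: pad to a multiple of 8 and load the model with
# eager attention instead of flash attention.
# Helper names and the choice of AutoModelForCausalLM are hypothetical.

ATTN_IMPLEMENTATION = "eager"  # instead of "flash_attention_2"


def load_model_and_tokenizer(model_name):
    """Load a model with eager attention; model_name is a placeholder."""
    # Imported lazily so the rest of the sketch works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        attn_implementation=ATTN_IMPLEMENTATION,
    )
    return model, tokenizer


def encode_batch(tokenizer, texts):
    """Tokenize a batch, padding sequence length up to a multiple of 8."""
    # Note: the tokenizer needs a pad token for padding=True to work.
    return tokenizer(
        texts,
        padding=True,
        pad_to_multiple_of=8,
        return_tensors="pt",
    )
```

In my case the padding alone was not enough, so the part that actually matters is `attn_implementation="eager"`; the padding helper is only included to show the full setup.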