flashinfer
[Question] Overflow risks when batch size and sequence length grow extremely large
For example, with a batch size of 128 and a sequence length of around 32K, is there any possibility that FlashInfer's internal computation overflows?
I'm not aware of any such overflow issue.
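For anyone who wants to sanity-check the sizes mentioned in the question, here is a rough back-of-envelope sketch. It only does arithmetic on the shapes and says nothing about how FlashInfer actually indexes its buffers. `num_heads = 32` and `head_dim = 128` are assumed values that do not come from this thread, and `fits_int32` and `flat_element_count` are hypothetical helper names.

```python
INT32_MAX = 2**31 - 1


def fits_int32(value: int) -> bool:
    """Return True if `value` is representable as a signed 32-bit integer."""
    return -(2**31) <= value <= INT32_MAX


def flat_element_count(batch_size: int, seq_len: int,
                       num_heads: int, head_dim: int) -> int:
    """Number of elements in a dense [batch, seq, heads, dim] tensor."""
    return batch_size * seq_len * num_heads * head_dim


batch_size = 128        # from the question
seq_len = 32 * 1024     # "32K" from the question
num_heads = 32          # assumption, not from the thread
head_dim = 128          # assumption, not from the thread

total_tokens = batch_size * seq_len
total_elements = flat_element_count(batch_size, seq_len, num_heads, head_dim)

print("total tokens:  ", total_tokens, "fits int32:", fits_int32(total_tokens))
print("total elements:", total_elements, "fits int32:", fits_int32(total_elements))
```

Under these assumed shapes the token count stays well inside the 32-bit range, while a flat element offset over the whole tensor would not. Whether that matters depends on the index width a given kernel uses, which this sketch does not inspect.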
Closing based on @yzh119's answer.