flashinfer
Feature/non contiguous kv cache
This PR resolves #506.
It adds custom strides so that a non-contiguous kv cache is supported.
Tests in test_batch_prefill_kernels.py and test_batch_decode_kernels.py are modified to test input kv_data on both contiguous and non-contiguous tensors.
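For reviewers unfamiliar with strided addressing, here is a minimal pure-Python sketch of why explicit strides are needed for a non-contiguous tensor. It does not reflect flashinfer's kernel code or its actual kv cache layout. The helper names (`contiguous_strides`, `flat_offset`, `is_contiguous`) and the 4-D shape are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def contiguous_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Row-major strides (in elements) of a densely packed tensor."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def flat_offset(index: Sequence[int], strides: Sequence[int]) -> int:
    """Offset of `index` in the flat buffer, given per-dimension strides."""
    return sum(i * s for i, s in zip(index, strides))


def is_contiguous(shape: Sequence[int], strides: Sequence[int]) -> bool:
    """True when the strides match the dense row-major layout for `shape`."""
    return tuple(strides) == contiguous_strides(shape)


# Hypothetical parent buffer with an illustrative 4-D shape.
parent_shape = (4, 2, 3, 8)
parent_strides = contiguous_strides(parent_shape)  # (48, 24, 8, 1)
buffer = list(range(4 * 2 * 3 * 8))

# A view that keeps only the first 4 entries of the last dimension.
# It shares the parent's buffer and strides, so it is non-contiguous.
view_shape = (4, 2, 3, 4)
view_strides = parent_strides

index = (1, 1, 2, 3)
correct = flat_offset(index, view_strides)                   # uses the real strides
assumed = flat_offset(index, contiguous_strides(view_shape)) # wrongly assumes dense layout

print(is_contiguous(view_shape, view_strides))  # False
print(correct, assumed)                         # 91 47
```

Code that derives offsets from the shape alone would read element 47 instead of element 91 here. Passing the strides explicitly avoids that, and testing both a contiguous and a non-contiguous input covers both addressing paths.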