exllama
Continuous Batching support
vLLM and Hugging Face's TGI (Text Generation Inference) already support this.
Additional Context: https://github.com/turboderp/exllama/issues/150#issuecomment-1633417028
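
To make the request concrete, here is a rough sketch of what continuous batching means at the scheduler level. This is a toy simulation only, not exllama, vLLM, or TGI code: `Request`, `continuous_batching`, and `static_batching` are hypothetical names, one "step" stands in for one decode forward pass, and prefill and KV-cache management are ignored.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    tokens_left: int  # tokens still to generate (stand-in for reaching EOS)


def continuous_batching(lengths, max_batch):
    """Toy scheduler: return (total_steps, finish_step_per_request)."""
    waiting = deque(Request(f"r{i}", n) for i, n in enumerate(lengths))
    running = []
    finished_at = {}
    step = 0
    while waiting or running:
        # Refill free slots before every decode step,
        # not only once the whole batch has drained.
        while waiting and len(running) < max_batch:
            running.append(waiting.popleft())
        step += 1
        for req in running:
            req.tokens_left -= 1  # placeholder for generating one token
            if req.tokens_left <= 0:
                finished_at[req.name] = step
        # Finished requests leave immediately and free their slot.
        running = [r for r in running if r.tokens_left > 0]
    return step, finished_at


def static_batching(lengths, max_batch):
    """Baseline: each batch runs until its longest request finishes."""
    steps = 0
    for i in range(0, len(lengths), max_batch):
        steps += max(lengths[i:i + max_batch])
    return steps


if __name__ == "__main__":
    lengths = [2, 8, 3, 4]  # made-up generation lengths
    print(continuous_batching(lengths, max_batch=2))
    print(static_batching(lengths, max_batch=2))
```

With these made-up lengths and two slots, the toy continuous scheduler needs 9 decode steps while the static baseline needs 12, because short requests no longer wait for the longest one in their batch.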