Add support for BLOOM
Closes #61
This PR adds the BLOOM model and modifies the paged attention kernel to support ALiBi bias.
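For reviewers unfamiliar with ALiBi, here is a minimal pure-Python sketch of the bias the kernel needs to add to the attention scores. It is not the code from this PR: the function names `alibi_slopes` and `decode_alibi_bias` are hypothetical, and the slope formula follows the scheme from the public ALiBi reference implementation, which BLOOM is assumed to use. During decoding the query is the last token in the context, so the bias for each cached key is just the head's slope times the (non-positive) distance to that key.

```python
def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head ALiBi slopes (hypothetical helper, not the PR's code)."""
    # Largest power of two that is <= num_heads.
    closest_pow2 = 1 << (num_heads.bit_length() - 1)
    base = 2.0 ** (-8.0 / closest_pow2)
    slopes = [base ** (i + 1) for i in range(closest_pow2)]
    if closest_pow2 != num_heads:
        # Remaining heads take every other slope from the 2 * closest_pow2 schedule.
        extra_base = 2.0 ** (-4.0 / closest_pow2)
        num_extra = num_heads - closest_pow2
        slopes += [extra_base ** (2 * i + 1) for i in range(num_extra)]
    return slopes


def decode_alibi_bias(slope: float, context_len: int) -> list[float]:
    """Bias added to q.k scores for one head when the query is the last token."""
    q_pos = context_len - 1
    # Keys further in the past get a more negative bias; the current token gets 0.
    return [slope * (k_pos - q_pos) for k_pos in range(context_len)]


if __name__ == "__main__":
    slopes = alibi_slopes(8)
    print(slopes[0], slopes[-1])
    print(decode_alibi_bias(slopes[0], 4))
```

Because the bias depends only on the slope and the key position, a kernel can compute it on the fly from the slope and the context length rather than materializing a full bias tensor.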