AlpinDale
This PR adds Quadratic and Cubic Sampling methods to vLLM. Ref: oobabooga/text-generation-webui#5403 and oobabooga/text-generation-webui#5551
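For readers unfamiliar with the technique, here is a minimal sketch of the logit transform as I understand it from the referenced text-generation-webui PRs. It is not taken from this PR's diff. The function name `smooth_logits` and the parameter names `smoothing_factor` and `smoothing_curve` are assumptions for illustration, and the vLLM implementation may differ.

```python
def smooth_logits(logits, smoothing_factor, smoothing_curve=1.0):
    """Reshape logits around the maximum logit (illustrative sketch only).

    With smoothing_curve == 1.0 the cubic term vanishes and the transform
    is purely quadratic; other values blend in a cubic term.
    Assumes all logits are finite.
    """
    max_logit = max(logits)
    k = (3.0 - smoothing_curve) / 2.0  # weight of the quadratic term
    s = (smoothing_curve - 1.0) / 2.0  # weight of the cubic term
    out = []
    for x in logits:
        diff = x - max_logit  # always <= 0
        out.append(
            max_logit
            - k * smoothing_factor * diff ** 2
            + s * smoothing_factor * diff ** 3
        )
    return out


if __name__ == "__main__":
    print(smooth_logits([2.0, 1.0, 0.0], smoothing_factor=0.5))
```

The top logit is left unchanged, and every other logit is moved according to its distance from the top, so the transform never changes which token ranks first.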
### Project URL
https://pypi.org/project/aphrodite-engine

### Does this project already exist?
- [x] Yes

### New Limit
1000

### Update issue title
- [x] I have updated the title.

### Which...
Not nearly done. No PRs will be accepted until this is done. I will write a description of every change, both those already in and those planned.
Much work remains: alternating sliding window attention (SWA), sliding window support in flash attention, and interleaved attention for Mistral and Gemma 2.
Not sure which to use for the README; this PR currently uses the light one.
The spec and implementation are still a WIP.