
Experimental attention backend in Helion

Open · bringlein opened this pull request 1 month ago • 2 comments

Purpose

Very experimental, draft PR so far: it adds an experimental attention backend implemented in Helion.

Test Plan

VLLM_ATTENTION_BACKEND=EXPERIMENTAL_HELION_ATTN vllm serve meta-llama/Llama-3.1-8B-Instruct
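
As a sanity check (not part of the original test plan), once the server is up it exposes vLLM's OpenAI-compatible API, assumed here to be on the default port 8000, so a quick completion request can confirm the Helion backend serves tokens end to end:

curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "prompt": "Hello, my name is", "max_tokens": 16}'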

Test Result

t.b.a.


Essential Elements of an Effective PR Description Checklist
  • [ ] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • [ ] The test plan, such as providing a test command.
  • [ ] The test results, such as pasting a before/after results comparison, or e2e results.
  • [ ] (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • [ ] (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

bringlein avatar Oct 21 '25 20:10 bringlein

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @bringlein.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify[bot] avatar Nov 11 '25 17:11 mergify[bot]

Documentation preview: https://vllm--27293.org.readthedocs.build/en/27293/

mergify[bot] avatar Nov 14 '25 18:11 mergify[bot]
