[Feature]: Audit and Update Examples To Use `VLLM_USE_V1=1`
🚀 The feature, motivation and pitch
Many of the examples leverage V0 internals.
We should:
- raise `NotImplementedError` if `envs.VLLM_USE_V1` with these (see the sketch after this list)
- convert them to use V1 if we can
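For reference, a minimal sketch of what such a guard could look like; `envs.VLLM_USE_V1` is the real vLLM flag named above, but the message wording and placement at the top of the example are assumptions:

```python
# Hypothetical guard for an example that still relies on V0 internals.
# envs.VLLM_USE_V1 reflects the VLLM_USE_V1 environment variable.
import vllm.envs as envs

if envs.VLLM_USE_V1:
    raise NotImplementedError(
        "This example relies on V0 internals and does not support "
        "VLLM_USE_V1=1 yet. Run it with VLLM_USE_V1=0."
    )
```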
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
Hi. May I look into this issue?
@robertgshaw2-redhat I would like to contribute to this issue. Would you mind elaborating on the requirements in more detail? My plan is to go through the examples, test them under VLLM_USE_V1=1, and then update them or mark them as not supported depending on feasibility. Let me know if that's okay.
@robertgshaw2-redhat @njhill I’m working on this issue. At first, I planned to write a script to run all the examples with v1, but some examples require complex setups, so now I’m manually checking the implementations to see whether they are based on v0 or v1.
Could you share whether there are any documents or guidelines about v0 and v1 patterns, or internal indicators I should look for during this process? It would really help to know exactly what distinguishes v0 from v1 in terms of imports, APIs, or structure.
Also, for some examples that are based on v0 internals and not easily convertible, should we just raise `NotImplementedError` when `VLLM_USE_V1=1` is set, or is a full refactor preferred?
Thanks in advance for the help!
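For what it's worth, a minimal sketch of such a runner, assuming the examples live as standalone scripts under `examples/offline_inference/`; the directory layout and timeout are assumptions, and examples that need servers, multiple GPUs, or large downloads will still need manual checking:

```python
# Hypothetical audit runner: execute each example under VLLM_USE_V1=1 and
# report whether it exits cleanly.
import os
import subprocess
import sys
from pathlib import Path

EXAMPLES_DIR = Path("examples/offline_inference")  # assumed location

def runs_under_v1(script: Path) -> bool:
    """Return True if the example exits cleanly with VLLM_USE_V1=1."""
    env = dict(os.environ, VLLM_USE_V1="1")
    try:
        result = subprocess.run(
            [sys.executable, str(script)],
            env=env,
            timeout=600,
            capture_output=True,
            text=True,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0

if __name__ == "__main__":
    for script in sorted(EXAMPLES_DIR.glob("*.py")):
        status = "PASS" if runs_under_v1(script) else "FAIL"
        print(f"{status}: {script.name}")
```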
I've finished auditing the offline inference part and am now refactoring the offline inference examples. Should I submit two separate PRs, one for the audit and one for the refactoring, since the changes touch more than 10 files? @robertgshaw2-redhat @njhill
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
I came across certain situations where V1 switches back to V0 automatically when the attention backend isn't compatible, even though `VLLM_USE_V1` is enabled explicitly. Is this issue still open? @robertgshaw2-redhat
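One way to confirm which engine is actually active is to inspect the engine class after construction; a minimal sketch, assuming `LLM` exposes its engine as `llm_engine` and that V1 engine classes live under the `vllm.v1` package (the model name is just a placeholder):

```python
# Hypothetical fallback check: build an LLM with VLLM_USE_V1=1 and see which
# engine class was actually instantiated.
import os

os.environ["VLLM_USE_V1"] = "1"  # must be set before importing vllm

from vllm import LLM

llm = LLM(model="facebook/opt-125m")  # placeholder model
engine_cls = type(llm.llm_engine)
print(f"{engine_cls.__module__}.{engine_cls.__qualname__}")

# A module path under "vllm.v1" suggests the V1 engine is active; anything
# else suggests a silent fallback to V0.
if not engine_cls.__module__.startswith("vllm.v1"):
    print("Warning: engine appears to have fallen back to V0.")
```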