Woosuk Kwon comments

Results: 282 comments by Woosuk Kwon

[core] Multi Step Scheduling

> [rank0]: File "/data/woosuk/workspace/vllm/vllm/engine/output_processor/multi_step.py", line 88, in process_outputs
> [rank0]: assert valid_samples

@SolitaryThinker Huge thanks for the PR! QQ: I got the above error when running benchmark scripts with num_scheduler_steps...

[WIP] Add FlexAttention to V1

> I have been testing this against latest pytorch nightly

@drisspg Is torch nightly required? I'm now seeing reasonable outputs with torch v2.7.0.

[WIP] Add FlexAttention to V1

@drisspg Could you please check the failed CI tests and rebase the PR? Will merge once the CI gets green. :)

[Doc]: ROCm installation instructions do not work

cc @hongxiayang @mawong-amd Could you please take a look?

[V1][Tests] Adding additional testing for multimodal models to V1

Hmm, this makes `v1-test` much longer: from 20 mins to 2.5 hours (timeout).

[Hardware][AMD] Update rocm base image and add openai server entrypoint

@hongxiayang @mawong-amd Are we ready to upgrade the ROCm and Ubuntu versions?

[V1][PP] Support PP for MultiprocExecutor

Hi @bigPYJ1151, can you please rebase the PR and resolve merge conflicts?

[V1][PP] Support PP for MultiprocExecutor

@ruisearch42 @comaniac @youkaichao Can you please take a final look by any chance?

[V1][PP] Support PP for MultiprocExecutor

@bigPYJ1151 I've just started the CI test. Will merge once it becomes green.

[V1] Feedback Thread

Hi @bao231, V1 does not support T4 or older-generation GPUs since the kernel libraries used in V1 (e.g., flash-attn) do not support them.