Woosuk Kwon
This PR introduces `torch.compile` for the following basic custom ops: activations and RMSNorm. The main goals are: 1. Reduce the number of custom kernels maintained by vLLM. (I intentionally kept...
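For illustration, here is a minimal sketch (not this PR's exact code) of what replacing a custom RMSNorm CUDA kernel with a `torch.compile`-fused native PyTorch implementation can look like; all names are illustrative:

```python
import torch
import torch.nn as nn


class RMSNorm(nn.Module):
    """RMS normalization in plain PyTorch, so torch.compile can fuse it
    instead of vLLM maintaining a hand-written CUDA kernel."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Compute in float32 for numerical stability, then cast back.
        orig_dtype = x.dtype
        x = x.float()
        variance = x.pow(2).mean(dim=-1, keepdim=True)
        x = x * torch.rsqrt(variance + self.eps)
        return (x * self.weight.float()).to(orig_dtype)


# torch.compile generates a fused kernel for the elementwise ops.
norm = RMSNorm(hidden_size=4096).cuda()
compiled_norm = torch.compile(norm)
out = compiled_norm(torch.randn(8, 4096, device="cuda", dtype=torch.float16))
```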
Should be merged after #9437 and after the 10/17 PyTorch XLA nightly is available. This PR upgrades PyTorch XLA and uses `peak_bytes_used` to correctly profile the...
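As a rough sketch of the profiling idea, assuming the pinned torch_xla nightly reports `peak_bytes_used` via `xm.get_memory_info` (field names may differ across torch_xla versions):

```python
import torch_xla.core.xla_model as xm


def profile_peak_memory(run_profiling_step, device):
    """Run one profiling step, then read the peak device memory.

    Assumes the pinned torch_xla nightly exposes ``peak_bytes_used``
    and ``bytes_limit`` in ``xm.get_memory_info``.
    """
    run_profiling_step()
    xm.wait_device_ops()  # wait until all queued device work finishes
    mem_info = xm.get_memory_info(device)
    return mem_info["peak_bytes_used"], mem_info["bytes_limit"]
```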
This PR changes the scheduler and model runner so that the model runner gets the input token IDs from the scheduler. This change is especially useful when the token IDs...
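To make the data flow concrete, here is a hedged sketch with hypothetical names (the real vLLM V1 structures differ) of a scheduler output that carries the token IDs to the model runner:

```python
from dataclasses import dataclass


@dataclass
class NewRequestData:
    # Hypothetical per-request payload; the real structure differs.
    req_id: str
    prompt_token_ids: list[int]


@dataclass
class SchedulerOutput:
    # The scheduler owns the request state and ships the token IDs
    # directly, so the model runner needs no separate request table.
    scheduled_new_reqs: list[NewRequestData]


def execute_model(scheduler_output: SchedulerOutput) -> None:
    for req in scheduler_output.scheduled_new_reqs:
        # Token IDs come straight from the scheduler output.
        print(f"run {req.req_id} with tokens {req.prompt_token_ids}")
```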
### Anything you want to discuss about vllm. To switch the engine from V0 to V1, we need to comprehensively support the sampling parameters in https://github.com/vllm-project/vllm/blob/main/vllm/sampling_params.py While most of the...
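For reference, a quick example of the sampling surface V1 has to cover; these parameters come from vLLM's public `SamplingParams` API:

```python
from vllm import SamplingParams

# A configuration exercising several of the knobs V1 must support.
params = SamplingParams(
    n=2,                     # number of output sequences per prompt
    temperature=0.8,
    top_p=0.95,
    top_k=40,
    presence_penalty=0.5,
    frequency_penalty=0.2,
    max_tokens=128,
    stop=["</s>"],
    logprobs=5,              # return top-5 logprobs per output token
)
```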
If I understand correctly, we should cache the intermediate tensors and reuse them for CUDA graphs. This could be another reason why the current PP implementation is not working correctly. cc @comaniac...
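A minimal standalone sketch of why the intermediate tensors must be cached: CUDA graph replay reuses fixed memory addresses, so any tensor produced inside the captured region (here a hypothetical two-stage model) must stay alive across replays. This follows the standard PyTorch capture pattern, not vLLM's actual PP code:

```python
import torch

stage1 = torch.nn.Linear(1024, 1024).cuda()
stage2 = torch.nn.Linear(1024, 1024).cuda()
static_input = torch.zeros(8, 1024, device="cuda")

# Warm up on a side stream before capture, as CUDA graphs require.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    static_output = stage2(stage1(static_input))
torch.cuda.current_stream().wait_stream(s)

graph = torch.cuda.CUDAGraph()
with torch.cuda.graph(graph):
    # The intermediate tensor captured here lives at a fixed address;
    # it must be cached (kept referenced), not recreated per step.
    static_intermediate = stage1(static_input)
    static_output = stage2(static_intermediate)

# Replay: copy fresh data into the static input buffer, then replay.
static_input.copy_(torch.randn(8, 1024, device="cuda"))
graph.replay()
result = static_output.clone()
```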
The KV cache manager in V1 ignores the sliding window (it keeps the full KV cache rather than evicting out-of-window blocks), so prefix caching is compatible with sliding-window attention.
This PR optimizes the N-gram matching algorithm by JIT-compiling it with Numba. I've observed a 20-30x speedup with large batch sizes: for the ShareGPT benchmark with 5K requests, the cumulative overhead...
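A hedged sketch of the technique (not necessarily this PR's exact algorithm): an n-gram draft proposer written with NumPy loops and JIT-compiled by Numba, which removes the Python interpreter overhead that dominates at large batch sizes:

```python
import numpy as np
from numba import njit


@njit(cache=True)
def ngram_propose(token_ids: np.ndarray, n: int, k: int) -> np.ndarray:
    """Find the most recent earlier occurrence of the trailing n-gram
    and propose up to `k` tokens that followed it as draft tokens."""
    total = token_ids.shape[0]
    if total < n + 1:
        return token_ids[:0].copy()  # empty: nothing to match
    # Scan backwards so the most recent match wins.
    for start in range(total - n - 1, -1, -1):
        match = True
        for j in range(n):
            if token_ids[start + j] != token_ids[total - n + j]:
                match = False
                break
        if match:
            end = min(start + n + k, total)
            return token_ids[start + n:end].copy()
    return token_ids[:0].copy()


# The first call triggers compilation; later calls run as native code.
tokens = np.array([1, 2, 3, 4, 1, 2, 3], dtype=np.int64)
print(ngram_propose(tokens, n=3, k=2))  # -> [4 1]
```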