Nick Hill
> I think `--no-worker-use-ray` is bad. I suggest something like `--distributed-executor-backend`, which can be either `ray` or `mp`, and we might have more in the future.

@youkaichao that sounds...
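For illustration only, a minimal sketch of how backend selection might look from the Python API, assuming the proposed flag maps to an engine argument of the same name (the model and parallel size here are placeholders, and the final option name/values may differ):

```python
from vllm import LLM

# Minimal sketch: choose the executor backend via an engine argument.
# "distributed_executor_backend" and its values ("ray", "mp") are assumed to
# mirror the flag suggested above; this is not confirmed in this thread.
llm = LLM(
    model="facebook/opt-125m",           # placeholder model
    tensor_parallel_size=2,              # distributed execution across 2 GPUs
    distributed_executor_backend="mp",   # or "ray"
)
```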
> This is really great work @njhill. Thanks for all the effort! Will this change also enable ray to become an optional dependency?

Yes, although ray is already optional if...
@youkaichao any idea why the ray distributed CI test might be [failing](https://buildkite.com/vllm/ci/builds/6310#018f3614-2885-41ce-bc06-67a5a22fdf80) now due to a gloo timeout? I think it's something to do with a second engine using TP...
@youkaichao FYI the problem is still there after pulling in your latest fix commit; I'll try to narrow it down tomorrow.
@DarkLight1337 sorry for the hold-up; I will hopefully get to this tomorrow, including looking at reconciling with #3512.
@juud79 we're working on https://github.com/vllm-project/vllm/pull/3125 to address this. You can work around this by passing the explicit path to the model in your local HF cache as the model name.
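For example, a hedged sketch of that workaround; the cache path below is hypothetical, and the model directory and snapshot hash will differ on your machine:

```python
import os

from vllm import LLM

# Hypothetical snapshot path inside the local Hugging Face cache; replace the
# model directory and <snapshot-hash> with the ones actually on disk.
local_snapshot = os.path.expanduser(
    "~/.cache/huggingface/hub/models--meta-llama--Llama-2-7b-hf/"
    "snapshots/<snapshot-hash>"
)

# Pass the explicit local path as the model name instead of the Hub repo ID.
llm = LLM(model=local_snapshot)
```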
@Yard1 the other thing I thought would make sense would be to move the `detokenize_incrementally` and `convert_prompt_ids_to_tokens` functions from `tokenizer.py` to `detokenizer.py`. Didn't include that in this PR yet though...
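Roughly, the import-side effect of that move would look like the sketch below (the `vllm/transformers_utils/` module paths are my assumption; the final layout may differ):

```python
# Before the move (assumed current location):
# from vllm.transformers_utils.tokenizer import (
#     convert_prompt_ids_to_tokens,
#     detokenize_incrementally,
# )

# After the move to detokenizer.py:
from vllm.transformers_utils.detokenizer import (
    convert_prompt_ids_to_tokens,
    detokenize_incrementally,
)
```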
Thanks @Yard1! Have now pushed another commit moving those functions.
@Yard1 good to merge?
Going to merge now that I have the power, since this was already approved.