Aaron Pham

579 comments by Aaron Pham

> Eventually I hope to switch from yapf to black (run with ruff), which has a line length of 88. Generally increasing line length is a one way move so...

But at a certain point, when we touch model_executor or v1, it is bound to disrupt everyone's PRs, right? FWIW, I think we can add the config and run this...

I'm OK with manually adding `# noqa: E501`; it's just that the codebase will end up containing a lot of these
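To illustrate what that would look like (the constant and message below are made up, just showing the shape of the suppression):

```python
# Hypothetical snippet: every line that still exceeds the configured length
# limit would have to carry this trailing suppression comment.
HELP_MESSAGE = "This request exceeded the configured maximum context length; shorten the prompt or restart the server with a larger context window."  # noqa: E501

print(HELP_MESSAGE)
```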

cc @gaocegege might be interested

> I was testing the modifications using `vllm serve` and had a quick question.

This should now be addressed. There was a bad fix somewhere, then.

We need to use the tokenizer for the reasoning parser. But if you like, I can refactor out the tokenizer change first, then rebase this change on top of that...
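Roughly why the tokenizer is needed (a sketch only, not vllm's actual reasoning-parser API; the `<think>`/`</think>` delimiters and the model name are just examples):

```python
# Sketch: the reasoning delimiters are model-specific tokens, so the parser
# has to resolve their IDs against the model's vocabulary via the tokenizer.
from transformers import AutoTokenizer


class SimpleReasoningParser:
    def __init__(self, tokenizer) -> None:
        self.tokenizer = tokenizer
        # Without the tokenizer there is no reliable way to get these IDs.
        self.start_id = tokenizer.convert_tokens_to_ids("<think>")
        self.end_id = tokenizer.convert_tokens_to_ids("</think>")

    def split(self, token_ids: list[int]) -> tuple[str, str]:
        """Split generated token IDs into (reasoning_text, final_text)."""
        if self.end_id in token_ids:
            end = token_ids.index(self.end_id)
            start = (token_ids.index(self.start_id) + 1
                     if self.start_id in token_ids else 0)
            return (self.tokenizer.decode(token_ids[start:end]),
                    self.tokenizer.decode(token_ids[end + 1:]))
        # No closing delimiter yet: treat everything decoded so far as reasoning.
        return self.tokenizer.decode(token_ids), ""


# Example wiring (model name is illustrative):
# tok = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
# parser = SimpleReasoningParser(tok)
```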

Seems like the failure on entrypoints is not related 😿

FYI https://github.com/vllm-project/vllm/pull/18359, but structured outputs on CPU have limited support.
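For reference, a rough sketch of exercising structured outputs against a running `vllm serve` endpoint through the OpenAI-compatible API (the model name and schema are placeholders, and the `guided_json` extra-body field is my understanding of the extension; adjust to whatever your server version exposes):

```python
from openai import OpenAI

# Assumes `vllm serve` is already running locally on the default port.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}, "population": {"type": "integer"}},
    "required": ["city", "population"],
}

completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # placeholder model
    messages=[{"role": "user", "content": "Give me a city and its population as JSON."}],
    extra_body={"guided_json": schema},  # constrain output to the schema
)
print(completion.choices[0].message.content)
```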

vllm doesn't have support for mlx, so I'm not sure it is even being used there.