[Bug]: `guided_regex` not working on M2 Ultra VLLM
Your current environment
Python 3.12, installed via conda following the official installation guide.
Happens regardless of the model used.
🐛 Describe the bug
The output of `python -m vllm.entrypoints.openai.api_server --trust-remote-code --model google/gemma-3-4b-it --port 27090 --max-model-len 10000 --api-key token-abc123` in the console:
```
WARNING 06-16 09:21:02 [__init__.py:34] xgrammar module cannot be imported successfully. Falling back to use outlines instead.
WARNING 06-16 09:21:02 [__init__.py:34] xgrammar module cannot be imported successfully. Falling back to use outlines instead.
```
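For reference, here is a minimal request that exercises `guided_regex` against the server started above. This is a sketch using the `openai` Python client; the prompt and regex are illustrative and not taken from the original report.

```python
# Minimal sketch of a guided_regex request against the vLLM server above.
# Assumptions: the `openai` Python package is installed; the prompt and
# regex below are illustrative only.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:27090/v1", api_key="token-abc123")

completion = client.chat.completions.create(
    model="google/gemma-3-4b-it",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    # vLLM-specific extension: constrain the output to match this regex.
    extra_body={"guided_regex": "Paris|London|Berlin"},
)
print(completion.choices[0].message.content)
```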
Before submitting a new issue...
- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
FYI https://github.com/vllm-project/vllm/pull/18359, but structured outputs on CPU have limited support.
But I'm using the Mac's GPU.
vLLM doesn't support MLX, so I'm not sure the GPU is even being used there.
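To confirm what the fallback warning implies, one can check whether `xgrammar` is importable at all in the same environment. A minimal sketch, assuming it is run inside the same conda env as vLLM; on Apple Silicon a compatible xgrammar wheel may simply not be available, which would explain the fallback to outlines.

```python
# Sketch: check whether the xgrammar package can be imported in this
# environment. If it cannot, the "Falling back to use outlines" warning
# above is expected behavior rather than a separate bug.
from importlib.metadata import PackageNotFoundError, version

try:
    import xgrammar  # noqa: F401
    print("xgrammar importable, version:", version("xgrammar"))
except (ImportError, PackageNotFoundError) as exc:
    print("xgrammar unavailable:", exc)
```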