
[Bug]: `guided_regex` not working on M2 Ultra VLLM

Open NilsHellwig opened this issue 5 months ago • 3 comments

Your current environment

Python 3.12, installed via conda following the official installation guide.

Happens regardless of the employed model.

🐛 Describe the bug

Running `python -m vllm.entrypoints.openai.api_server --trust-remote-code --model google/gemma-3-4b-it --port 27090 --max-model-len 10000 --api-key token-abc123` prints the following warning in the console:

WARNING 06-16 09:21:02 [__init__.py:34] xgrammar module cannot be imported successfully. Falling back to use outlines instead.
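The warning says vLLM could not import `xgrammar` and fell back to `outlines` for structured decoding. A minimal sketch of how you might verify which backend is importable in your own environment (this is an illustrative check, not vLLM's actual fallback code):

```python
# Hypothetical check mirroring the warning above: if the xgrammar module
# cannot be imported (e.g. no wheel for the platform), structured output
# falls back to the outlines backend.
import importlib.util

backend = "xgrammar" if importlib.util.find_spec("xgrammar") else "outlines"
print(f"structured-output backend available: {backend}")
```

On Apple Silicon, where prebuilt `xgrammar` wheels may be unavailable, this would report `outlines`.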

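For context, a sketch of the kind of request the bug affects. vLLM's OpenAI-compatible server accepts a `guided_regex` parameter for regex-constrained decoding; the prompt and pattern below are hypothetical, chosen only to illustrate the feature, and the port/model mirror the launch command above. The local validation shows what a working backend is supposed to guarantee:

```python
import re

# Hypothetical constraint: restrict the completion to one of three labels.
PATTERN = r"(positive|neutral|negative)"

# Body of a POST to http://localhost:27090/v1/completions (with the
# token-abc123 API key); guided_regex is vLLM's extra sampling parameter.
payload = {
    "model": "google/gemma-3-4b-it",
    "prompt": "Sentiment of 'I love this phone':",
    "max_tokens": 5,
    "guided_regex": PATTERN,
}

def is_valid(completion: str) -> bool:
    # With a working structured-output backend, every returned completion
    # must fully match the pattern.
    return re.fullmatch(PATTERN, completion.strip()) is not None
```

The reported bug is that, with the `outlines` fallback on Apple Silicon, completions are not actually constrained by the pattern.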

Before submitting a new issue...

  • [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

NilsHellwig avatar Jun 16 '25 07:06 NilsHellwig

fyi https://github.com/vllm-project/vllm/pull/18359

but structured output support on CPU is limited.

aarnphm avatar Jun 19 '25 08:06 aarnphm

But I'm using the Mac's GPU

NilsHellwig avatar Jun 19 '25 08:06 NilsHellwig

vLLM doesn't have MLX support, so I'm not sure the Mac GPU is even being used there.

aarnphm avatar Jun 19 '25 18:06 aarnphm