lm-format-enforcer
[Bug] Failing to output non-English characters
Hey! Thank you for the nice tool and integrations. I've been trying this out for English JSON parsing using vLLM, and it works great!
However, when I tried with a Japanese model (like the recently released Aya from Cohere and Llama 3 fine-tunes), I received cut-off outputs.
result = json.loads(result)
Failed parsing output: {
"Input": "ミ
Do you perhaps know why this is occurring? My initial guess, after looking at the repo, is that it's unable to build a character tree for these unicode characters and stops early.
I checked the other issues, and they concern keys being non-English; in this case, it's the content itself. I've tried the models without lm-format-enforcer enabled, and they seem to output fine without early cutoff (though they can't output JSON consistently, as expected).
Env: vllm==0.4.1 lm-format-enforcer==0.9.8
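To illustrate the guess above, here is a minimal sketch of the multi-byte hypothesis (this is plain Python, not LMFE internals): characters like ミ occupy several bytes in UTF-8, so a byte-level BPE tokenizer can emit a token that ends in the middle of a character, which a character-level filter cannot match against any allowed character.

```python
# Hypothetical illustration: "ミ" is one codepoint but three UTF-8 bytes.
text = "ミ"
raw = text.encode("utf-8")
print(len(text), len(raw))  # 1 3

# If a token boundary falls after the first byte, the partial sequence
# is not valid UTF-8 on its own:
partial = raw[:1]
try:
    partial.decode("utf-8")
except UnicodeDecodeError:
    print("partial token is not decodable on its own")

# Decoding it leniently yields U+FFFD (the replacement character),
# which would not match any character a JSON-string parser expects.
print(partial.decode("utf-8", errors="replace"))  # '\ufffd'
```

If this is the mechanism, any token vocabulary containing partial UTF-8 sequences would trigger the early stop.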
Hi! Can you please share the model+schema+prompt that you are trying to use? If this reproduces on a 7B (or less) model it will be much easier to debug.
The formatter seems unable to proceed with generation whenever it produces a Roman numeral.
I am currently generating book names using the qwen1.5-110b-32k model, and I found that every time a book name containing a Roman numeral is generated, generation just stops.
Here is an example:
{"实体1": "三体系列", "实体2": "三体Ⅱ
(The keys translate to "Entity 1" and "Entity 2"; the values are the book titles "The Three-Body Problem series" and "The Three-Body Problem II".) Generation just stops, even though the schema hasn't been completed yet.
This happens every time, so I suspect it's caused by the formatter, as it doesn't happen when the formatter isn't applied.
Got this exact problem. Any solution or workaround? I'm using Llama-3-8b-instruct and the HF transformers lib to do generation.
I got the same problem. Any solution? Thanks.
Just ran into this myself
@noamgat here is a minimal example using guided decoding in vLLM with LMFE v0.10.6:
from openai import OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

messages = [{"role": "user", "content": "Find the definite integral of f(x)=x^2 from x=1 to x=3."}]

chat_completion = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=messages,
    temperature=0.0,
    stream=True,
    extra_body={
        "guided_json": {
            "type": "object",
            "properties": {
                "explanation": {
                    "description": "make sure to use mathematical notation in your explanation",
                    "type": "string"
                }
            },
            "required": ["explanation"]
        }
    }
)

for chunk in chat_completion:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content)
print()
which outputs the following stream of chunks, cutting off at the integral sign:
{
"
e
xp
lan
ation
":
"
The
definite
integral
of
a
function
f
(x
)
from
x
=a
to
x
=b
is
den
oted
as
∫
Thanks for the reproduction, this is something I hope to tackle in the next major version.
Any news? I have the exact same problem with characters such as ① or ②. Looking at the raw output, we can see it produces truncated JSON that stops when it encounters these kinds of characters:
[...] provided as follows. ①
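Until this is fixed upstream, a band-aid I've used (my own sketch, not part of LMFE) is to repair the truncated output before parsing: close any string that was cut off mid-value and re-balance the braces. The helper below is hypothetical, handles only objects (not arrays), and assumes the truncation happened inside a string or between members:

```python
import json

def repair_truncated_json(text: str) -> str:
    """Best-effort close of JSON output that was cut off mid-string."""
    in_string = False
    escape = False
    depth = 0
    for c in text:
        if in_string:
            if escape:
                escape = False
            elif c == "\\":
                escape = True
            elif c == '"':
                in_string = False
        elif c == '"':
            in_string = True
        elif c == "{":
            depth += 1
        elif c == "}":
            depth -= 1
    if in_string:
        text += '"'          # close the dangling string
    return text + "}" * depth  # close any unclosed objects

truncated = '{"Input": "ミ'
print(json.loads(repair_truncated_json(truncated)))  # {'Input': 'ミ'}
```

This obviously loses whatever content the enforcer refused to emit, so it's only a stopgap for keeping pipelines from crashing on `json.loads`.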