
[Bug] Failing to output non-EN

Open NanoCode012 opened this issue 1 year ago • 7 comments

Hey! Thank you for the nice tool and integrations. I've been trying this out with English JSON parsing using vllm, and it works great!

However, when I tried it with a Japanese model (such as the recently released Aya from Cohere and Llama 3 fine-tunes), I received cut-off outputs.

result = json.loads(result)

Failed parsing output: {
"Input": "ミ

Do you perhaps know why this occurs? My initial guess, after looking at the repo, is that it fails to build a character tree for these Unicode characters and stops early.

I checked the other issues, and those concern keys being non-English; in this case, it's the content itself. I've tried the models without lm-format-enforcer enabled, and they seem to output fine without cutting off early (though, as expected, they can't produce JSON consistently).
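
To illustrate that guess, here is a rough sketch (assuming the transformers tokenizer API; the model and character below are just examples, not my exact setup) that checks whether a single character is split across tokens whose individual decodes are invalid UTF-8:

from transformers import AutoTokenizer

# Example only: inspect how one Japanese character tokenizes.
tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
ids = tok.encode("ミ", add_special_tokens=False)
for tid in ids:
    # If the character is split across byte-level tokens, each partial piece
    # decodes to the replacement character "�", which a character-level
    # token filter cannot match against the JSON schema.
    print(tid, repr(tok.decode([tid])))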

Env: vllm==0.4.1 lm-format-enforcer==0.9.8

NanoCode012 avatar May 25 '24 16:05 NanoCode012

Hi! Can you please share the model + schema + prompt that you are trying to use? If this reproduces on a 7B (or smaller) model, it will be much easier to debug.

noamgat avatar May 31 '24 06:05 noamgat

The formatter seems unable to proceed with generation whenever it produces a Roman numeral.

I am currently generating book names with the qwen1.5-110b-32k model, and I found that every time a book name containing a Roman numeral is generated, generation just stops.

Here is an example:

{"实体1": "三体系列", "实体2": "三体Ⅱ

and the generation just stops, even though the schema hasn't been completed yet.

This happens every time, so I suspect it's related to the formatter, since it doesn't happen when the formatter isn't applied.

rdlwicked avatar May 31 '24 14:05 rdlwicked

I've got this exact problem. Any solution or workaround? I'm using Llama-3-8b-instruct and the HF transformers lib for generation.
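
For reference, here is a minimal sketch of my setup (following the integration shown in the LMFE README; the model and schema below are placeholders rather than my exact ones):

from transformers import AutoModelForCausalLM, AutoTokenizer
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Placeholder schema: one string field whose value may contain non-ASCII text.
schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}},
    "required": ["title"],
}
prefix_fn = build_transformers_prefix_allowed_tokens_fn(tokenizer, JsonSchemaParser(schema))

prompt = "Return a JSON object with the Japanese title of a famous novel."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100, prefix_allowed_tokens_fn=prefix_fn)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))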

liqul avatar Jul 02 '24 05:07 liqul

I got the same problem. Any solution? Thanks!

ericperfect avatar Aug 15 '24 10:08 ericperfect

Just ran into this myself

jamestwhedbee avatar Aug 16 '24 14:08 jamestwhedbee

@noamgat here is a minimal example using guided decoding in vllm with LMFE v0.10.6

from openai import OpenAI


openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

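# Point the OpenAI-compatible client at the local vLLM server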
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

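# A prompt that naturally elicits multi-byte characters such as "∫"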
messages = [{"role": "user", "content": "Find the definite integral of f(x)=x^2 from x=1 to x=3."}]
chat_completion = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=messages,
    temperature=0.0,
    stream=True,
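    # vLLM-specific parameter: constrain output to this JSON schema (enforced here by LMFE)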
    extra_body={
      "guided_json": {
         "type": "object",
         "properties": {
            "explanation": {
              "description": "make sure to use mathematical notation in your explanation",
              "type": "string"
            }
         },
         "required": ["explanation"]
      }
    }
)
for chunk in chat_completion:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content)

print()

Which outputs

{
 
 

"
e
xp
lan
ation
":
 "
The
 definite
 integral
 of
 a
 function
 f
(x
)
 from
 x
=a
 to
 x
=b
 is
 den
oted
 as
 ∫
 


jamestwhedbee avatar Aug 16 '24 14:08 jamestwhedbee

Thanks for the reproduction, this is something I hope to tackle in the next major version.

noamgat avatar Sep 03 '24 19:09 noamgat

Any news? I have the exact same problem with characters such as ① or ②. Looking at the raw output, we can see it produces truncated JSON that stops when it encounters this kind of character:

[...] provided as follows. ①

kinoute avatar Aug 19 '25 17:08 kinoute