
Add support for logit_bias for OpenAI Chat models / support for cl100k_base tokenizer

Open kharvd opened this issue 1 year ago • 2 comments

Currently, lmql doesn't support logit bias-based constraints for openai/gpt-3.5-turbo and openai/gpt-4, even though the OpenAI documentation states that logit_bias is supported by the API. The likely reason it doesn't work is that these models use a different tokenizer, cl100k_base, which is not well documented. For example, for the "list of things not to forget when going to the sea" example, here's the prompt lmql currently generates:

{
  "model": "gpt-4",
  "max_tokens": 32,
  "temperature": 0.8,
  "user": "lmql",
  "stream": true,
  "logit_bias": {
    "16012": 100,
    "16598": 100,
    "24541": 100
  },
  "messages": [
    {
      "role": "user",
      "content": "A list of things not to forget when going to the sea (not travelling): \n- Sunglasses \n-  Ur \n- "
    }
  ]
}

However, tokens 16012, 16598, and 24541 in cl100k_base are "Urls", " Crusher", and ".getType", which explains why the model generates the completion "Ur". The solution might be to use the https://github.com/openai/tiktoken library:

>>> import tiktoken
>>> enc = tiktoken.get_encoding("cl100k_base")
>>> enc.encode("Volleyball")
[53, 35619, 4047]
>>> enc.encode("Sunscreen")
[31192, 8337]
>>> enc.encode("Bathing Suite")
[33, 44661, 21652]
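
As a cross-check, decoding the biased IDs from the original request with the same enc reproduces the mismatch described above:

>>> enc.decode([16012])
'Urls'
>>> enc.decode([16598])
' Crusher'
>>> enc.decode([24541])
'.getType'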

And indeed, using the first token of each phrase as the logit bias:

{
    "53": 100,
    "31192": 100,
    "33": 100
}

I get {"content":"Sun"} back.
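
For reference, a minimal sketch of how such a first-token bias map could be built with tiktoken (the helper name and phrase list are illustrative, not lmql's actual API):

import tiktoken

def first_token_bias(phrases, model="gpt-4", bias=100):
    # Resolve the encoding the target model actually uses
    # (cl100k_base for gpt-3.5-turbo and gpt-4).
    enc = tiktoken.encoding_for_model(model)
    # Bias only the first token of each allowed phrase; subsequent
    # tokens would have to be biased incrementally during decoding.
    return {str(enc.encode(p)[0]): bias for p in phrases}

print(first_token_bias(["Volleyball", "Sunscreen", "Bathing Suite"]))
# -> {'53': 100, '31192': 100, '33': 100}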

kharvd commented Apr 16 '23 18:04

Wow, this is amazing news; thanks for pointing it out. AFAIK this has not always been the case: especially early on, the API was very limited. But we should definitely implement support now.

Thanks for the pointers. We had already considered implementing tiktoken support; I will take a closer look soon. In case you want to dive in, https://github.com/eth-sri/lmql/blob/main/src/lmql/runtime/bopenai/openai_api.py is where we implement the Chat API interface.
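
A hedged sketch of the kind of change that might be needed (the function name is hypothetical, not lmql's actual internals):

import tiktoken

def get_encoding_for(model_name):
    # Chat models (gpt-3.5-turbo, gpt-4) use cl100k_base rather than
    # the GPT-2/GPT-3 vocabulary, so their logit_bias IDs differ.
    try:
        return tiktoken.encoding_for_model(model_name)
    except KeyError:
        # Fall back to cl100k_base for unrecognized chat-style model names.
        return tiktoken.get_encoding("cl100k_base")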

lbeurerkellner commented Apr 16 '23 19:04

I tried adding the tiktoken tokenizer (see #25), but it looks like something else is also broken for chat models (see the screenshot in the PR). Let me know if you have any ideas!

kharvd commented Apr 16 '23 23:04