Add support for logit_bias for OpenAI Chat models / support for cl100k_base tokenizer
Currently, lmql doesn't support logit bias-based constraints for openai/gpt-3.5-turbo and openai/gpt-4. However, the OpenAI documentation states that logit_bias is indeed supported by the API. The reason it currently doesn't work is likely that these models use a different tokenizer, cl100k_base, which doesn't seem to be well documented. For example, here is the request lmql currently generates for the "list of things not to forget when going to the sea" example:
{
  "model": "gpt-4",
  "max_tokens": 32,
  "temperature": 0.8,
  "user": "lmql",
  "stream": true,
  "logit_bias": {
    "16012": 100,
    "16598": 100,
    "24541": 100
  },
  "messages": [
    {
      "role": "user",
      "content": "A list of things not to forget when going to the sea (not travelling): \n- Sunglasses \n- Ur \n- "
    }
  ]
}
However, in cl100k_base, tokens 16012, 16598 and 24541 are "Urls", " Crusher" and ".getType", which explains why the model generates the completion "Ur". The solution might be to use the https://github.com/openai/tiktoken library:
>>> import tiktoken
>>> enc = tiktoken.get_encoding("cl100k_base")
>>> enc.encode("Volleyball")
[53, 35619, 4047]
>>> enc.encode("Sunscreen")
[31192, 8337]
>>> enc.encode("Bathing Suite")
[33, 44661, 21652]
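As a cross-check (these decodes simply restate the token-to-string mapping claimed above), the same encoder shows what the originally biased token IDs actually point to:
>>> enc.decode([16012])
'Urls'
>>> enc.decode([16598])
' Crusher'
>>> enc.decode([24541])
'.getType'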
And indeed, using the logit biases
{
  "53": 100,
  "31192": 100,
  "33": 100
}
I'm getting {"content":"Sun"} back.
Wow, thanks, this is amazing news, and thanks for pointing it out. AFAIK this has not always been the case; especially early on, the API was very limited. But we should definitely implement support now.
Thanks for the pointers. We have already considered implementing support for tiktoken, and I will take a closer look soon. In case you want to dive in, https://github.com/eth-sri/lmql/blob/main/src/lmql/runtime/bopenai/openai_api.py is where we implement the Chat API interface.
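For anyone diving in, a rough sketch of the per-model tokenizer dispatch that would be needed there (the function name encoding_for is illustrative, not lmql's actual internals):

import tiktoken

def encoding_for(model_name):
    # Chat models (gpt-3.5-turbo, gpt-4) use cl100k_base, while older
    # completion models use r50k_base/p50k_base; tiktoken's
    # encoding_for_model already knows this mapping.
    try:
        return tiktoken.encoding_for_model(model_name)
    except KeyError:
        # Unknown model name: fall back to the chat-model encoding.
        return tiktoken.get_encoding("cl100k_base")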
I tried adding the tiktoken tokenizer (see #25), but it looks like something beyond the tokenizer is also wrong for chat models (see the screenshot in the PR). Let me know if you have any ideas!