jsonformer
Add a non-token approach for OpenAI
Hello, as per interest shown in https://github.com/1rgs/jsonformer/issues/2 I'd like to propose a variation for OpenAI models. This makes it possible to use OpenAI models with jsonformer without changing existing code.
Summary
- Added the possibility of filling the JSONs through calls to a non-chat OpenAI model (seems to work best with `curie`)
- Two new classes: `OpenAIModel` and `JsonformerNoTokens`, which is basically the stripped-down original with no tokenizer
- Instead of using logits in `generate_boolean` and `generate_array`, we're getting the next most likely token from `logprobs`
- As a substitute for stopping criteria, the stop word `","` is used to limit the model's response to a single value
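As an illustration, picking a boolean from the API's `logprobs` output might look like the following minimal sketch. The helper name and the shape of the `top_logprobs` mapping are assumptions for the example, not the PR's exact code:

```python
import math

def choose_boolean(top_logprobs):
    """Pick true/false from a {token: logprob} mapping, roughly the shape
    returned in response["choices"][0]["logprobs"]["top_logprobs"][0] by the
    legacy OpenAI completions API. Hypothetical helper, not the PR's code."""
    best_true = max(
        (lp for tok, lp in top_logprobs.items()
         if tok.strip().lower().startswith("true")),
        default=-math.inf,
    )
    best_false = max(
        (lp for tok, lp in top_logprobs.items()
         if tok.strip().lower().startswith("false")),
        default=-math.inf,
    )
    # Whichever candidate has the higher log-probability wins.
    return best_true >= best_false
```

The same comparison idea generalises to `generate_array`, where the choice is between a closing bracket and a comma instead of true/false.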
Why no chat?
I found the chat model to be more querulous ("As an AI model I cannot blablabla..."), more prompt-dependent, and slower.
The solution proposed here seems to work best with `text-curie-001`, as it's super fast and cheap.
Perhaps somebody can figure out an effective way to utilize the chat model, but I can't see any option other than prompting it to generate the whole JSON at once, which runs completely counter to the concept of this project.
Why no tokens?
I spent some time trying to continue operating on tokens while using the API, but I encountered two issues:
- the tokenization available in `tiktoken` does not seem to provide word boundaries; for example, encoding "colors" will give us two separate tokens, which after decoding give "col ors". Can't work like that.
- chat models do not accept tokens as input
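A toy illustration of the word-boundary problem described above (this is not `tiktoken`; the two-entry vocabulary is made up for the example):

```python
# Toy subword vocabulary: "colors" encodes to two tokens, "col" and "ors".
TOY_VOCAB = {1: "col", 2: "ors"}

def decode_tokens_separately(token_ids):
    # Decoding each subword token on its own and joining the pieces loses
    # the information that both fragments belonged to a single word.
    return " ".join(TOY_VOCAB[t] for t in token_ids)

print(decode_tokens_separately([1, 2]))  # "col ors", not "colors"
```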
And of course, because the models run remotely we have no access to the generation process. From my perspective, this is the reason that renders all token operations pointless here. I still left the tokenizer class just in case.
How to run it?
Make sure you have the `OPENAI_API_KEY` environment variable set, then:
`poetry install`
`poetry run python tests/test_openai.py`
You'll see the JSON being filled. You can change the model used and its temperature in that file when initialising `JsonformerNoTokens`.
I am just pondering that, as far as I know, the chat model is an inherently good JSON generator. GPT-3.5-turbo is derived from code-davinci-003 (Codex), which is fine-tuned on a large amount of code and is really capable of generating JSON.
I do have a deep interest in generating JSON from OpenAI models. Please feel free to contact me!
@zhaochenyang20 I agree it is fairly good at it, but the authors of this project don't seem to be convinced that a good prompt for the chat model is enough. I assume that's based on their experience; personally, I don't know.
Very excited to try this out!
Looking forward to using this to parse plain-text outputs from CoCa into JSON for image captioning.
Update: getting really good results so far using `text-davinci-003`
Awesome to hear that @moro-no-kimi 🥳
@moro-no-kimi
Are you using jsonformer with the open ai model? If yes, is it possible to share the code?
This is kind of obsolete now with the function-calling feature from OpenAI.
This is great work, but it would complicate the repo, which is nice and simple.
This list includes quite a few other tools that support API-only models: https://github.com/wassname/awesome-interpretability/tree/main?tab=readme-ov-file#structured-output