
Add a non-token approach for OpenAI

Open martinezpl opened this issue 1 year ago • 9 comments

Hello, as per interest shown in https://github.com/1rgs/jsonformer/issues/2 I'd like to propose a variation for OpenAI models. This makes it possible to use OpenAI models with jsonformer without changing existing code.

Summary

  • Adds the ability to fill JSONs through calls to a non-chat OpenAI completion model (seems to work best with curie)
  • Two new classes: OpenAIModel and JsonformerNoTokens, the latter being essentially the original stripped of its tokenizer
  • Instead of using logits in generate_boolean and generate_array, we pick the next most likely token from the API's logprobs
  • As a substitute for stopping criteria, the stop sequence ", limits the model's response to a single value
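A minimal sketch of the logprobs idea for generate_boolean (the helper name and the logprobs payload below are illustrative, not the PR's actual code):

```python
import math

def pick_boolean(top_logprobs: dict) -> bool:
    """Pick 'true' vs 'false' from a top_logprobs mapping, as returned by
    a non-chat completions call with logprobs enabled. Takes the
    best-scoring token that prefixes either boolean literal."""
    best_token, best_lp = None, -math.inf
    for token, lp in top_logprobs.items():
        t = token.strip().lower()
        if t and ("true".startswith(t) or "false".startswith(t)) and lp > best_lp:
            best_token, best_lp = t, lp
    if best_token is None:
        raise ValueError("no boolean-like token among the logprobs")
    return "true".startswith(best_token)

# Hypothetical logprobs payload from the API:
print(pick_boolean({" true": -0.2, " false": -1.7, ",": -3.0}))  # True
```

The same trick applies in generate_array: inspect the logprobs of the next token to decide whether the model wants to close the array or emit another element.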

Why no chat?

I found the chat model to be more querulous ("As an AI model I cannot blablabla..."), prompt-dependent and slow. The solution proposed here seems to work best with text-curie-001, which is super fast and cheap.

Perhaps somebody can figure out an effective way to utilize the chat model, but I can't see any option other than prompting it to generate the whole JSON at once, which runs completely counter to the concept of this project.

Why no tokens?

I spent some time trying to continue operating on tokens while using the API, but I encountered two issues:

  • tokenization via tiktoken does not seem to preserve word boundaries; encoding "colors", for example, yields two separate tokens which, decoded one by one, give "col ors". It can't work like that.
  • chat models do not accept tokens as input
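To illustrate the boundary problem described above (the sub-word split of "colors" here is hypothetical, not actual tiktoken output):

```python
# Sub-word tokens decoded one at a time, as in a token-by-token loop.
pieces = ["col", "ors"]

# Joining per-token decodes with whitespace loses word boundaries:
naive = " ".join(pieces)
print(naive)  # "col ors"

# Only decoding the whole token sequence at once reconstructs the word:
print("".join(pieces))  # "colors"
```

Without access to the boundary information the local tokenizer would provide, a remote API loop can't safely reassemble sub-word tokens, which is why this PR drops token-level operations entirely.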

And of course, because the models run remotely we have no access to the generation process. From my perspective, this is the reason that renders all token operations pointless here. I still left the tokenizer class just in case.

How to run it?

Make sure you have the OPENAI_API_KEY env var set.

poetry install
poetry run python tests/test_openai.py

You'll see the JSON being filled. You can change the model used and its temperature in that file when initialising JsonformerNoTokens.

martinezpl avatar May 06 '23 02:05 martinezpl

I am just pondering that, as far as I know, the chat model is an innately good JSON generator. GPT-3.5-turbo is derived from code-davinci-003 (Codex), which is fine-tuned on a large amount of code and really capable of generating JSON.

zhaochenyang20 avatar May 06 '23 11:05 zhaochenyang20

I have a deep interest in generating JSON output from OpenAI models. Please feel free to contact me!

zhaochenyang20 avatar May 06 '23 11:05 zhaochenyang20

@zhaochenyang20 I agree it's fairly good at it, but the authors of this project don't seem convinced that a good prompt is enough for the chat model. I assume that's based on their experience; personally I don't know.

martinezpl avatar May 06 '23 13:05 martinezpl

Very excited to try this out!!!

Void-n-Null avatar May 16 '23 17:05 Void-n-Null

Looking forward to using this to parse plain text outputs from CoCa into JSON for image JSON captioning.

Update: Getting really good results so far using text-davinci-003

moro-n0-kimi avatar May 19 '23 21:05 moro-n0-kimi

Awesome to hear that @moro-no-kimi 🥳

martinezpl avatar May 22 '23 09:05 martinezpl

@moro-no-kimi

Are you using jsonformer with the OpenAI model? If yes, is it possible to share the code?

tv-ankur avatar May 24 '23 12:05 tv-ankur

This is kind of obsolete now with OpenAI's function-calling feature
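For context, function calling lets you pass a JSON Schema directly to the Chat Completions API and have the model fill it, replacing the value-by-value approach in this PR. A hedged sketch of the request shape (mid-2023 API; the function name and schema are made up for illustration):

```python
import json

# Hypothetical schema describing the JSON we want filled.
schema = {
    "name": "extract_person",
    "description": "Fill a fixed JSON schema from the user's text",
    "parameters": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "number"},
            "is_student": {"type": "boolean"},
        },
        "required": ["name", "age", "is_student"],
    },
}

# Request payload for the Chat Completions endpoint; forcing
# function_call makes the model return schema-shaped arguments.
payload = {
    "model": "gpt-3.5-turbo-0613",
    "messages": [{"role": "user", "content": "John is a 21 year old student."}],
    "functions": [schema],
    "function_call": {"name": "extract_person"},
}
print(json.dumps(payload, indent=2))
```

The model's reply then carries the filled JSON in the function-call arguments, so no per-field generation loop is needed on the client side.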

martinezpl avatar Jul 03 '23 16:07 martinezpl

This is great work, but it would complicate the repo, which is nice and simple

On this list there are quite a few other tools that support API-only models: https://github.com/wassname/awesome-interpretability/tree/main?tab=readme-ov-file#structured-output

wassname avatar May 10 '24 12:05 wassname