sublayer Single string output adapter occasionally comes back empty or generic

Looks like, because of the way we're using function calling to get structured outputs back, some models occasionally return empty strings for the parameters, or just return the description/prompt they were given.

A couple initial thoughts:

Could use a couple examples of generators that expose the issue
What are some automated tests that could help us catch things like this in the future?
Curious to see if this issue happens in output adapters that are more complex, for example: #12 #13 #14
The Python library Instructor uses a technique similar to ours, inverting the way function calling is done (how do they solve it? how are their prompts different from ours?)
It seems like there are probably a few different ways to solve this, really curious to play with them and see what they feel like...
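On the automated-tests question, here is a minimal sketch of a check that flags the two failure modes described above: an empty string, or the parameter description echoed back. It is written in Python for illustration only and is not part of sublayer; `looks_degenerate` and `failure_rate` are hypothetical names. Since the failures are intermittent, the sketch measures a rate over several runs rather than asserting on a single call.

```python
def looks_degenerate(output: str, description: str) -> bool:
    """Return True if a single-string result is empty or just echoes the description.

    Hypothetical helper: compares case-insensitively with whitespace collapsed.
    """
    normalized = " ".join(output.split()).lower()
    echoed = " ".join(description.split()).lower()
    if not normalized:
        return True
    return normalized == echoed


def failure_rate(outputs: list[str], description: str) -> float:
    """Fraction of collected outputs that are empty or echo the description."""
    if not outputs:
        raise ValueError("need at least one output to compute a rate")
    bad = sum(1 for output in outputs if looks_degenerate(output, description))
    return bad / len(outputs)
```

A test could run the same generator N times against a model, collect the raw strings, and fail when `failure_rate` goes above a chosen threshold.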

Jul 17 '24 15:07 swerner

few more notes on this for posterity: the tutorial demo of historical event finder has the following issues with the models below:

gpt 4: 50/50
gpt 4o: 👎

when renaming the function in the api calls to "formatter" or "format_response", 4o behaves very well, BUT blueprints begins to behave badly.

when renaming the function to something generic like "response" or "function", blueprints continues to work well, but historical event finder gets even worse.
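To make the renaming experiment repeatable, something like the following sketch could build one tool definition per candidate function name. It assumes the OpenAI-style JSON tool format, written in Python for illustration; it is not sublayer's actual request payload, and `build_tool_spec`, the parameter name `historical_event`, and the description text are all hypothetical.

```python
def build_tool_spec(function_name: str, parameter_name: str, description: str) -> dict:
    """Build an OpenAI-style tool definition with a single required string parameter."""
    return {
        "type": "function",
        "function": {
            "name": function_name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    parameter_name: {"type": "string", "description": description},
                },
                "required": [parameter_name],
            },
        },
    }


# The function names tried in this thread.
CANDIDATE_NAMES = ["formatter", "format_response", "response", "function"]

# One spec per name; everything except the function name is held constant.
specs = {
    name: build_tool_spec(name, "historical_event", "A historical event")
    for name in CANDIDATE_NAMES
}
```

Holding the parameter and description constant while only the function name varies should make it easier to tell whether the name alone explains the difference between generators.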

local:
llama3:8b does not play well with event finder either (this was done through xml)
llama3.1:8b DOES play well with event finder

^both work with blueprints

Jul 26 '24 20:07 AndrewBKang

Hmm if you have the llama models set up, want to see how it plays with the list_of_strings output adapter? I'm curious if the single string output adapter might just be too simple, but if we have a more complex data type it performs better...

Jul 26 '24 20:07 swerner

sorry, took a bit to rebase, it had some conflicts

for historical event finder / llama3:8b: {"error"=>"llama3 does not support tools"} bah humbug

for historical event finder / llama3.1:8b: ["First Landing by Vikings in North America", "Independence Day of Chile", "Death of Joseph Stalin"] (just changed historical event finder to return a list of 3)

Jul 26 '24 21:07 AndrewBKang

I could revert to the xml approach to test this out if we would like to see how this would affect it!

Jul 26 '24 21:07 AndrewBKang

nah, it seems like everyone is converging on this json spec version of tool calling so I think it's fine, was more curious to see if that theory of more complex data types could be a fruitful path...

When we have the universal json spec formatter, we could change single string into the parameter itself plus an additional ignored parameter like "explanation" or "notes" or something, which may also end up increasing the quality of the output anyway.
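A rough sketch of what that could look like, assuming a plain JSON Schema for the tool parameters and written in Python for illustration (this is not the planned sublayer formatter; `single_string_with_notes_schema` and `extract_value` are hypothetical names). The extra "explanation" field is optional in the schema and is simply dropped when reading the result.

```python
import json


def single_string_with_notes_schema(name: str, description: str) -> dict:
    """JSON Schema for one required string plus an optional, ignored explanation."""
    return {
        "type": "object",
        "properties": {
            name: {"type": "string", "description": description},
            "explanation": {
                "type": "string",
                "description": "Brief notes on how the value was chosen",
            },
        },
        "required": [name],
    }


def extract_value(arguments_json: str, name: str) -> str:
    """Return only the named string from the tool-call arguments.

    Anything else, including the explanation, is ignored. A missing or
    non-string value comes back as an empty string.
    """
    arguments = json.loads(arguments_json)
    value = arguments.get(name, "")
    return value if isinstance(value, str) else ""
```

The caller only ever sees the single string, so the adapter's public behavior would stay the same while the model gets a slightly richer structure to fill in.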

Jul 26 '24 21:07 swerner