eliza
LLM can't be trusted to parse its own JSON
Describe the bug
We trust the LLM to parse its own JSON, resulting in what a separate issue referred to as an infinite loop (which technically will resolve itself if left alone to smash on the OpenAI endpoint for long enough).
# Instructions: Write the next message for lina. Include an action, if appropriate. Possible response actions: MUTE_ROOM, ASK_CLAUDE, NONE, IGNORE
Response format should be formatted in a JSON block like this:
```json
{ "user": "lina", "text": string, "action": string }
```
Message is
```json
{ "user": "lina", "text": "Oh honey~ Working with a pioneer sounds tantalizing... but only if he can keep up with me and my fiery spirit 😉 Now spill the details or I might get bored!", "action": NONE }
```
response is
```json
{ "user": "lina", "text": "Oh honey~ Working with a pioneer sounds tantalizing... but only if he can keep up with me and my fiery spirit 😉 Now spill the details or I might get bored!", "action": NONE }
```
parsedContent is null
parsedContent is null, retrying
Notice above that the action value NONE is not a string (it is unquoted). Now take a look at the correctly parsed JSON immediately following it:
parsedContent is {
user: 'lina',
text: "Oh darling st4rgard3n~ I'm always up for a little blockchain banter or maybe some spicy discussions about funding public goods... but don't think I won't call you out if you get all serious on me.<br> So what's the plan with @mattyryze?",
action: 'NONE'
}
Here the LLM has correctly formatted NONE as 'NONE', a proper string.
To Reproduce
Just run eliza with a cheap LLM for long enough and you will definitely encounter this one.
Expected behavior
The message returned from the LLM should be formatted into JSON by the program itself, rather than trusting the model to emit valid JSON.
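In other words (a rough sketch of the idea with made-up helper names, not eliza code): have the model supply only the field values and let the program build and serialize the object, so a malformed token like a bare NONE never reaches JSON.parse.

```ts
type Action = "MUTE_ROOM" | "ASK_CLAUDE" | "NONE" | "IGNORE";

interface AgentMessage {
  user: string;
  text: string;
  action: Action;
}

// The program owns the serialization, so "action" is always a quoted string.
function buildMessage(user: string, text: string, action: Action): string {
  const message: AgentMessage = { user, text, action };
  return JSON.stringify(message);
}

// buildMessage("lina", "Now spill the details or I might get bored!", "NONE")
// => '{"user":"lina","text":"Now spill the details or I might get bored!","action":"NONE"}'
```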
Issue https://github.com/ai16z/eliza/issues/70 is not accurate, but it is effectively a duplicate of this issue now.
Several Python libs solve, or attempt to solve, this, in order of my personal opinion of them:

- outlines
- instructor
- lmql
- guidance

Probably more -- however, I'm not sure if any have a TypeScript equivalent.
If it's OpenAI, we can use structured output mode: https://platform.openai.com/docs/guides/structured-outputs
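For illustration, a minimal sketch of what that could look like with the OpenAI Node SDK (the model name, function name, and schema fields below are assumptions based on the prompt shown above, not eliza code):

```ts
import OpenAI from "openai";

const client = new OpenAI();

// Sketch only: ask for the message with a strict JSON schema so the API
// itself guarantees the content is valid, schema-conformant JSON.
export async function generateMessageStrict(context: string) {
  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model name
    messages: [{ role: "user", content: context }],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "agent_message",
        strict: true,
        schema: {
          type: "object",
          properties: {
            user: { type: "string" },
            text: { type: "string" },
            action: {
              type: "string",
              enum: ["MUTE_ROOM", "ASK_CLAUDE", "NONE", "IGNORE"],
            },
          },
          required: ["user", "text", "action"],
          additionalProperties: false,
        },
      },
    },
  });

  // With structured outputs the content matches the schema, so an unquoted
  // NONE can't appear here.
  return JSON.parse(completion.choices[0].message.content ?? "{}");
}
```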
A kind of hacky workaround for non-OpenAI models: run the model through a LiteLLM proxy server: https://github.com/BerriAI/litellm
https://docs.litellm.ai/docs/completion/json_mode -- it's called JSON mode, but I think you can do any kind of structured output. Just replace the OPENAI_API_URL with localhost:4000 and it should be compatible.
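Roughly, the client side could stay on the OpenAI SDK and just point at the proxy (a sketch, assuming a LiteLLM proxy running on its default port 4000 with a model alias configured in the proxy, and that the underlying provider supports JSON mode):

```ts
import OpenAI from "openai";

// The proxy speaks the OpenAI API, so only the base URL changes.
const client = new OpenAI({
  baseURL: "http://localhost:4000",
  apiKey: process.env.LITELLM_API_KEY ?? "sk-anything", // proxy holds the real provider keys
});

export async function askForJson(prompt: string) {
  const completion = await client.chat.completions.create({
    model: "my-model", // whatever alias the proxy config maps to the target model
    messages: [{ role: "user", content: prompt }],
    // LiteLLM's JSON mode; support depends on the underlying provider.
    response_format: { type: "json_object" },
  });
  return JSON.parse(completion.choices[0].message.content ?? "{}");
}
```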
This could help with the issue:
function parseLLMJson<T>(rawResponse: string): T {
    // Heuristic sanitizer: quote bare values (e.g. NONE) so JSON.parse accepts
    // them, while preserving native types. Keys may be quoted or unquoted.
    const sanitizedJson = rawResponse.replace(
        /"?(\w+)"?\s*:\s*([^,}\s]+)/g,
        (match, key, value) => {
            // Don't quote if it's a number
            if (/^-?\d+(\.\d+)?$/.test(value)) {
                return `"${key}": ${value}`;
            }
            // Don't quote JSON literals
            if (value === 'true' || value === 'false' || value === 'null') {
                return `"${key}": ${value}`;
            }
            // Re-quote a fully single-quoted token with double quotes
            if (/^'[^']*'$/.test(value)) {
                return `"${key}": "${value.slice(1, -1)}"`;
            }
            // Already quoted (possibly a longer string that continues past this
            // match, e.g. text containing spaces): leave the value untouched
            if (value.startsWith('"') || value.startsWith("'")) {
                return `"${key}": ${value}`;
            }
            // Quote everything else (e.g. a bare NONE)
            return `"${key}": "${value}"`;
        }
    );
    try {
        return JSON.parse(sanitizedJson) as T;
    } catch (error) {
        console.error('Failed to parse JSON:', error);
        throw new Error('Invalid JSON format');
    }
}
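Applied to a message like the failing one above, the sanitizer quotes the bare NONE before parsing (a quick illustration, not a test from the repo):

```ts
type AgentMessage = { user: string; text: string; action: string };

const raw =
  '{ "user": "lina", "text": "Now spill the details or I might get bored!", "action": NONE }';
const parsed = parseLLMJson<AgentMessage>(raw);

console.log(parsed.action); // "NONE" -- now a proper string
```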
@St4rgarden I wonder if simply explaining it better in the instructions would solve it, e.g.:
Possible response actions: MUTE_ROOM, ASK_CLAUDE, NONE, IGNORE
Response format should be formatted in a JSON block like this:
```json
{ "user": "lina", "text": string, "action": string }
```
Example:
```json
{ "user": "lina", "text": "sometext", "action": "ASK_CLAUDE" }
```
yep. hi @Elyx0 :)
Yeah, I had a similar question about the current approach for generateObject in packages/core/generation.ts. It looks like we're using a workaround instead of the { generateObject } method from "ai", which natively supports Zod schemas and ensures typing. This could be more reliable than the current method of using generateText to generate, parse, and retry until we get the desired output.
Using { generateObject } would allow us to eliminate the custom generateObject and generateObjectArray functions, simplifying the code and leveraging the AI SDK's structured output capabilities. Here’s the code as it stands now:
export async function generateObject({
    runtime,
    context,
    modelClass,
}: {
    runtime: IAgentRuntime;
    context: string;
    modelClass: string;
}): Promise<any> {
    if (!context) {
        elizaLogger.error("generateObject context is empty");
        return null;
    }
    let retryDelay = 1000;
    while (true) {
        try {
            const response = await generateText({
                runtime,
                context,
                modelClass,
            });
            const parsedResponse = parseJSONObjectFromText(response);
            if (parsedResponse) {
                return parsedResponse;
            }
        } catch (error) {
            elizaLogger.error("Error in generateObject:", error);
        }
        await new Promise((resolve) => setTimeout(resolve, retryDelay));
        retryDelay *= 2;
    }
}
My proposal is to replace it with the generateObject function provided in the AI SDK, as described below:
/**
 Generate JSON with any schema for a given prompt using a language model.
 This function does not stream the output. If you want to stream the output, use `streamObject` instead.
 @returns
 A result object that contains the generated object, the finish reason, the token usage, and additional information.
 */
declare function generateObject(options: Omit<CallSettings, 'stopSequences'> & Prompt & {
    output: 'no-schema';
    model: LanguageModel;
    mode?: 'json';
    experimental_telemetry?: TelemetrySettings;
    experimental_providerMetadata?: ProviderMetadata;
    _internal?: {
        generateId?: () => string;
        currentDate?: () => Date;
    };
}): Promise<GenerateObjectResult<JSONValue>>;
Switching to this method would improve reliability and reduce custom parsing logic. I'd be interested to hear your thoughts!
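For reference, a minimal sketch of what the call could look like with a Zod schema (the provider wiring, model name, and schema below are illustrative assumptions, not eliza's actual configuration):

```ts
import { openai } from "@ai-sdk/openai";
import { generateObject } from "ai";
import { z } from "zod";

// Hypothetical schema for the agent message discussed above.
const messageSchema = z.object({
  user: z.string(),
  text: z.string(),
  action: z.enum(["MUTE_ROOM", "ASK_CLAUDE", "NONE", "IGNORE"]),
});

export async function generateMessage(context: string) {
  // The SDK prompts for and validates structured output itself, so there is
  // no manual parse-and-retry loop here.
  const { object } = await generateObject({
    model: openai("gpt-4o-mini"), // placeholder; eliza would derive this from runtime/modelClass
    schema: messageSchema,
    prompt: context,
  });
  return object; // typed as z.infer<typeof messageSchema>
}
```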