feat: Provides support for OpenAI compatible APIs that support json_o…

Open largomst opened this issue 1 year ago • 7 comments

Introduce @instructor-ai/instructor to add support for OpenAI-compatible APIs that support json_object.

largomst avatar Dec 03 '24 09:12 largomst

How exactly does this differ from using OpenAI's library? You can already change the base URL.

jehna avatar Dec 03 '24 15:12 jehna

Right, this would use json_object instead of json_schema. AFAIK json_object results in many more mistakes in the JSON response, which makes it necessary to parse the result with Zod (whereas json_schema guarantees that the response matches the schema).

Using this approach would need much better error handling than what's implemented now (not a bad thing per se) and would result in retries, which would make it more expensive to run.

jehna avatar Dec 03 '24 15:12 jehna
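
The trade-off described in the comment above can be sketched roughly as follows. This is a minimal illustration, not code from humanify or Instructor: `parseRenameReply`, `renameWithRetries`, and the `ask` callback are hypothetical names, and the validation is done by hand here where the thread would use Zod. It assumes the model was asked for an object with a `newName` string.

```typescript
// Hypothetical helpers for illustration; none of these names exist in humanify.
type RenameResult = { newName: string };

// Validate a raw json_object reply by hand. Returns null on any mismatch,
// because json_object mode does not enforce a schema on the provider side.
function parseRenameReply(raw: string): RenameResult | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // the reply was not valid JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const newName = (parsed as Record<string, unknown>).newName;
  if (typeof newName !== "string" || newName.length === 0) return null;
  return { newName };
}

// Ask the model up to maxAttempts times. Every retry is another request,
// which is where the extra cost comes from.
async function renameWithRetries(
  ask: () => Promise<string>,
  originalName: string,
  maxAttempts = 3
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = parseRenameReply(await ask());
    if (result) return result.newName;
  }
  return originalName; // give up and keep the original identifier
}
```

With json_schema the provider is expected to return a matching object on the first try, so neither the manual check nor the retry loop would be needed.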

` (base) PS D:\local\h2\humanify-main> pnpm start openai --model="deepseek-chat" --apiKey="sk-454****" --baseURL="https://api.deepseek.com/v1" "D:\\\****\VM7477refixed.js"

[email protected] start D:\local\h2\humanify-main tsx src/index.ts "openai" "--model=deepseek-chat" "--apiKey=sk-********" "--baseURL=https://api.deepseek.com/v1" "D:\\\\\\\\VM7477refixed.js"

(node:21436) [DEP0040] DeprecationWarning: The punycode module is deprecated. Please use a userland alternative instead.
(Use node --trace-deprecation ... to show where the warning was created)
Processing file 1/1
Processing: 1%
[Instructor:ERROR] 2025-01-21T00:17:43.451Z: response model: Result - Validation issues: Validation error: Required at "newName"
file:///D:/local/h2/humanify-main/node_modules/.pnpm/[email protected]/node_modules/zod/lib/index.mjs:587
    const error = new ZodError(ctx.common.issues);
    ^

ZodError: [
  { "code": "invalid_type", "expected": "string", "received": "undefined", "path": [ "newName" ], "message": "Required" }
]
    at get error (file:///D:/local/h2/humanify-main/node_modules/.pnpm/[email protected]/node_modules/zod/lib/index.mjs:587:31)
    at m (D:\local\h2\humanify-main\node_modules.pnpm@[email protected][email protected][email protected][email protected]\node_modules@instructor-ai\instructor\src\instructor.ts:227:56)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at (D:\local\h2\humanify-main\src\plugins\openai\openai-rename.ts:31:24)
    at visitAllIdentifiers (D:\local\h2\humanify-main\src\plugins\local-llm-rename\visit-all-identifiers.ts:42:21)
    at (D:\local\h2\humanify-main\src\plugins\openai\openai-rename.ts:25:12)
    at unminify (D:\local\h2\humanify-main\src\unminify.ts:26:27)
    at Command. (D:\local\h2\humanify-main\src\commands\openai.ts:40:5) {
  issues: [ { code: 'invalid_type', expected: 'string', received: 'undefined', path: [ 'newName' ], message: 'Required' } ],
  addIssue: [Function (anonymous)],
  addIssues: [Function (anonymous)],
  errors: [ { code: 'invalid_type', expected: 'string', received: 'undefined', path: [ 'newName' ], message: 'Required' } ]
}

Node.js v22.5.1  ELIFECYCLE  Command failed with exit code 1.`

MohammedInTheCloud avatar Jan 21 '25 01:01 MohammedInTheCloud

@MohammedInTheCloud ?

jehna avatar Jan 21 '25 04:01 jehna

I encountered an error when trying to use @largomst's solution with the DeepSeek API. The error occurs because the API response doesn't include the required "newName" field, causing the Zod validation to fail.

Here's the error I got when running the command:

pnpm start openai --model="deepseek-chat" --apiKey="sk-454****" --baseURL="https://api.deepseek.com/v1" "D:\**\**\****\VM7477refixed.js"

The key error message is:

[Instructor:ERROR] Validation error: Required at "newName"
ZodError: [ { "code": "invalid_type", "expected": "string", "received": "undefined", "path": [ "newName" ], "message": "Required" } ]

Claude implemented several fixes to handle this issue:

  1. Added try/catch blocks around the API call
  2. Added null/undefined checking for the result
  3. Implemented fallback to return the original name if renaming fails
  4. Enhanced the system prompt for clearer response format requirements
  5. Added temperature parameter (0.2) for more consistent naming
  6. Improved logging for better debugging

import OpenAI from "openai";
import { visitAllIdentifiers } from "../local-llm-rename/visit-all-identifiers.js";
import { showPercentage } from "../../progress.js";
import { verbose } from "../../verbose.js";
import Instructor from "@instructor-ai/instructor";
import { z } from 'zod';

export function openaiRename({
  apiKey,
  baseURL,
  model,
  contextWindowSize
}: {
  apiKey: string;
  baseURL: string;
  model: string;
  contextWindowSize: number;
}) {
  const oai = new OpenAI({ apiKey, baseURL });
  const instructor = Instructor({
    client: oai,
    mode: "JSON",
  });

  return async (code: string): Promise<string> => {
    return await visitAllIdentifiers(
      code,
      async (name, surroundingCode) => {
        verbose.log(`Renaming ${name}`);
        verbose.log("Context: ", surroundingCode);

        try {
          const result = await instructor.chat.completions.create(
            toRenamePrompt(name, surroundingCode, model)
          );

          if (!result?.newName) {
            verbose.log(`Warning: Invalid response format for ${name}, keeping original name`);
            return name;
          }

          verbose.log(`Renamed to ${result.newName}`);
          return result.newName;
        } catch (error) {
          verbose.log(`Error renaming ${name}:`, error);
          return name; // Fallback to original name on error
        }
      },
      contextWindowSize,
      showPercentage
    );
  };
}

function toRenamePrompt(
  name: string,
  surroundingCode: string,
  model: string
) {
  const schema = z.object({
    newName: z.string({
      description: `The new name for the variable/function called \`${name}\``
    })
  });

  return {
    model,
    messages: [
      {
        role: "system",
        content: `You are a code reviewer tasked with improving code readability. Given a JavaScript/TypeScript identifier (variable or function name) and its surrounding code context, suggest a more descriptive name that reflects its purpose and usage. The current name is \`${name}\`. Respond with a JSON object containing the "newName" property.`
      },
      {
        role: "user",
        content: surroundingCode
      }
    ],
    response_model: {
      name: "Result",
      schema: schema
    },
    temperature: 0.2 // Lower temperature for more consistent naming
  };
}

MohammedInTheCloud avatar Jan 21 '25 10:01 MohammedInTheCloud

Successfully tested the tool on a large codebase (~30k LOC). It effectively deobfuscated the code using DeepSeek. Cost me about $0.60, and the result expanded to about 85k lines of code. Thanks @largomst @jehna

Now I can rewrite this code, thanks to you, @jehna!

MohammedInTheCloud avatar Jan 21 '25 14:01 MohammedInTheCloud

I get UnprocessableEntityError: 422 Failed to deserialize the JSON body into the target type: response_format: response_format.type json_schema is unavailable now at line 33 column 1

How can I fix this problem? @largomst @jehna @MohammedInTheCloud @ocxo @0xdevalias

diolegend avatar Jun 08 '25 14:06 diolegend
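
Going by the error text, the endpoint rejects `response_format.type` `json_schema`. One possible direction, in line with the json_object approach this issue proposes, is to send `json_object` instead and describe the expected shape in the prompt. The sketch below only builds the request body. `toJsonObjectRequest` and its types are hypothetical names, the field names follow the OpenAI chat completions format, and whether a given provider accepts them is an assumption that nothing in this thread confirms.

```typescript
// Hypothetical request builder; not part of humanify.
type ChatMessage = { role: "system" | "user"; content: string };

type JsonObjectRequest = {
  model: string;
  messages: ChatMessage[];
  response_format: { type: "json_object" };
};

function toJsonObjectRequest(
  name: string,
  surroundingCode: string,
  model: string
): JsonObjectRequest {
  return {
    model,
    messages: [
      {
        role: "system",
        // json_object mode carries no schema, so the prompt spells out the shape.
        content: `Rename the JavaScript identifier \`${name}\`. Respond only with a JSON object of the form {"newName": "<descriptive name>"}.`
      },
      { role: "user", content: surroundingCode }
    ],
    // Request plain JSON output instead of a schema-constrained response.
    response_format: { type: "json_object" }
  };
}
```

Because the reply is not schema-checked by the provider in this mode, it would still need to be validated (for example with Zod) before `newName` is used.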