OpenAI structured outputs support
Feature Request
I've been working with typia.llm.schema for a while and it has been extremely helpful in generating JSON schemas to call LLMs from TS types. However, the new structured outputs API of OpenAI has some limitations in the type of schemas it can take.
In particular, nullable is not taken into account. So it'd be great if we could map types X | null to anyOf, maybe by introducing a new flag to the typia.llm.schema function.
Also, for types that don't extend from Record, we should mark [additionalProperties to false](https://platform.openai.com/docs/guides/structured-outputs/additionalproperties-false-must-always-be-set-in-objects).
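A minimal sketch of the requested output, assuming a simplified schema shape. The `Schema` interface and the `toStructuredOutput` helper are hypothetical names for illustration only, not part of typia, and the sketch treats every object with declared properties as a non-Record type.

```typescript
// Hypothetical, simplified schema shape for illustration; not typia's ILlmSchema.
interface Schema {
  type?: "string" | "number" | "boolean" | "null" | "object";
  nullable?: boolean;
  properties?: Record<string, Schema>;
  required?: string[];
  additionalProperties?: boolean;
  anyOf?: Schema[];
}

// Rewrites `nullable: true` into an `anyOf` with a null branch, and closes
// every object that declares fixed properties.
function toStructuredOutput(schema: Schema): Schema {
  const { nullable, ...rest } = schema;
  const next: Schema = { ...rest };
  if (next.properties !== undefined) {
    const properties: Record<string, Schema> = {};
    for (const [key, value] of Object.entries(next.properties))
      properties[key] = toStructuredOutput(value);
    next.properties = properties;
    next.additionalProperties = false;
  }
  if (next.anyOf !== undefined) next.anyOf = next.anyOf.map(toStructuredOutput);
  return nullable === true ? { anyOf: [next, { type: "null" }] } : next;
}

const converted = toStructuredOutput({
  type: "object",
  properties: { thumbnail: { type: "string", nullable: true } },
  required: ["thumbnail"],
});
console.log(JSON.stringify(converted, null, 2));
```

Here the `thumbnail` property becomes an `anyOf` of a string branch and a null branch, and the enclosing object gets `additionalProperties: false`.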
I can contribute to this feature, but I may need some pointers for code references to start.
A T | null type cannot be written as a oneOf type under the JSON schema specification of OpenAPI v3.0, which OpenAI has adopted. Writing T | null as a oneOf type is allowed only since the JSON schema 2020-12 draft version (of OpenAPI v3.1).
By the way, does OpenAI understand only the anyOf type? Currently, @samchon/openapi and typia use the oneOf type for the TypeScript union type case, because oneOf has a clearer meaning than anyOf.
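To illustrate the difference in meaning, here is a small sketch that models each union branch as a predicate. The helper names are made up for this example. In JSON schema, anyOf accepts a value matching at least one branch, while oneOf requires exactly one match.

```typescript
type Predicate = (value: unknown) => boolean;

// anyOf: at least one branch accepts the value.
const matchesAnyOf = (branches: Predicate[], value: unknown): boolean =>
  branches.some((accepts) => accepts(value));

// oneOf: exactly one branch accepts the value.
const matchesOneOf = (branches: Predicate[], value: unknown): boolean =>
  branches.filter((accepts) => accepts(value)).length === 1;

const isNull: Predicate = (v) => v === null;
const isString: Predicate = (v) => typeof v === "string";
const isNonEmptyString: Predicate = (v) =>
  typeof v === "string" && v.length > 0;

// Disjoint branches (string | null): both keywords agree.
console.log(matchesAnyOf([isString, isNull], null));
console.log(matchesOneOf([isString, isNull], null));

// Overlapping branches: "hi" matches two branches, so oneOf rejects it.
console.log(matchesAnyOf([isString, isNonEmptyString], "hi"));
console.log(matchesOneOf([isString, isNonEmptyString], "hi"));
```

For a T | null union the branches never overlap, so switching between oneOf and anyOf does not change which values are accepted.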
Also, setting additionalProperties to false needs a little care.
additionalProperties := false means that no superfluous properties are allowed. Under the validation rule, if a value has any extra property that is not defined in properties, it must be considered invalid.
Therefore, if you want to contribute to typia.llm.application<App>() and typia.llm.schema<T>() functions, you have to be careful about the rule.
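The rule described above can be sketched as a tiny check. `ObjectSchema` and `findSuperfluousKeys` are hypothetical names for this illustration, not typia or @samchon/openapi APIs.

```typescript
// Hypothetical minimal object schema; only the fields this check needs.
interface ObjectSchema {
  properties: Record<string, unknown>;
  additionalProperties?: boolean;
}

// Returns the keys that make `value` invalid under `additionalProperties: false`.
function findSuperfluousKeys(
  schema: ObjectSchema,
  value: Record<string, unknown>,
): string[] {
  if (schema.additionalProperties !== false) return []; // extra keys tolerated
  return Object.keys(value).filter(
    (key) => !Object.prototype.hasOwnProperty.call(schema.properties, key),
  );
}

const closed: ObjectSchema = {
  properties: { title: {}, body: {} },
  additionalProperties: false,
};
// Reports "extra", because it is not declared in `properties`.
console.log(findSuperfluousKeys(closed, { title: "a", body: "b", extra: 1 }));
```

A non-empty result means the value must be rejected. When the flag is absent, the same value is accepted.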
Here is the code filling the ILlmSchema.IObject.additionalProperties property, and you can accomplish what you want just by changing the return type of the join() function from ILlmSchema | undefined to ILlmSchema | false.
https://github.com/samchon/typia/blob/8edeef57da6a121e52412da556074335dd93ef3a/src/programmers/internal/llm_schema_object.ts#L98-L121
I haven't found any info about whether OpenAI supports oneOf. They do mention that they support anyOf, but I agree that oneOf should be the right type (it doesn't make any sense for a value to be null and not null at the same time). That's why I was suggesting putting this behavior change under a flag, or making the user explicitly ask for it, because it deviates from the OpenAPI and JSON schema standards.
For example, the OpenAI SDK has the following unit tests: https://github.com/openai/openai-python/blob/aeaed488352274a9ca86c834eeb618d732989518/tests/lib/test_pydantic.py#L65
There are no references to oneOf, unfortunately.
More info: https://community.openai.com/t/official-documentation-for-supported-schemas-for-response-format-parameter-in-calls-to-client-beta-chats-completions-parse/932422/4
How about the other models?
Google Gemini uses the JSON schema specified by OpenAPI v3.0.3, but it does not support oneOf.
- https://ai.google.dev/gemini-api/docs/function-calling
- https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling#function-declarations
OpenAI sometimes looks like it uses OpenAPI v3.1, and sometimes v3.0. It supports mixed-in types embodied by type: ["string", "null"], but it does not support the tuple type embodied by { type: "array", prefixItems: [A, B, C] }. I need to study and test OpenAI deeply next weekend.
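For reference, the two shapes mentioned above look like this, along with a naive checker for the tuple keyword. `hasPrefixItems` is a hypothetical helper written for this sketch. It simply walks every nested value, so it would also flag an object property that happens to be named prefixItems.

```typescript
// string | null written as a mixed-in type array.
const nullableString = { type: ["string", "null"] };

// [string, number] written as a tuple with prefixItems.
const tuple = {
  type: "array",
  prefixItems: [{ type: "string" }, { type: "number" }],
};

// Naive recursive walk: true if any nested object has a `prefixItems` key.
function hasPrefixItems(schema: unknown): boolean {
  if (Array.isArray(schema)) return schema.some(hasPrefixItems);
  if (typeof schema !== "object" || schema === null) return false;
  if ("prefixItems" in schema) return true;
  return Object.values(schema).some(hasPrefixItems);
}

console.log(hasPrefixItems(nullableString));
console.log(hasPrefixItems({ type: "object", properties: { pair: tuple } }));
```

A check like this could run before sending a schema to a provider that rejects tuples.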
To support LLM function calling feature exactly, I should separate the providers like below.
- Top level namespaces
  - typia.openai.application<App>(): ILlmApplication<IOpenAiSchema>
  - typia.gemini.application<App>(): ILlmApplication<IGeminiSchema>
  - typia.llama.application<App>(): ILlmApplication<ILlamaSchema>
- Nested namespaces
  - typia.llm.openai.application<App>()
  - typia.llm.gemini.application<App>()
  - typia.llm.llama.application<App>()
- Generic Argument
  - typia.llm.application<App, "openai">()
  - typia.llm.application<App, "gemini">()
  - typia.llm.application<App, "llama">()
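A rough type-level sketch of the "Generic Argument" option, showing how a single model name could select the schema type. Every type and function below is a hypothetical placeholder. The real feature would be driven by typia's compile-time transformer, which this runtime stand-in does not attempt to reproduce.

```typescript
// Placeholder schema types; the real provider schemas would be far richer.
interface IOpenAiSchema { type?: string; anyOf?: IOpenAiSchema[] }
interface IGeminiSchema { type?: string; nullable?: boolean }
interface ILlamaSchema { type?: string; oneOf?: ILlamaSchema[] }

// One lookup table drives the generic argument.
interface SchemaByModel {
  openai: IOpenAiSchema;
  gemini: IGeminiSchema;
  llama: ILlamaSchema;
}
type Model = keyof SchemaByModel;

// Hypothetical application shape, parameterized by the model name.
interface ApplicationSketch<M extends Model> {
  model: M;
  functions: { name: string; parameters: SchemaByModel[M][] }[];
}

// Stand-in for the transformer: it only carries the model tag and functions.
function createApplication<M extends Model>(
  model: M,
  functions: ApplicationSketch<M>["functions"] = [],
): ApplicationSketch<M> {
  return { model, functions };
}

const gemini = createApplication("gemini", [
  { name: "erase", parameters: [{ type: "string", nullable: true }] },
]);
// Passing `{ nullable: true }` with "openai" would be a compile error,
// because `nullable` is not declared in IOpenAiSchema.
console.log(gemini.model, gemini.functions.length);
```

With this shape, adding a provider means adding one entry to the lookup table rather than a new namespace.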
@antoniomdk If you send a PR about additionalProperties, I'll accept it.
Also, for handling each specific LLM provider's schema, I'll prepare a major update.
It would be @samchon/[email protected] and [email protected].
@samchon That sounds great! I think the LLM-specific separation makes a lot of sense. I'll send a PR for additionalProperties by EOW (probably during the weekend).
Related to LLM structured outputs, I find that when prompting I often want to use the jsdoc comment for a type in the prompt. Can typia add a misc method for returning the jsdoc string of a particular type?
Using typia.reflect.metadata can get you this information, but it'd be nice to just get the jsdoc comment.
@antoniomdk, @bradleat https://github.com/samchon/openapi/blob/v2.0/src/structures/IChatGptSchema.ts
I'm preparing the OpenAI dedicated schema type as IChatGptSchema in the next version of @samchon/openapi and typia.
Here is the type. I'll test it using the ChatGPT API, while considering the things below.
- Whether to adopt the $ref type for every named schema, or only for the recursive types
- Whether to use just the oneOf type and its discriminator property for clear union type predication
- Whether to use the const type or the enum property
  - OpenAI's document supports JSON schema v7 specification (OpenApi.IJsonSchema)
  - However, the examples of OpenAI use only anyOf
  - Also, const is clearer than enum, but the examples only utilize enum
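On the const versus enum question, one possible direction is to collapse a union of const branches into a single enum, since the OpenAI examples only show enum. The sketch below assumes that approach. `ConstSchema`, `EnumSchema`, and `constsToEnum` are hypothetical names, not types from @samchon/openapi.

```typescript
type Literal = string | number | boolean;

// Hypothetical minimal shapes for the two keywords.
interface ConstSchema { const: Literal }
interface EnumSchema { enum: Literal[] }

// Collapses a union of `const` branches into one `enum`, dropping duplicates.
function constsToEnum(branches: ConstSchema[]): EnumSchema {
  const values: Literal[] = [];
  for (const branch of branches)
    if (!values.includes(branch.const)) values.push(branch.const);
  return { enum: values };
}

console.log(
  JSON.stringify(constsToEnum([{ const: "a" }, { const: "b" }, { const: "a" }])),
);
```

The trade-off is that per-branch metadata, such as a description on each const, has nowhere to go once the branches are merged.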
If you want to try it early, install the typia@next version and call typia.llm.application<App, "chatgpt">().
npm install typia@next
Here is an example use case of the IChatGptSchema currently under consideration.
Source Code
import {
  ChatGptTypeChecker,
  IChatGptSchema,
  ILlmApplication,
} from "@samchon/openapi";
import typia, { tags } from "typia";

const app: ILlmApplication<"chatgpt"> = typia.llm.application<
  BbsArticleController,
  "chatgpt"
>({
  separate: (schema: IChatGptSchema) =>
    ChatGptTypeChecker.isString(schema) &&
    schema.contentMediaType !== undefined,
});
console.log(app);

interface BbsArticleController {
  /**
   * Create a new article.
   *
   * Writes a new article and archives it into the DB.
   *
   * @param input Information of the article to create
   * @returns Newly created article
   */
  create(input: IBbsArticle.ICreate): Promise<IBbsArticle>;

  /**
   * Update an article.
   *
   * Updates an article with new content.
   *
   * @param id Target article's {@link IBbsArticle.id}
   * @param input New content to update
   */
  update(
    id: string & tags.Format<"uuid">,
    input: IBbsArticle.IUpdate,
  ): Promise<void>;

  /**
   * Erase an article.
   *
   * Erases an article from the DB.
   *
   * @param id Target article's {@link IBbsArticle.id}
   */
  erase(id: string & tags.Format<"uuid">): Promise<void>;
}

/**
 * Article entity.
 *
 * `IBbsArticle` is an entity representing an article in the BBS (Bulletin Board System).
 */
interface IBbsArticle extends IBbsArticle.ICreate {
  /**
   * Primary Key.
   */
  id: string & tags.Format<"uuid">;

  /**
   * Creation time of the article.
   */
  created_at: string & tags.Format<"date-time">;

  /**
   * Last updated time of the article.
   */
  updated_at: string & tags.Format<"date-time">;
}
namespace IBbsArticle {
  /**
   * Information of the article to create.
   */
  export interface ICreate {
    /**
     * Title of the article.
     *
     * Representative title of the article.
     */
    title: string;

    /**
     * Content body.
     *
     * Content body of the article writtn in the markdown format.
     */
    body: string;

    /**
     * Thumbnail image URI.
     *
     * Thumbnail image URI which can represent the article.
     *
     * If configured as `null`, it means that no thumbnail image in the article.
     */
    thumbnail:
      | null
      | (string & tags.Format<"uri"> & tags.ContentMediaType<"image/*">);
  }

  /**
   * Information of the article to update.
   *
   * Only the filled properties will be updated.
   */
  export type IUpdate = Partial<ICreate>;
}
Compiled Code
import * as __typia_transform__llmApplicationFinalize from "typia/lib/internal/_llmApplicationFinalize.js";
import { ChatGptTypeChecker } from "@samchon/openapi";
import typia from "typia";
const app = (() => {
  const app = {
    model: "chatgpt",
    functions: [
      {
        name: "create",
        parameters: [
          {
            $ref: "#/$defs/IBbsArticle.ICreate",
            description: "Information of the article to create",
            $defs: {
              "IBbsArticle.ICreate": {
                type: "object",
                properties: {
                  title: {
                    type: "string",
                    title: "Title of the article",
                    description:
                      "Title of the article.\n\nRepresentative title of the article.",
                  },
                  body: {
                    type: "string",
                    title: "Content body",
                    description:
                      "Content body.\n\nContent body of the article writtn in the markdown format.",
                  },
                  thumbnail: {
                    oneOf: [
                      {
                        type: "null",
                      },
                      {
                        type: "string",
                        format: "uri",
                        contentMediaType: "image/*",
                      },
                    ],
                    title: "Thumbnail image URI",
                    description:
                      "Thumbnail image URI.\n\nThumbnail image URI which can represent the article.\n\nIf configured as `null`, it means that no thumbnail image in the article.",
                  },
                },
                required: ["title", "body", "thumbnail"],
                description: "Information of the article to create.",
                additionalProperties: false,
              },
            },
          },
        ],
        output: {
          $ref: "#/$defs/IBbsArticle",
          description: "Newly created article",
          $defs: {
            IBbsArticle: {
              type: "object",
              properties: {
                id: {
                  type: "string",
                  format: "uuid",
                  title: "Primary Key",
                  description: "Primary Key.",
                },
                created_at: {
                  type: "string",
                  format: "date-time",
                  title: "Creation time of the article",
                  description: "Creation time of the article.",
                },
                updated_at: {
                  type: "string",
                  format: "date-time",
                  title: "Last updated time of the article",
                  description: "Last updated time of the article.",
                },
                title: {
                  type: "string",
                  title: "Title of the article",
                  description:
                    "Title of the article.\n\nRepresentative title of the article.",
                },
                body: {
                  type: "string",
                  title: "Content body",
                  description:
                    "Content body.\n\nContent body of the article writtn in the markdown format.",
                },
                thumbnail: {
                  oneOf: [
                    {
                      type: "null",
                    },
                    {
                      type: "string",
                      format: "uri",
                      contentMediaType: "image/*",
                    },
                  ],
                  title: "Thumbnail image URI",
                  description:
                    "Thumbnail image URI.\n\nThumbnail image URI which can represent the article.\n\nIf configured as `null`, it means that no thumbnail image in the article.",
                },
              },
              required: [
                "id",
                "created_at",
                "updated_at",
                "title",
                "body",
                "thumbnail",
              ],
              description:
                "Article entity.\n\n`IBbsArticle` is an entity representing an article in the BBS (Bulletin Board System).",
              additionalProperties: false,
            },
          },
        },
        description:
          "Create a new article.\n\nWrites a new article and archives it into the DB.",
      },
      {
        name: "update",
        parameters: [
          {
            type: "string",
            format: "uuid",
            description: "Target article's ",
          },
          {
            $ref: "#/$defs/PartialIBbsArticle.ICreate",
            description: "New content to update",
            $defs: {
              "PartialIBbsArticle.ICreate": {
                type: "object",
                properties: {
                  title: {
                    type: "string",
                    title: "Title of the article",
                    description:
                      "Title of the article.\n\nRepresentative title of the article.",
                  },
                  body: {
                    type: "string",
                    title: "Content body",
                    description:
                      "Content body.\n\nContent body of the article writtn in the markdown format.",
                  },
                  thumbnail: {
                    oneOf: [
                      {
                        type: "null",
                      },
                      {
                        type: "string",
                        format: "uri",
                        contentMediaType: "image/*",
                      },
                    ],
                    title: "Thumbnail image URI",
                    description:
                      "Thumbnail image URI.\n\nThumbnail image URI which can represent the article.\n\nIf configured as `null`, it means that no thumbnail image in the article.",
                  },
                },
                description: "Make all properties in T optional",
                additionalProperties: false,
              },
            },
          },
        ],
        description:
          "Update an article.\n\nUpdates an article with new content.",
      },
      {
        name: "erase",
        parameters: [
          {
            type: "string",
            format: "uuid",
            description: "Target article's ",
          },
        ],
        description: "Erase an article.\n\nErases an article from the DB.",
      },
    ],
    options: {
      separate: null,
    },
  };
  __typia_transform__llmApplicationFinalize._llmApplicationFinalize(app, {
    separate: (schema) =>
      ChatGptTypeChecker.isString(schema) &&
      schema.contentMediaType !== undefined,
  });
  return app;
})();
console.log(app);
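As a small illustration of consuming the compiled result, the sketch below prints one summary line per function. It assumes only the `name`, `description`, and `parameters` fields visible in the output above. `FunctionLike` and `summarize` are hypothetical names, and the sample data is copied by hand from that output rather than produced by typia.

```typescript
// Hypothetical minimal view of one entry in `app.functions`.
interface FunctionLike {
  name: string;
  description?: string;
  parameters: unknown[];
}

// One line per function: name, parameter count, first line of the description.
function summarize(functions: FunctionLike[]): string[] {
  return functions.map((func) => {
    const headline = (func.description ?? "").split("\n")[0];
    return `${func.name}(${func.parameters.length}): ${headline}`;
  });
}

const summary = summarize([
  {
    name: "create",
    description:
      "Create a new article.\n\nWrites a new article and archives it into the DB.",
    parameters: [{}],
  },
  {
    name: "update",
    description: "Update an article.\n\nUpdates an article with new content.",
    parameters: [{}, {}],
  },
]);
console.log(summary.join("\n"));
```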
https://github.com/samchon/openapi
@antoniomdk, @bradleat Please visit and check the IChatGptSchema structure.
The next version will be published next week.
Upgrade to v7, and everything will be solved.
- https://typia.io/docs/llm/application/
- https://typia.io/docs/llm/parameters/
- https://typia.io/docs/llm/schema/