Supporting more APIs via more advanced serialization options
Hello 😊
I am working on a Jai client for the OpenAI API, but it seems it's currently not possible to represent the format the API requires. I am happy to send a PR your way, but wanted to discuss the approach first.
The core of the OpenAI chat completion API can be represented as:
Msg :: struct {
    role: Role; // string
    content: []Msg_Content;
}
Msg_Content :: struct {
    type: Msg_Content_Type; // string
    union {
        text: string;
        image_url: struct {
            url: string;
            detail: Img_Detail = IMG_DETAIL_AUTO; // string
        };
        input_audio: struct {
            data: string;
            format: Audio_Format = AUDIO_FORMAT_MP3; // string
        };
    }
}
So (part of) the JSON sent to the API might look like:
"messages": [
    {
        "role": "developer",
        "content": "You are a helpful assistant."
    },
    {
        "role": "user",
        "content": [ { "type": "text", "text": "Hello there, ChatGPT" } ]
    }
]
Now the issue is that if you serialize this kind of struct the output JSON will include all the union fields, and we have no way of filtering them out at serialization time.
You can represent this another way, which is to have content: []*Msg_Content_Base and to have a struct per union field, but currently the serializer doesn't cast the pointer to the derived type (not sure if even possible) and so we only get the base.
A few solutions come to mind (but please feel free to suggest others):
- Add a new note (or extend the ignore one) that accepts the member info (like ignore) and the data of the parent struct being serialized. This way, we can ignore not only based on type, but also based on the values of other fields in the struct. One can use this to make both pointers and unions work here.
- Add the ability to use custom serializers for certain types, probably by 'registering' a type serializer with Jaison. To avoid the user having to implement tons of code, we can make it a procedure that returns Any, which then gets serialized normally by Jaison. With this, we can implement the OpenAI API by taking the base pointer and returning the derived Msg_Content_* type, which allows Jaison to serialize the full info. Another thing this allows is to, say, make all UUID :: [16]u8 members get serialized as strings, by having the custom serializer generate and return a string to Jaison.
  - A similar thing is needed for parsing as well, to ensure we support all kinds of API responses.
Obviously, those two are complementary, and each gives an ability the other doesn't. Either would allow us to support the OpenAI API, but we probably want both for a full backend (e.g. it would be convenient to have response structs use Apollo_Time and UUID internally, but automatically convert them into unix time and strings for the outside world at serialization time).
Looking forward to hearing your thoughts :)
Edit:
By modifying ignore to take the data pointer (2-3 line change), this now works to print the correct struct:
custom_ignore_by_note :: (member: *Type_Info_Struct_Member, data: *void) -> bool {
    // Offset of the anonymous union block inside Msg_Content, computed at compile time.
    UNION_OFFSET_IN_BYTES :: #run -> int {
        members := type_info(Msg_Content).members;
        for members {
            if it.name == "" && it.type.type == .STRUCT && it.flags == .USING return it.offset_in_bytes;
        }
        assert(false, "failed to find start of the union block of the OpenAI Msg_Content struct");
        return 0;
    }
    for note: member.notes {
        if note == "JsonIgnore" return true;
        // Keep scanning the remaining notes, so a later "JsonIgnore" is not missed.
        if note != "openai_msg_content_union" continue;
        // data points at the union member; step back to the start of the parent struct.
        msg := (data - UNION_OFFSET_IN_BYTES).(*Msg_Content);
        if msg.type == {
            case MSG_CONTENT_TYPE_TEXT;
                return member.name != "text";
            case MSG_CONTENT_TYPE_IMG_URL;
                return member.name != "image_url";
            case MSG_CONTENT_TYPE_INPUT_AUDIO;
                return member.name != "input_audio";
            case;
                log("unhandled OpenAI Msg_Content_Type '%' type in custom_ignore_by_note", msg.type, flags=.ERROR);
                return false;
        }
    }
    return false;
}
Now, it's a bit dangerous and painful to get the proper parent offset (as union members have an offset of zero), so if we implement this it's better to pass both the member and the parent data pointers.
Having used this for a bit, it's clear this setup gets us very far: I have now implemented the core of the OpenAI and Anthropic APIs, although not very ergonomically (too many unions and custom ignores). I would also suggest we do the same (pass the data pointer) for the renamer, as you need similar flexibility there as well.
Hmm… this is an interesting problem.
The simplest way to do this with the current feature set is to use a JSON_Value for content:
Msg :: struct {
    role: Role;
    content: []JSON_Value;
    // or:
    // content: JSON_Value;
}
This works for both serializing and deserializing JSON data with a flexible or changing format.
The other approach would be to introduce a remap argument of the form:
remap_fn :: #type (member: *Type_Info_Struct_Member, data: *void) -> (data: *void, data_type: *Type_Info);
that does the transformation on the fly. But I don’t really like that approach. I haven’t tried it, but I suspect writing these remap functions would be error-prone and hard to maintain.
I’d rather use the JSON_Value approach and write a handful of helper functions that generate a JSON_Value from a more strongly typed content variant and vice versa.
Regardless of which approach you take, it might be worth adding Any support for serialization! Then you could at least skip generating a JSON_Value from a concrete message type and just use the message data pointer directly.
(The deserialization could automatically generate a JSON_Value when it encounters an Any member, so you can re-use the same message struct for both directions. But apart from that it doesn’t offer much benefit.)
Interesting ideas 🤔
Let me share my goals so perhaps we can have a shared vision to discuss and work towards.
The core of what I want is for Jai to become a serious contender for backend/web development.
For this to happen I think we need a few important things:
- Feature complete (i.e. can represent and (de)serialize any JSON)
- High performance JSON (de)serialization
- At least as ergonomic as other languages (e.g. Go), if not more
Now ideally you get all 3, but I think feature completeness and ergonomics are the two most critical. I can always add another server if I want scale, but I can't make developers use the language if they have to write tons of extra functions every time they integrate an API.
Potential solutions:
- JSON_Value. I don't know how to feel about this tbh, I personally always prefer typed stuff. It also means we need to write extra functions to read/write our structs, and it prevents us from supporting multiple serialization formats: we are locked in to JSON.
- Remap. I kinda like this, but maybe it's too low level to work for all cases? Like, what if two fields have the same name in the main struct and a nested one? (Can we detect that?) Also, I would rather avoid potentially tens of function calls each time we (de)serialize, if possible.
- Just like Go, offer the ability to provide struct-level serialize/deserialize functions. I like this because it allows serialization to involve logic across fields, making it very flexible. It's usually used by making a custom inline struct, like so: https://gist.github.com/bojand/5535e81bfca5d9a89f1b4078f229e1ea.
serialize :: (val: Any) -> json: string, ok: bool
deserialize :: (json: string) -> val: Any, ok: bool
- (Inspired by Json Writer/Reader from C#.) Allow users to 'submit' key/value pairs to be written to the final string. Perhaps combining this with the previous point can give us a flexible interface while also minimizing work for the user?
write_kv("x", 5);
write_kv("y", "jaison");
write_kv("z", my_struct.sub_struct);
A few thoughts on ergonomics:
- Having one struct to represent a request/response is a huge plus.
- We should avoid structs producing invalid results because someone didn't pass the proper remap/serialize/etc. function when (de)serializing. Ideally, structs can register/contain serialization functions that are used by default, but that can also be overridden.
I am leaning towards a registered function (to provide defaults) for a type with a field read/write interface (which is probably easy to do with the current Jaison setup). Most users will use the current functions, while advanced users register a single function, get the value as input, and use the writer interface to serialize.
P.S. We need to support custom (de)serialization on the Type (not struct) level. Simplest example is needing all UUIDs ([16]u8) to be serialized as strings across all types. We don't want to write custom serialization for every struct that has a UUID member!