godot icon indicating copy to clipboard operation
godot copied to clipboard

Hardening of `gdextension_interface.json` specification

Open Bromeon opened this issue 2 weeks ago • 3 comments

Tested versions

Updated version from PR #113697.

Issue description

Not directly a bug, but several things in the new JSON format (#107845) for the GDExtension specification might need some clarification before we go stable. It's fine if we leave parts unaddressed, but I'd like to make sure those are conscious decisions and not oversights 🙂


1. Handles: normal, const, uninitialized

{
    "name": "GDExtensionVariantPtr",
    "kind": "handle",
    "description": [/* ... */]
},
{
    "name": "GDExtensionConstVariantPtr",
    "kind": "handle",
    "parent": "GDExtensionVariantPtr",
    "const": true
},
{
    "name": "GDExtensionUninitializedVariantPtr",
    "kind": "handle"
},

There is a structural relation between ConstPtr -> Ptr, but not between UninitializedPtr -> Ptr. I wonder if we could model this directly?

{
    "name": "GDExtensionVariantPtr",
    "kind": "handle",
    "relations": {
        "const": "GDExtensionConstVariantPtr",
        "uninitialized": "GDExtensionUninitializedVariantPtr"
    }
},

Or also the inverse way, if the const information needs to be self-contained within GDExtensionConstVariantPtr.

But important is that uninitialized is mapped somehow. This allows code generators to create dedicated conversion functions between those types.

2. Enum size

C and C++ standards do not strictly specify enum size. The compiler must choose an underlying type that is big enough to represent all values. While this is typically uint32_t, it doesn't have to be.

For an ABI, it's however crucial that this is clearly specified. If the representation is the same for all enums (int32_t or uint32_t), it can be defined externally. If it varies between enums, we can add an optional underlying_type field, falling back to the external default.

3. Integer size

Even though most APIs use strictly-sized types, some have int:

{
    "name": "GDExtensionPtrUtilityFunction",
    "kind": "function",
    "return_value": {
        "type": "void"
    },
    "arguments": [
        // ...
        {
            "name": "p_argument_count",
            "type": "int"
        }
    ]
},

Many others use GDExtensionInt which is int64_t:

{
    "name": "GDExtensionClassMethodCall",
    "kind": "function",
    "return_value": {
        "type": "void"
    },
    "arguments": [
        // ...
        {
            "name": "p_argument_count",
            "type": "GDExtensionInt"
        },
    ]
},

Then again others use int64_t:

{
    "name": "variant_construct",
    "return_value": {
        "type": "void"
    },
    "arguments": [
        // ...
        {
            "name": "p_argument_count",
            "type": "int32_t",
            "description": [
                "The number of arguments to pass to the constructor."
            ]
        },
        // ...
    ],
},

I'm not so much concerned about the 32/64-bit inconsistency, we anyway can't change that without breaking the ABI.

However, I do wonder whether:

  1. Can we rely on int being int32_t for targets that Godot is built for? Or is there a case differentiation necessary? If the latter, how do we know that without having a C compiler?

  2. Does GDExtensionInt have an actual purpose, or could we replace it with int64_t directly? Unlike GDExtensionBool or GDExtensionObjectID, GDExtensionInt doesn't represent its own "class" of types, but is used interchangeably with 64-bit integers. Removing this type would decrease the cognitive load and extra indirections in generated code.

Yes, I'm aware these problems were already present in the .h version, but the important difference was that there is typically standard tooling for C header interop with other languages, which can take care of some of the details. Now, JSON parsers have to be hand-written, and the burden of interpreting the spec correctly is on the binding authors. Either way, I think it would be good to clarify these integers.

4. Handle kind underspecified?

Opaque pointers are represented with key-value pair "kind": "handle". The term "handle" is quite abstract and can include integers, too. I get that this is the whole point, but this abstraction is leaky already: we need to know the size of the type, and it's sometimes necessary to convert between pointers of the same group.

Maybe it should be called opaque_ptr, ptr_handle or so? This would imply that it corresponds to the native pointer size on the corresponding platform. Or we clearly document that somewhere,,, :thinking:

5. References to C++ code

There are see references:

{
    "name": "variant_iter_next",
    // ...
    "see": [
        "Variant::iter_next()"
    ]
},

These all refer to C++ symbols at the moment. Is there a chance that we need cross-references between symbols in the GDExtension C API (i.e. the same JSON) in the future? If yes, can we differentiate them somehow (apart from :: :sweat:)?

6. Enum vs. flags

The header defines some flags like GDExtensionClassMethodFlags (NORMAL, CONST, STATIC, ...).

In the extension_api.json, there is a distinction between enums and flags: enums can only take exactly one of the finite states (+ future additions), while flags exist in a | superposition. I'm not sure if the distinction is relevant here? On a C level, both can be represented as integer constants...

So enum might be good enough, if we don't expect code generators to treat them any different (e.g. | operator overloading or so).

Bromeon avatar Dec 07 '25 22:12 Bromeon

Thanks for digging into this!

I think some of these should be "fixed" with documentation (which we should try to make before stable), and others are consistency issues we can work on at any point. The stuff we should try to change in Godot before 4.6-stable should be focused on the format.

But important is that uninitialized is mapped somehow. This allows code generators to create dedicated conversion functions between those types.

That makes sense!

I think we should do it like with with "const" and refer to the "parent":

{
    "name": "GDExtensionUninitializedVariantPtr",
    "kind": "handle"
    "parent": "GDExtensionVariantPtr",
    "uninitialized": true
},

We evolve the API by adding new things and deprecating old things, and a "top-down" approach would be too limiting if we wanted to add a new type to replace an old one.

Even though most APIs use strictly-sized types, some have int:

int is unfortunate - those should probably all be int32_t. At the very least, we should document int to be 32-bit.

  1. Does GDExtensionInt have an actual purpose, or could we replace it with int64_t directly?

I don't know for sure, but I suspect the idea was that GDExtensionInt was meant be the native type of an int from a Godot Variant. But it's currently used so inconsistently. This should be cleaned up at some point, but I don't think we need to do it before 4.6-stable. While we could stop using the type, we can't remove it, since it needs to remain in the legacy header.

  1. Enum size

I think we should document that int32_t is the underlying type

  1. Handle kind underspecified?

I think we should explain "handle" in docs. Even if a name change could help guess the size a little easier, I think the name "handle" is fine and this is something that should be documented anyway.

I kind of wish we had originally specified these to be 64-bit like some other APIs do. For example, in OpenXR, the handles are used as opaque pointers by runtimes (although, I don't think they have to be?), but on 32-bit platforms they use uint64_t so they are same size regardless.

But, for now, we're stuck saying they are 32-bit when built for 32-bit, and 64-bit when built for 64-bit.

These all refer to C++ symbols at the moment. Is there a chance that we need cross-references between symbols in the GDExtension C API (i.e. the same JSON) in the future?

Hm, I'm not sure. Right now, these are used as a way to keep explanation minimal here, and send folks elsewhere to get more information. I think if we wanted to explain more here, we'd probably do it in "description"?

  1. Enum vs. flags

I think calling them enums probably is enough, but we could add a "bitfield": true just in case? And, actually, if we do this, then we should document that bitfields are uint32_t rather than int32_t

dsnopek avatar Dec 08 '25 14:12 dsnopek

Thanks for the reply!


We evolve the API by adding new things and deprecated old things, and a "top-down" approach would be too limiting if we wanted to add a new type to replace an old one.

Makes sense 👍


int is unfortunate - those should probably all be int32_t. At the very least, we should document int to be 32-bit.

This should be cleaned up at some point, but I don't think we need to do it before 4.6-stable. While we could stop using the type, we can't remove it, since it needs to remain in the legacy header.

The legacy header needs to be constructible with information just from the JSON, right? If yes, then I'd do it like you say: keep the type definition around, mark it deprecated, and replace its occurrences in functions with int64_t.

While it's not absolutely necessary to clean up the types now, I do think there's value in having an initial JSON version that is easier to understand and implement. If a binding maintainer switches from .h to .json, they will likely do it as part of the Godot 4.6 cycle. And since manual type mapping is necessary (C -> binding language), it's nice if not everyone has to do research about these edge cases. Types like int32_t, int64_t etc. are straightforward and unambiguous.

I can gladly help in these efforts, btw 🙂


I think we should explain "handle" in docs. Even if a name change could help guess the size a little easier, I think the name "handle" is fine and this is something that should be documented anyway.

Sounds good, if we document it clearly 😎


I think we should document that int32_t is the underlying type

I think calling them enums probably is enough, but we could add a "bitfield": true just in case? And, actually, if we do this, then we should document that bitfields are uint32_t rather than int32_t

That sounds great, so underlying type is implied by kind="enum" and is_bitfield field.

I would call the field is_bitfield and not bitfield for consistency with extension_api.json. Maybe this also applies to const that could be named is_const like in the API JSON? Then also is_uninitialized.

Bromeon avatar Dec 08 '25 15:12 Bromeon

I created a PR based on my comment above:

https://github.com/godotengine/godot/pull/113754

However, that was before I saw @Bromeon's reply :-)

The legacy header needs to be constructible with information just from the JSON, right?

Yes. There's even the "legacy_type_name" field in there so that we can preserve some the type-o's in names :-)

If yes, then I'd do it like you say: keep the type definition around, mark it deprecated, and replace its occurrences in functions with int64_t.

I think we still need to spend some time thinking about if the GDExtensionInt type makes sense (or, at least, I do - I don't feel like I've fully evaluated the situation yet)

I would call the field is_bitfield and not bitfield for consistency with extension_api.json. Maybe this also applies to const that could be named is_const like in the API JSON? Then also is_uninitialized.

This makes sense to me! ~~I'll update my PR later~~

UPDATE: I've updated the PR!

dsnopek avatar Dec 08 '25 15:12 dsnopek