glTF Define a way to provide unique identifiers for glTF data (nodes/etc)

Continuing from the discussion here: https://github.com/KhronosGroup/glTF/issues/1051#issuecomment-1744560817

The glTF standard does not currently endorse any particular way to define unique identifiers. There are no UIDs; nothing beyond names is provided to identify glTF objects. This can make it tricky for applications to deterministically keep track of objects in glTF files. The problem is not isolated to game engines; it also affects glTFX (formerly glXF) files.

The problem:

You import a glTF scene into a game engine with node "MyNode".
In-engine, you alter this node, such as by adding children, changing the materials, etc.
In your modeling application, you rename to "OtherNode", or reparent to "Parent/MyNode", and re-export a glTF file.
When the game engine imports this again, it will look for "MyNode" but not find it, so it will not be able to tell where to put the added children or custom materials, so they will be discarded, and they will have to be applied again.
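
The failure described in these steps can be sketched in a few lines of Python. This is only an illustration under assumptions: the `overrides` dictionary, the path format, and the helper names are hypothetical and do not describe how any particular engine stores its data.

```python
# Hypothetical engine-side overrides, keyed by the node path seen at first import.
overrides = {"MyNode": {"material": "EngineSpecificMaterial"}}

def node_paths(gltf):
    """Return the set of slash-separated node paths in a parsed glTF dict."""
    nodes = gltf.get("nodes", [])
    children = {c for n in nodes for c in n.get("children", [])}
    paths = set()

    def walk(index, prefix):
        path = prefix + nodes[index].get("name", f"node{index}")
        paths.add(path)
        for child in nodes[index].get("children", []):
            walk(child, path + "/")

    for i in range(len(nodes)):
        if i not in children:  # start from root nodes only
            walk(i, "")
    return paths

def lost_overrides(gltf, overrides):
    """Overrides whose target path no longer exists after re-import."""
    return sorted(set(overrides) - node_paths(gltf))

first_export = {"nodes": [{"name": "MyNode", "mesh": 0}]}
renamed = {"nodes": [{"name": "OtherNode", "mesh": 0}]}
reparented = {
    "nodes": [{"name": "Parent", "children": [1]}, {"name": "MyNode", "mesh": 0}]
}

print(lost_overrides(first_export, overrides))  # nothing lost
print(lost_overrides(renamed, overrides))       # "MyNode" can no longer be found
print(lost_overrides(reparented, overrides))    # the path is now "Parent/MyNode"
```

With only names or paths to go by, both the rename and the reparent leave the override without a target.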

Some options for solutions: (EDIT: Removed number 2)

Recommend using node names as unique identifiers, and do not add a UID property. This is in line with what glTFX already does: it can reference nodes in a glTF scene by unique name.
- Minor note: Godot already enforces unique node names, but Godot also needs the path to match for a node to be considered the same; it doesn't follow a node's name around the tree.
- Implication: This would mean that the display name of a node must not change.

"nodes": [
    {
        "name": "Block", // No other node may have this name, and it is expected to not change.
        "mesh": 0
    }
]

Create an extension KHR_unique_id that has one property: "uid" (and for glTFX, a way to refer to these).

"nodes": [
    {
        "name": "Block",
        "mesh": 0,
        "extensions": {
            "KHR_unique_id": {
                "uid": "ab2e0958-b67d-4f28-8f6f-22e41c23a4cc"
            }
        }
    }
]

Add a "uid" property to the base "core glTF" spec itself. This is probably not the preferred solution given that the glTF spec is pretty much frozen, but if we did add this, it wouldn't break either forward or backward compatibility because it is optional, so it could be done in a hypothetical glTF 2.1 (no need to wait for a hypothetical glTF 3.0).
EDIT: Another option, currently my favorite. This is a combination of ideas 1 and 3. Basically, we have an extension that enforces the constraint that glTF files have unique identifiers. By default, use the name as the unique identifier (idea 1), but optionally we can supply a separate UID (idea 3). This allows retroactively adding UIDs to files that didn't start with one (ex: node "A", renamed to "B" with UID "A", then it can track that as the same node).
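
A minimal Python sketch of how a loader might read these identifiers, assuming the `KHR_unique_id` layout from the JSON example above. The extension is only a proposal in this thread, and the helper names `node_uid` and `effective_id` are made up for illustration. The fallback to the name follows the combined option.

```python
import json

# "KHR_unique_id" is the extension name proposed in this thread; it is not a ratified extension.
UID_EXT = "KHR_unique_id"

GLTF_TEXT = """
{
  "asset": {"version": "2.0"},
  "nodes": [
    {
      "name": "Block",
      "mesh": 0,
      "extensions": {
        "KHR_unique_id": {"uid": "ab2e0958-b67d-4f28-8f6f-22e41c23a4cc"}
      }
    },
    {"name": "Plain"}
  ]
}
"""

def node_uid(node):
    """Return the explicit uid of a node, or None if the extension is absent."""
    return node.get("extensions", {}).get(UID_EXT, {}).get("uid")

def effective_id(node):
    """Combined option: explicit uid when present, otherwise the node name."""
    uid = node_uid(node)
    return uid if uid is not None else node.get("name")

gltf = json.loads(GLTF_TEXT)
for node in gltf["nodes"]:
    print(node["name"], "->", effective_id(node))

# Retroactive tracking: node "A" is renamed to "B" and keeps "A" as its uid.
before = {"name": "A"}
after = {"name": "B", "extensions": {UID_EXT: {"uid": "A"}}}
print(effective_id(before) == effective_id(after))
```

Under this reading, a file that never had UIDs keeps working by name, and a later rename only requires the exporter to write the old name as the `uid`.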

Regardless of which option is chosen:

We should define the scope of what glTF intends to support.
- Is pointing to a node or resource inside of a glTF something we want to support?
- Is keeping track of unique identifiers something we want to have in glTF? Why isn't the name suitable?
- Is it acceptable that changing a node name or path can break applications trying to reference that node?
- Is it sensible for applications to "look for" particular parts of a glTF file which it is "tailored" to have, or does this relationship create a new problem of "My specific game/engine knows what to do with this specific model, when it contains a node that has this specific ID"? (see here)
- Is modifying a glTF scene after import in a game engine considered an important use case? (I think yes) (see here)
We must note that using this field is optional. This field is only to be used when the content creation program has its own UID system and therefore exporting UIDs can be done deterministically. We don't want exporters to generate random UIDs, as that would defeat the whole point of UIDs.
We should note that UIDs are not guaranteed to be unique between different glTF files, and in fact for the UIDs to function as expected they should keep the same UIDs between different versions of the same file.
We should ensure the glTFX format is able to refer to nodes using this UID (in addition to the glTFX itself being able to define UIDs for its nodes in case anything wants to reference the contents of a glTFX).
We should note that UIDs are specifically only useful when the loader already knows what to expect when loading, such as in a game engine or in a glTFX file. UIDs are completely useless for dynamically loading arbitrary content at runtime in most applications which are not looking for UIDs.
We must work toward solving all of these questions in good faith.
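
Since UIDs are not guaranteed to be unique between different glTF files, a consumer would probably have to scope every lookup by file. The sketch below shows one way to do that in Python. The `KHR_unique_id` layout, the file keys, and the assumption that a uid should appear at most once within a single file are illustrative and not settled anywhere in this discussion.

```python
from collections import Counter

UID_EXT = "KHR_unique_id"  # name proposed in this thread, used here as an assumption

def node_uids(gltf):
    """List the uids found on the nodes of one parsed glTF dict."""
    uids = []
    for node in gltf.get("nodes", []):
        uid = node.get("extensions", {}).get(UID_EXT, {}).get("uid")
        if uid is not None:
            uids.append(uid)
    return uids

def duplicate_uids(gltf):
    """Return uids that appear on more than one node of the same file."""
    counts = Counter(node_uids(gltf))
    return sorted(uid for uid, count in counts.items() if count > 1)

def build_index(files):
    """Map (file key, uid) to node index, so equal uids in different files stay apart."""
    index = {}
    for file_key, gltf in files.items():
        for i, node in enumerate(gltf.get("nodes", [])):
            uid = node.get("extensions", {}).get(UID_EXT, {}).get("uid")
            if uid is not None:
                index[(file_key, uid)] = i
    return index

def node_with_uid(uid):
    """Hypothetical helper that builds a node carrying only a uid."""
    return {"extensions": {UID_EXT: {"uid": uid}}}

files = {
    "tree.gltf": {"nodes": [node_with_uid("n1"), node_with_uid("n2")]},
    "rock.gltf": {"nodes": [node_with_uid("n1")]},
}
print(build_index(files))
print(duplicate_uids({"nodes": [node_with_uid("x"), node_with_uid("x")]}))
```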

Oct 16 '23 18:10 aaronfranke

Sound like a great idea

Oct 16 '23 19:10 apostrophedottilde

This would be a huge leap forward for 3D workflows in Godot

Oct 16 '23 19:10 slumberface

There is already a generic way of specifying metadata everywhere in a glTF. It is a little cumbersome, but it does technically have this capability. It is via this already approved extension: https://github.com/KhronosGroup/glTF/blob/main/extensions/2.0/Khronos/KHR_xmp_json_ld/README.md

What you would do is create a bunch of xml packages at the top level, each one containing metadata, and then reference them from each node or material.

Would that be possible? The idea was to avoid extensions that just add metadata and unify them into this single extension.

Oct 16 '23 19:10 bhouston

@bhouston This would be a single string per node, XML is vastly overkill for this.

(Also note: I have a Godot implementation of KHR_xmp_json_ld, but I can't say that I'm a fan of the spec...)

Oct 16 '23 19:10 aaronfranke

I think this may also be relevant: https://github.com/KhronosGroup/glTF/blob/main/specification/2.0/ObjectModel.adoc It's more about "runtime properties" but since the goal is "identifying asset properties" there may be overlap.

Oct 16 '23 19:10 hybridherbst

@hybridherbst wrote:

I think this may also be relevant: https://github.com/KhronosGroup/glTF/blob/main/specification/2.0/ObjectModel.adoc

Unfortunately it doesn't add any new identifiers such as uuid.

@aaronfranke wrote:

@bhouston This would be a single string per node, XML is vastly overkill for this.

This doesn't involve XML, rather it uses XMP. This spec defines all metadata at the top level and then you reference it from the resource you want to use it from. This allows you to reference duplicated metadata on multiple nodes if you wanted to. In this case you want unique metadata per node, but it still supports that use case as well. This isn't that hard to use.

At the top level you would define your UUIDs:

"extensions": {
  "KHR_xmp_json_ld": {
    "packets": [
      {
        "@id": ""
      },
      {
        "@id": ""
      },
      {
        "@id": ""
      }
    ]
  }
}

And then in each node, mesh, texture or material or scene you would do:

"meshes": [
  {
    [...rest of mesh definition...]
    "extensions": {
      "KHR_xmp_json_ld": {
        "packet": 0
      }
    }
  },
  {
    [...rest of mesh definition...]
    "extensions": {
      "KHR_xmp_json_ld": {
        "packet": 1
      }
    }
  }
]

Oct 16 '23 19:10 bhouston

@aaronfranke do you understand the example usage I have outlined above using the existing extension? It isn't that much work to do. I think it would work perfectly for your use case, no? I bet you could write support for it in an hour or less.

EDIT: I think that maybe the "dc:identifier" may be a better top level name than "@id"?

https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/elements/1.1/identifier

Oct 16 '23 19:10 bhouston

I really think a dedicated extension is probably better for this, to better encourage other 3D DCCs to implement this functionality.

Oct 16 '23 20:10 reduz

It MUST be integrated into Godot and I fully support this!

Oct 16 '23 20:10 SilverWolveGames

A unique identifier should be present in absolutely everything that deals with transfer and storage of data, and it has to be a first-class property, not an afterthought. Yes, KHR_xmp_json_ld could work, but it's non-intuitive: the "packet" number is the index of the "packet" in an ordered list, which holds a JSON object with arbitrary names and values. It's an add-on, which means that you are adding friction between the "need" and the "solution", since it is not readily available, and lastly

warning: pseudocode/python ahead.

# Assumes `gltf` is the parsed glTF JSON (a dict).
packets = gltf["extensions"]["KHR_xmp_json_ld"]["packets"]

for mesh in gltf["meshes"]:
    index = mesh["extensions"]["KHR_xmp_json_ld"]["packet"]
    print(f'mesh id = {packets[index]["dc:identifier"]}')

I will be 100% honest here: this is not an acceptable way to access a unique identifier. What does "dc:*" even mean? It adds unnecessary clutter, and the way to grab the information is noisy and convoluted. This is not reasonable in any situation.

Compare with this:

for mesh in gltf["meshes"]:
    print(f'mesh id = {mesh.get("uid")}')

Oct 16 '23 20:10 Scoppio

I also support the idea of this becoming its own extension. It would be awkward that tool A exported uids via some raw metadata field and then tool B checked precisely those fields that happen to have a very specific meaning. It would be like having a de facto extension, only not specified.

Oct 16 '23 20:10 RandomShaper

Right, I mean, this is something that once defined every game engine and a lot of DCCs will be happy to implement. It is precisely why it should be standardized.

Oct 16 '23 20:10 reduz

Yes please!

Oct 17 '23 03:10 FedTheCat

Yes!

Oct 17 '23 13:10 mrussogit

The first comment links to an issue that already contains some discussion, and it summarizes some of the discussion points, clarifications, and open questions from there. And judging from the high number of 👍 's (and comments, even though many of them do not go beyond the meaning of a 👍 ), there seems to be a high demand for this feature. But I think that it is really important to be clear about the scope, intended use, and behavior of such an extension. Therefore, I will ask "Devil's advocate" questions (again). I'm asking these with the goal of fleshing out what could eventually become the 'Introduction' section of an extension specification, and make sure that everybody is on the same page (and I hope that there will not be toooo many people accusing me of being stupid and/or dismissive for asking these questions...)

Referring to the high-level use-case description:

You import a glTF scene into a game engine with node "MyNode".

In-engine, you alter this node, such as by adding children, changing the materials, etc.

In your modeling application, you rename to "OtherNode", or reparent to "Parent/MyNode", and re-export a glTF file.

When the game engine imports this again, it will look for "MyNode" but not find it, so it will not be able to tell where to put the added children or custom materials, so they will be discarded, and they will have to be applied again.

This sounds like something that is mainly (or only?) relevant during the authoring phase of an asset.

(NOTE: I'm aware that glTF - even though it is primarily intended as a 'last mile' delivery format - is used in authoring workflows. And I acknowledge that unique identifiers could be useful for supporting the authoring workflow. The following is really about the scope, "management", and intended usage of these IDs).

The assumption that this is mainly relevant during the authoring seems to be confirmed by this point:

We should note that UIDs are specifically only useful when the loader already knows what to expect when loading

A model is imported into an engine. And this model is expected to contain a node ("MyNode") that is supposed to be identified with an ID. Who is responsible for establishing the connection between the IDs that have been assigned by the modeling application and the use of these IDs in the engine, on a technical level? In how far do the authoring application and the engine have to "know each other" (and the IDs that they are assigning and expecting)? Or to put it that way: The description sounds like this only refers to the case where the engine imports a model, and it should be possible for the engine to assume that ...

this is "the same model" that it had imported previously (from the same authoring application?)
the model was not processed in a way that did destroy or modify one, specific ID

Is that correct?

(If it is correct, then some constraints for a possible specification can be derived from that. For example, that no engine may assume the presence of IDs to begin with, and that authoring applications may never modify IDs that are already there. It could be emphasized that the extension/IDs are only useful within that "authoring cycle" of editing/importing models between authoring applications and engines that both support the extension in exactly this way)

Another important question is: What are engines allowed to do based on these IDs?

For example: An engine finds a certain mesh primitive, based on its ID. And it could assign a new material to this mesh primitive. This would mean that the appearance of that glTF asset no longer depends on the glTF asset itself, but on the presence of a certain ID (and the specific engine that is importing it).

The follow-up question would be: Shouldn't it be possible to eventually "bake" these modifications into the asset itself, after the authoring process is 'finished'? (Meaning that at the end, one could also remove all IDs?) Otherwise, glTF may lose some of its portability. Specifically: I wonder which aspects of the current portability of glTF might be endangered by ~"modifications of the asset that are based on the presence of certain IDs"...

Again: I'm not opposed to introducing identifiers (for example, to support common authoring workflows). But it should be made clear which kinds of behaviors and interconnections between authoring applications and engines are expected to (or rather: "allowed to") be established with these IDs.

Oct 17 '23 14:10 javagl

@javagl

What we want is to be able to transfer modifications made to a scene imported from a glTF to another scene imported from a newer version of the glTF.

Ideally we would not create a new scene from the new version of the glTF, instead we would reuse the existing objects, something like this:

First, the glTF scene is imported into the engine. This requires parsing it and building a representation of it with engine-specific types. During this process the uid would be stored in association with said representation (either as part of it, or in an auxiliary data structure).
Second, the user would make modifications to the imported scene, such as adding engine-specific materials (i.e. non-PBR materials made in engine), adding physics, composing multiple scenes (including bone attachments), and other engine-specific modifications. These modifications are not done to the glTF, but to the objects that were created from it. Since these modifications are engine-specific, it does not make sense to author them in another software or to transmit them from software to software.
Third, a new version of the glTF is imported. This will also be parsed, but instead of creating new objects we want to reuse the existing ones with their engine-specific modifications. For this, the uids make it possible to identify nodes even if they have been renamed or reparented.
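
The three steps above can be sketched in Python as follows. This is a rough illustration under assumptions: `EngineNode`, `import_gltf`, and `with_uid` are hypothetical names, the `KHR_unique_id` layout is the one proposed earlier in the thread, and a real engine would handle hierarchy, nodes without a uid, and removed nodes as well.

```python
UID_EXT = "KHR_unique_id"  # extension name proposed in this thread, not a ratified extension

def uid_of(node):
    """Return the uid of a glTF node dict, or None if it has none."""
    return node.get("extensions", {}).get(UID_EXT, {}).get("uid")

class EngineNode:
    """Hypothetical engine-side object built from a glTF node."""
    def __init__(self, uid, name):
        self.uid = uid
        self.name = name
        self.engine_data = {}  # engine-specific modifications live here, not in the glTF

def import_gltf(gltf, existing=None):
    """Build engine nodes, reusing objects from `existing` when uids match."""
    existing = existing or {}
    result = {}
    for node in gltf.get("nodes", []):
        uid = uid_of(node)
        if uid is None:
            continue  # nodes without a uid are out of scope for this sketch
        obj = existing.get(uid)
        if obj is None:
            obj = EngineNode(uid, node.get("name", ""))
        else:
            obj.name = node.get("name", obj.name)  # pick up the rename, keep engine_data
        result[uid] = obj
    return result

def with_uid(name, uid):
    """Hypothetical helper that builds a named node carrying a uid."""
    return {"name": name, "extensions": {UID_EXT: {"uid": uid}}}

# First: import version 1 of the glTF.
scene = import_gltf({"nodes": [with_uid("MyNode", "u-1")]})
# Second: the user attaches engine-specific data to the imported object.
scene["u-1"].engine_data["material"] = "EngineSpecificMaterial"
# Third: import version 2, in which the node was renamed.
scene = import_gltf({"nodes": [with_uid("OtherNode", "u-1")]}, existing=scene)

print(scene["u-1"].name, scene["u-1"].engine_data)
```

The object keeps its engine-specific data across the rename because the match is made on the uid and never on the name.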

Assume the glTF is authored by a different person than the one manipulating it in the engine. While this is not necessarily true, these are different skill sets, so it is common for the two tasks to be done by different people.

I hope this makes sense.

Oct 17 '23 15:10 theraot

@javagl I agree this is only useful for authoring. But reality is that GLTF is used hugely for this. It is a very fast format to export (due to its binary nature) and reimport into an engine.

Most game engines do not use GLTF natively though, they only use it for opening the assets, then use their own formats to export. This is the case of Godot and pretty much any other engine I can think of.

My feeling is that, if you are making a game engine and you want to actually ship the very same GLTFs, then it's up to you to clean it up.

GLTF already provides a lot of information that may be redundant (for example, names), so it's up to the engine to do this clean-up process when shipping.

Oct 17 '23 15:10 reduz

@theraot and @reduz

That description sounds reasonable, and seems to be in line with what someone (maybe even one of you, I'd have to look it up) said in the linked discussion. Roughly: The glTF itself is used only as a basis for building the "engine-specific asset". And this may even be stored in an engine-specific asset (file) format, which only refers to the glTF as its input. In this case, of course, any engine-specific additions have to know which element of the glTF they refer to.

One specific example (just to get an idea of whether I got the intention right): There might be a glTF asset with some avatar/character, and it has identified elements like 'head, torso, armL, armR, legL, legR'. The engine-specific asset uses this as the basis for building some ~"physics computation structures" (say, for some ragdoll effect, or something else that cannot be modeled with glTF animations).

In this case, the base glTF asset would not even be intended to be "portable" any more (not beyond showing that avatar in the T-pose in which it was originally authored). It might therefore be that some of my concerns are negligible in the "real world".

But I'd still wonder how to avoid "obscure" usages of these IDs. I know, this is kind of a "worst case" scenario, but people could just throw a bunch of meshes (that are not attached to nodes) or materials (that are not associated to meshes) or just empty nodes that only contain IDs (!) into an asset, and say: Yeah, my engine knows how to assemble something that makes sense from these fragments, based on their IDs.

Right now, I can drag-and-drop any GLB file into Three.js, Babylon.js, Filament, PlayCanvas, Cesium.js, the Khronos glTF Viewer, ClayGL, Hilo3d, RedCube.js, and any other glTF viewer, and they will all behave the same way, and show the same result ... mostly: Achieving portability is an important goal of glTF, ranging from the lowest level of mathematical details of the specification of PBR, up to the question of what the (visual) "ground truth" actually is (see for example https://modelviewer.dev/fidelity/ ).

We should be careful to keep these goals in mind, and in the context of a possible specification of 'unique identifiers', we should clearly describe their intended use (with the focus on the "authoring cycle") and the limits (what they should not be used for, to ensure that glTF keeps its portability).

Oct 17 '23 15:10 javagl

I think one aspect here is that unique IDs in glTF would allow for one – and only one – workflow to be better:

Going from Application A (with its own internal data format)
to glTF
to Application B (with its own internal data format)

in a repeatable way while renaming/moving nodes around.

Going from Application B to Application C would have likely different unique IDs since internal data formats do not necessarily match the glTF data model (e.g. some engines have submeshes - aka multiple primitives per mesh - while others don't). In practice, importing and exporting glTF is almost never a perfect roundtrip due to these engine-specific differences. So IDs would be useful for importing and might change when exporting again unless you expect every application to somehow keep the IDs and have perfect roundtrips, ignoring their own data format.

Oct 17 '23 16:10 hybridherbst

@javagl

It is already possible to store arbitrary data in glTF. Take for example the suggested alternatives to this proposal: extras and KHR_xmp_json_ld. We could have an authoring tool and game engine combo that use glTFs that consist of empty nodes with extras or KHR_xmp_json_ld which would be unusable in any other software.

However, in practice, we do not see this. I believe the incentives are either non-existent or negligible. The general availability of glTF viewers would mitigate such attempts, as users expect glTF to be portable across software and platforms, and not being able to view a glTF in a general purpose glTF viewer would suggest that something is wrong with it.

Beyond that, if there is a solution to prevent misuse of extras or KHR_xmp_json_ld (of which I'm unaware), it probably applies to this proposal too. And I think it would be in everybody's interest to have a look at it.

@hybridherbst

The idea of a roundtrip is alluring. However it is not the goal, and would not always be possible.

While we would be interested in tools generating consistent uids for different versions of the same glTF, no tool would be required to do that, as the uids would be optional, and even generating random uids would also result in valid glTFs.

Similarly, no tool should be required to preserve uids from imported glTF when exporting them. While preserving them could be useful for some workflows, it would not always be possible (e.g. in case of importing multiple glTFs that have colliding uids and exporting them as a single glTF).

However, as game engines take advantage of uids, authoring tools that have game developers in their target audience are likely to follow. This is because game developers would rather use tools that generate consistent uids, and would request such feature.

On the topic of adoption by tools... As far as I know, extras and KHR_xmp_json_ld lack semantics. Software could follow the robustness principle and look for a uid there in case some other software outputs it that way, but we would still want a recommendation of how to output uids. Without this, it would be hard to get the maintainers of existing engines and authoring tools to use uids… And if we managed to get them to use uids over extras or KHR_xmp_json_ld, it would be akin to a secret society handshake.

Oct 17 '23 16:10 theraot

In practice, importing and exporting glTF is almost never a perfect roundtrip due to these engine-specific differences. So IDs would be useful for importing and might change when exporting again unless you expect every application to somehow keep the IDs and have perfect roundtrips, ignoring their own data format.

The term "roundtrip" also came up in the earlier discussion (and it was said that this was not the primary goal). But in view of the alternatives that have been mentioned here (name, extras, and KHR_xmp_json_ld), one question for a specification of an extension could be:

In how far would these IDs go beyond what an authoring application and an engine could achieve with a bilateral agreement?

The application and the engine could just agree to store the ID, as a string, in the name property. And if the IDs should go beyond that, then I think that "going beyond that" does exactly mean that there are constraints, on the level of the specification, clearly stating the expected behavior for generating and consuming these IDs, in different workflows.

For example, the specification could require a roundtrip capability for the most simple case (that does not involve editing), by stating: "Importing (then not modifying) and exporting an asset MUST keep the original IDs". Of course, there are many cases to consider: What if multiple assets with conflicting IDs are merged, and the result is exported again? All this could go down into very nitty-gritty details, e.g.: What if an authoring application just swaps two nodes? Do they keep their IDs (because they are still the same nodes), or do they have to receive new IDs (because they have new parents)?

I think that some/many of these constraints could only sensibly refer to authoring applications in particular. For example, one could expect that Blender builds sophisticated (authoring-oriented) structures that allow it to keep track of all IDs that have been read from the input. In contrast to that, a glTF loader library in a game engine might have methods to read glTF into the engine-specific 'model' object, or write such a 'model' as a glTF, but it might perform operations (e.g. optimizations, like dropping "unnecessary" data, like the IDs themselves, unused materials, or nodes in node chains) that make it impossible to reconstruct the original IDs.
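
If a specification did include a rule like the quoted "MUST keep the original IDs", a conformance check for the simple no-edit roundtrip could look roughly like this Python sketch. The `KHR_unique_id` layout is the one proposed in this thread, and `roundtrip_report` is a hypothetical helper, not part of any existing validator.

```python
UID_EXT = "KHR_unique_id"  # proposed name from this thread, used here as an assumption

def uid_set(gltf):
    """Collect the uids present on the nodes of a parsed glTF dict."""
    return {
        node["extensions"][UID_EXT]["uid"]
        for node in gltf.get("nodes", [])
        if "uid" in node.get("extensions", {}).get(UID_EXT, {})
    }

def roundtrip_report(original, exported):
    """Compare the uids before import and after export."""
    before, after = uid_set(original), uid_set(exported)
    return {
        "kept": sorted(before & after),
        "lost": sorted(before - after),
        "new": sorted(after - before),
    }

def with_uid(uid):
    """Hypothetical helper that builds a node carrying only a uid."""
    return {"extensions": {UID_EXT: {"uid": uid}}}

original = {"nodes": [with_uid("u-1"), with_uid("u-2")]}
exported = {"nodes": [with_uid("u-1"), with_uid("u-3")]}
print(roundtrip_report(original, exported))
```

A loader that drops IDs during optimization would show up here with a non-empty "lost" list.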

Oct 17 '23 17:10 javagl

@bhouston I understand perfectly what you mean, but I completely disagree. It's not a helpful layer of abstraction. Using KHR_xmp_json_ld will just increase complexity and file size for absolutely zero benefit. The whole point of putting data in top-level arrays is sharing data, but UIDs are unique, so they can never be shared. What specifically is the problem you are trying to solve here (see this chart)? The only argument in favor of using KHR_xmp_json_ld is a misguided belief that we need to unify all data under it in one format. But we already have a unified data format, it's called JSON, we can store metadata without extensions in "extras".

The follow-up question would be: Shouldn't it be possible to eventually "bake" these modifications into the asset itself, after the authoring process is 'finished'?

This is not always possible. For example, setting a material may use an engine-specific material type that has no way to be saved to glTF. Or, you may attach a script written in that game engine's programming language using that game engine's API, which can never be fully portable to glTF.

Roughly: The glTF itself is used only as a basis for building the "engine-specific asset". And this may even be stored in an engine-specific asset (file) format, which only refers to the glTF as its input. In this case, of course, any engine-specific additions have to know which element of the glTF they refer to.

Yes, this is precisely the idea. Well, ideally, the glTF is a very large part of the asset, like 90% or more. This way when you need to have some data specified in your engine-specific format, you can have 90% or more of your asset be a portable glTF, instead of 0% (with all data being stored in the engine-specific format).

In this case, the base glTF asset would not even be intended to be "portable" any more

Not necessarily. The base asset could be portable, but it would just be missing whatever functionality you added in-engine. For example, when making an avatar for VRChat, you start with the base model with a mesh, skeleton, materials, etc, and may replace materials, add new functionality (like spring bones for dynamic hair/tails/ears/etc), etc. The base model is still fairly portable, it can still be moved to other apps, used with its skeleton etc, but it will just be missing the final last-stage tweaks (like the hair will be static relative to the head).

Of course ideally we should continue building standards to allow specifying more and more of the data in the glTF file itself (for example, the VRM consortium has a glTF extension for spring bones), but there will always be application-specific needs that go beyond what's standardized.

Oct 17 '23 17:10 aaronfranke

@javagl

The application and the engine could just agree to store the ID, as a string, in the name property.

To quote JonathanDotCel's https://github.com/KhronosGroup/glTF/issues/1051#issuecomment-1746519040:

names are for people and GUIDSs are for machines.

Since the engine builds and shows a very close representation of the glTF structure, we have the expectation that the name carries over. If it is the name that is held unique and unchanging, we want a display name property, so we would be talking about adding a new property anyway.

What if an authoring application just swaps two nodes? Do they keep their IDs (because they are still the same nodes), or do they have to receive new IDs (because they have new parents)?

If they have uids, I'd expect them to keep their uids. If they were to receive new uids because being reparented that would defeat the purpose.

Yes, there would be workflows. The following is what comes to mind:

Ignoring the uids on import and not outputting uids would be equivalent to what already exists.

Authoring tools could also create new uids for new nodes, keep track of all of them, and persist them in their own authoring format, so they can be consistent across multiple exports to glTF.

The authoring tools could also preserve uids of imported glTF. And here we run into possible conflicts (see below).

Scrapping the uids on export is also OK. For example, when exporting the final game, as stated.

However, game engines could also keep track of uids to be able to match them among multiple versions of the same glTF.

The following are my ideas of what to do with conflicts:

Under the premise that exported uids are not required to match the imported uids, the authoring tool can replace conflicting uids with new ones. Doing this without losing track of the original uids suggests that the new uids should be derived from the original uids plus some reference to the source glTF. Two strategies come to mind:

If arbitrary strings are allowed, the uids could be prefixed with an id of the original glTF.
Otherwise, the new uids could be derived using a hash based algorithm.

In either case, it seems that some id for glTFs as a whole would be useful to solve conflicts. I'm inclined to believe that authoring software could use (a hash of) the (relative) file path for this purpose. I'm suggesting a hash here so as not to embed potentially confidential paths, and I'm suggesting relative paths to allow moving the files of the project as a whole without breaking these ids.
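
Both strategies can be sketched with Python's standard `hashlib`. The choice of SHA-256, the truncation lengths, and the ":" separator are arbitrary assumptions for illustration, and the function names are hypothetical.

```python
import hashlib

def source_id(relative_path):
    """Short id for a source glTF, derived from a hash of its relative path."""
    normalized = relative_path.replace("\\", "/")  # same id on all platforms
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]

def prefixed_uid(relative_path, uid):
    """Strategy 1: prefix the original uid with the id of the source file."""
    return f"{source_id(relative_path)}:{uid}"

def hashed_uid(relative_path, uid):
    """Strategy 2: derive a fixed-length uid from the source id and the original uid."""
    data = f"{source_id(relative_path)}\n{uid}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()[:32]

# Two source files that both contain a node with uid "n1".
a = prefixed_uid("models/tree.gltf", "n1")
b = prefixed_uid("models/rock.gltf", "n1")
print(a, b)
print(hashed_uid("models/tree.gltf", "n1"))
```

Both functions are deterministic, so repeated exports of the same merged project would produce the same derived uids as long as the relative paths do not change.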

Oct 17 '23 18:10 theraot

Shouldn't it be possible to eventually "bake" these modifications into the asset itself, after the authoring process is 'finished'?

This is not always possible. For example, setting a material may use an engine-specific material type that has no way to be saved to glTF.

One could consider to explicitly recommend to not use the IDs for a purpose that can be achieved with pure glTF. For example, they could be used to assign "physical material properties" to meshes, but should probably not be used to model something like parent-child relations between nodes (or anything else that can already be represented as pure glTF).

It could be hard to phrase that precisely. It can hardly be a strict requirement, because the features of glTF will be extended with ... other extensions. But it might be something on the level of a hint about the scope of the extension, like a "best practice", or an "Implementation Note"...

names are for people and GUIDSs are for machines.

Since the engine builds and shows a very close representation of the glTF structure, we have the expectation that the name carries over.

The comment about using the name was to emphasize that an extension specification will raise tricky questions (and some of them have already been mentioned in the meantime). And it will be necessary to have a clear idea about the behavior of the IDs in these cases, considering that this is supposed to not only be a mutual agreement between two parties. It has to stand the test of time, across many applications, and throughout different workflows.

What if an authoring application just swaps two nodes? Do they keep their IDs (because they are still the same nodes), or do they have to receive new IDs (because they have new parents)?

If they have uids, I'd expect them to keep their uids. If they were to receive new uids because being reparented that would defeat the purpose.

Yes, it would defeat the specific purpose of 'identifying a node regardless of its parent'.

I could now ask further (overly specific) questions: Will a node receive a new ID when a child node is attached? Will it receive a new ID when a mesh is assigned to or removed from it? Will a mesh primitive receive a new ID when its material is changed? But the obvious generalization of these questions is:

Which editing operations in an authoring application MAY/MUST cause the IDs to change, and which ones MAY/MUST NOT affect the ID?

(More technically, this could be seen as a question about the concept of 'equality'. Or more philosophically, as an instance of the thought experiment of the 'Ship Of Theseus'...)

For each answer, one could come up with scenarios. For example: If the ID was intended to identify a node that is expected to contain a mesh (let's say a node with a mesh for which physics computations should be performed), then removing the mesh will make the node "invalid" for that purpose. Iff something like this was an intended use case, then I'd be curious about the behavior that is expected from an authoring application and an engine in such a case.

(Note: These questions are not meant to dismiss the idea. And the answers to these questions may very well be "This is not relevant", or start with the usual "That depends...". They are intended to get a clearer idea about what could (reasonably) be specified for such an extension, beyond the JSON schema that says that there is some extension object with a uid: string property).

Oct 17 '23 21:10 javagl

Will a node receive a new ID when a child node is attached?

No, that would defeat the point of UIDs.

Will it receive a new ID when a mesh is assigned to or removed from it?

No. In editors where this is possible, the UID should not change because it's the same node. But also, note that in many apps like in Blender or Godot, creating a new mesh requires creating a new node.

For example: If the ID was intended to identify a node that is expected to contain a mesh (let's say a node with a mesh for which physics computations should be performed), then removing the mesh will make the node "invalid" for that purpose.

Yes, but if you are looking for a mesh and that mesh is gone, there is no possible configuration that would allow mesh modifications to be re-applied. So this is expected. Also, this scenario won't occur with Blender or Godot, because the only way to add/remove meshes to nodes is to add/remove the nodes and have them be of type mesh.

Will a mesh primitive receive a new ID when its material is changed?

No, that would defeat the point of UIDs.

Which editing operations in an authoring application MAY/MUST cause the IDs to change, and which ones MAY/MUST NOT affect the ID?

Creating a node MAY give it a UID. Or, if there's an existing file without UIDs, one may wish to add them afterwards. Once a node has a UID, it MAY be stripped for the "final product", but it MUST NOT be changed or replaced with any other editing operation. There are no valid editing operations that will result in the UID property automatically changing from one UID to another UID. If a node has a UID, it must never have any other UID.
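As a rough sketch of these rules (assuming the `KHR_unique_id` shape proposed at the top of this thread; the function names are hypothetical and this is not any tool's actual behavior), an exporter could add missing UIDs without ever touching existing ones, and optionally strip them for a final build:

```python
import uuid

EXT = "KHR_unique_id"  # proposed in this thread; not a ratified extension


def assign_missing_uids(gltf):
    """Give a uid to nodes that lack one; never touch an existing uid."""
    for node in gltf.get("nodes", []):
        ext = node.setdefault("extensions", {}).setdefault(EXT, {})
        ext.setdefault("uid", str(uuid.uuid4()))


def strip_uids(gltf):
    """Remove all node uids, e.g. for a final shipping build."""
    for node in gltf.get("nodes", []):
        extensions = node.get("extensions", {})
        extensions.pop(EXT, None)
        if not extensions:
            # Drop the now-empty "extensions" object, if there was one.
            node.pop("extensions", None)


gltf = {
    "nodes": [
        {
            "name": "Block",
            "extensions": {EXT: {"uid": "ab2e0958-b67d-4f28-8f6f-22e41c23a4cc"}},
        },
        {"name": "NewNode"},  # existing file content without a uid
    ]
}

assign_missing_uids(gltf)
first_pass = [n["extensions"][EXT]["uid"] for n in gltf["nodes"]]
assign_missing_uids(gltf)  # running again must not change anything
second_pass = [n["extensions"][EXT]["uid"] for n in gltf["nodes"]]
```

A real implementation would also have to keep `extensionsUsed` in sync, which this sketch leaves out.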

Oct 17 '23 21:10 aaronfranke

Just for clarification,

There are no valid editing operations that will result in the UID property automatically changing from one UID to another UID. If a node has a UID, it must never have any other UID.

You do mean: in the same file in the same application, right? "Exporting the GLB and importing it again in the same software" may already result in new UIDs due to internal format differences or collisions (things that aren't nodes are turned into nodes on export and vice versa). "Exporting the GLB and importing it elsewhere and exporting it again" may also result in new UIDs.

Oct 18 '23 09:10 hybridherbst

Once a node has a UID, it MAY be stripped for the "final product", but it MUST NOT be changed or replaced with any other editing operation.

That's the core of the answer that my questions aimed at. And it is the strictest requirement that can be imposed here (and that could go into the specification accordingly).

But I'm a stickler, and will ask a few more questions. I think that this is important for the long(er) term goal of a robust technical specification. Some of these questions might seem to be hypothetical (for example, because they are not applicable for one specific authoring tool - even though they may be applicable to another one). If someone thinks that this is not important, then these questions can be ignored.

There seems to be a level where we could talk about the vocabulary:

...the only way to add/remove meshes to nodes is to add/remove the nodes and have them be of type mesh.

A "node of type 'mesh'" is something that doesn't translate well to glTF. When I'm talking about a 'node', then this refers to a glTF node, as something that determines the hierarchical structure of the scene. This might be mapped to the concepts of authoring applications in different ways.

Regarding the question of 'removing a mesh from a node':

... removing the mesh will make the node "invalid"

Yes, but if you are looking for a mesh and that mesh is gone, there is no possible configuration that would allow mesh modifications to be re-applied.

One caveat here is that a 'mesh' in glTF cannot sensibly receive an ID. Of course, the mesh object itself can receive an ID, but meshes in glTF are basically instantiated when a node refers to a mesh. This raises some questions. Imagine the ID is supposed to be used to identify a mesh for which 'physics' computations should be performed (like building some collision detection data structure or whatnot). The mesh could be identified via its ID. But the authoring application could remove this mesh from a node, and assign it to a different node, or even to multiple different nodes. Should the physics data structures now be created for each instance of that mesh?

The question about 'removing a mesh from a node' could therefore be phrased in a more abstract and generic way: Who is responsible for ensuring that structural modifications (that keep existing IDs) do not interfere with the purpose of the identified element?

To emphasize this again: There might be 'simple' answers to that, like "This is not relevant for the specification", or "It's the responsibility of the person who edits the model and who has to know which modifications are allowed". And that's perfectly fine. On this level, one could say that I just want to know whether there is such a 'simple' answer or not...

(And an aside: If someone is supposed to actually implement the handling of glTF IDs in an authoring application, there will be some questions on a far lower technical level. For example: When you delete a node and press CTRL-Z (undo) - will it be guaranteed to receive the same ID again? It should be, right? What about the sequence CTRL-C-X-V-V (copy, cut, paste, paste) - what will be the IDs of the pasted elements?)

Oct 18 '23 11:10 javagl

A "node of type 'mesh'" is something that doesn't translate well to glTF.

Sure, I'm just mentioning this to argue that the case you mention will not occur in Godot and Blender. But anyway, even in applications where it can occur, it's not a problem: changing the contents of a node should still keep the UID.

One caveat here is that a 'mesh' in glTF cannot sensibly receive an ID. Of course, the mesh object itself can receive an ID, but meshes in glTF are basically instantiated when a node refers to a mesh. ... assign it to a different node, or even to multiple different nodes. ...

In Godot, this is not a problem. Mesh resources are stored in memory. If multiple nodes use the same mesh, then by default they share the same mesh resource. So a single mesh instanced multiple times is still one mesh with one object in memory and one UID (if it has a UID). I suppose this may not be the case in all apps, but most engines have the concept of instancing a mesh. What gives you the impression that glTF meshes "cannot sensibly receive" a UID?

Also, the case you mention about physics - I get what you're saying in a hypothetical sense, but in this particular case it's not a valid example, because glTF physics does not work like that (in both of the competing extensions, a physics shape may use a mesh, but they do not add any data to the mesh resource itself).

Oct 18 '23 23:10 aaronfranke

The point about the uniqueness and instancing of meshes was probably not stated properly. I'll try to be more specific. This attempt to be more specific bears the risk of being too specific, causing the response: "That's not how it is done in engine X". But the behavior and handling of IDs should be consistent across multiple applications, as far as reasonably possible (!), and within the intended use-cases. For an application-independent format, one should be able to either say 1. what the behavior should be, or 2. explicitly (!) say that a certain aspect of the behavior is not specified. (And again: that's fine. I'm trying to find 'the limits of what can be specified' here...).

A glTF file may contain a certain mesh. And this mesh has an ID. The engine knows that it should create, say, collision detection information for this mesh, based on its ID. For example, if the mesh is attached to a node with a translation of (1,2,3), then the engine may create its collision detection information (some BVH or spatial hash) specifically for the mesh at the position (1,2,3). When the mesh is attached to a different node with a different translation (in the authoring application), then the engine will build the collision detection data structure for a different location. That's fine, and one of the explicitly intended use-cases, as far as I understood. Now, when the same mesh (or rather the identical mesh, as of the meaning of 'ID') is attached to 10 nodes, then the engine would create this collision data 10 times. The point is: Is the collision detection info (or whatever should be associated with the ID) specific for the mesh instance (as it is created via the mesh: 123 reference in a node), or is it specific to the mesh itself (regardless of the nodes that it is instantiated in)? (Corollary: If a mesh should appear 2 times, but only one instance should receive collision detection information, then the ID in the mesh cannot be used as a basis for the collision detection data. It has to be based on an ID in the node that refers to (i.e. "instantiates") the mesh.)
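The distinction can be sketched with a small, purely hypothetical example. It assumes the proposed `KHR_unique_id` extension may also be placed on meshes (the example at the top of the thread only shows it on nodes), and the "collision" entries stand in for whatever an engine would associate with an ID:

```python
EXT = "KHR_unique_id"  # proposed in this thread; not a ratified extension


def uid_of(obj):
    """Return the uid of a glTF object, or None if it has none."""
    return obj.get("extensions", {}).get(EXT, {}).get("uid")


def with_uid(obj, uid):
    """Return a copy of obj carrying the proposed extension (illustration only)."""
    return {**obj, "extensions": {EXT: {"uid": uid}}}


# One mesh, instantiated by two nodes at different translations.
gltf = {
    "meshes": [with_uid({"name": "Crate"}, "mesh-crate")],
    "nodes": [
        with_uid({"mesh": 0, "translation": [1, 2, 3]}, "node-a"),
        with_uid({"mesh": 0, "translation": [4, 5, 6]}, "node-b"),
    ],
}

per_mesh = {}      # data keyed by the mesh's uid
per_instance = {}  # data keyed by the uid of the node that instantiates it
for node in gltf["nodes"]:
    if "mesh" not in node:
        continue
    mesh = gltf["meshes"][node["mesh"]]
    per_mesh[uid_of(mesh)] = "collision"      # one entry shared by all instances
    per_instance[uid_of(node)] = "collision"  # one entry per instantiating node

# Giving data to only one of the two instances is only expressible per node.
only_first = {uid: data for uid, data in per_instance.items() if uid == "node-a"}
```

Keyed by the mesh uid there is a single entry that cannot tell the two instances apart; keyed by the node uid there is one entry per instance.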

Oct 19 '23 12:10 javagl

@javagl I think you are making it more complex than it needs to be.

For unique IDs, to me everything should potentially be able to have it. Both instance and mesh, as well as material, texture, animation, etc.

Oct 19 '23 12:10 reduz