godot
godot copied to clipboard
Improve global bone pose calculation in `Skeleton3D`
This change introduces a way to calculate global bone poses without recalculating all bones every time a pose changes. It defines the bone hierarchy as a nested set. This makes it possible to represent bone subtrees continuously within an array. With regard to the introduction of SkeletonModifier3D, this should improve performance when using many modifiers that depend on global poses.
Two fields are added to Skeleton3D::Bone:
nested_set_offset: defines the position within the nested set[^1]nested_set_span: defines the size of the subtree (the bone itself and all descendants)
[^1]: For imported scenes, nested_set_offset seems to be the same as the bone index. Presumably bones are already ordered this way. However, this may be different for manually created skeletons.
This shows the nested set values for a simplified skeleton:
How it works
Skeleton3D::bone_global_pose_dirty is an array containing a flag for each bone indicating whether the global pose needs to be recalculated. It can be indexed with Bone::nested_set_offset, followed by the descendant bones.
When changing the pose of a bone, the dirty flags of its subtree need to be set to indicate that the global poses of the bone and its descendants need to be recalculated. This is done by setting all flags in the range from Bone::nested_set_offset with length Bone::nested_set_span to true. This has some overhead[^2]. But it can be skipped if the bone is already dirty at index Bone::nested_set_offset. Because if a global pose of a bone is dirty, all of its descendants are dirty too.
[^2]: It could be implemented as some kind of bit field structure instead of an array, if it turns out that performance is a problem at that point.
Getting the global pose with get_bone_global_pose
When requesting the global pose of a bone, only the global poses of its parents that are dirty are recalculated and the dirty flag is cleared. In the best case (if the parent global poses have already been calculated) only a few global poses need to be recalculated.
Recalculating global poses with force_update_bone_children_transforms
By looping through Skeleton3D::nested_set_offset_to_bone_index it is possible to update the bone hierarchy without the need for recursion. By checking the dirty flag at the current index, only required bones are updated. The order also ensures, that parent bone poses are calculated before child bone poses.
Benchmarks
| Test | Before | After |
|---|---|---|
| get_global_poses | 12.67ms | 12.61ms |
| set_all_poses | 49.83ms | 51.98ms |
| set_all_poses_reverse | 87.56ms | 89.03ms |
| set_and_get_global_poses_forward | 747.6ms | 27.8ms |
| set_and_get_global_poses_reverse | 788.4ms | 57.37ms |
| set_and_get_some_global_poses | 29.64ms | 15.22ms |
Note: I ran the tests individually. Some values were way too high even on master when run in series.
TODO
- [x] Improve performance of
get_bone_global_pose - [x] Improve performance of
force_update_bone_children_transforms - [x] Provide some performance tests
- [x] Thread safety?
Tested on Animation_test.zip.
- Master: 70 fps
- This PR: 75 fps
As for thread safe (as I understand it you are talking about using the thread_local keyword) it is multi-thread safe. Although I do not think that this can cause bugs, since the multi-threaded scene tree is not implemented.
As for thread safe (as I understand it you are talking about using the thread_local keyword) it is multi-thread safe. Although I do not think that this can cause bugs, since the multi-threaded scene tree is not implemented.
I was wondering if there is a problem now that the logic is more complex. But if the scene tree is not multi-threaded then that's ok I guess.
As far as I know multi-threaded scene tree is implemented. I'm unable to get you the details, but it was done in an earlier release disabled by default.
As far as I know multi-threaded scene tree is implemented.
Right, the implementation is here tho, it would be cool in look in this for skeleton, anim player/tree multithreading improvements. https://github.com/godotengine/godot/pull/75901
Is this ready for review? I noticed it is still in draft stage but all the items are checked.
Is this ready for review? I noticed it is still in draft stage but all the items are checked.
Yes, I forgot.
Fixed conflicts.
Thanks!
@detomon Do you think this can be done to the general 3d node caching system too?
@fire I'm not too familiar with how the Node3D hierarchy works. I will look into that.
https://github.com/godotengine/godot/blob/master/scene/3d/node_3d.cpp#L38