Optimize AnimationMixer blend process

Open Nazarwadim opened this issue 1 year ago • 6 comments

This PR optimizes AnimationMixer::_process_animation.

void AnimationMixer::_process_animation(double p_delta, bool p_update_only) {
	_blend_init();
	if (_blend_pre_process(p_delta, track_count, track_map)) {
		_blend_capture(p_delta);
		_blend_calc_total_weight();
		_blend_process(p_delta, p_update_only);
		_blend_apply();
		_blend_post_process();
		emit_signal(SNAME("mixer_applied"));
	}
	clear_animation_instances();
}

Benchmarking methods:

I benchmarked how long each of these methods takes for one 3D model driven by an AnimationTree.

Notes for reading the results: times are in usec. In the results, is_process = _blend_pre_process and weight = _blend_calc_total_weight.

Master:

You can see here that the _blend_process, _blend_pre_process and _blend_calc_total_weight methods take the most time.

This PR:

You can see that _blend_process has improved by 30%, _blend_calc_total_weight by 15%, and _blend_pre_process by 50%.

Real project benchmarks:

Master:

  • Project FPS: 56 (17.85 mspf)
  • Project FPS: 56 (17.85 mspf)
  • Project FPS: 56 (17.85 mspf)
  • Project FPS: 56 (17.85 mspf)
  • Project FPS: 55 (18.18 mspf)
  • Project FPS: 55 (18.18 mspf)

Current PR:

  • Project FPS: 68 (14.70 mspf)
  • Project FPS: 69 (14.49 mspf)
  • Project FPS: 66 (15.15 mspf)
  • Project FPS: 66 (15.15 mspf)
  • Project FPS: 68 (14.70 mspf)

That is a pretty good +22% in FPS. Note that #92554 also improves animation performance; combined with this PR, the total gain is about 40%.
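For anyone who wants to check the percentage, here is a minimal sketch that averages the FPS samples listed above and computes the relative gain. The helper names mean_fps and percent_gain are made up for this example. Averaging all samples gives roughly 55.7 FPS for master and 67.4 FPS for this PR, a gain of roughly 21%, which is close to the rounded figure quoted above.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Mean of a set of FPS samples (hypothetical helper, not part of Godot).
double mean_fps(const std::vector<double> &samples) {
	assert(!samples.empty());
	return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}

// Relative change in percent when going from `before` to `after`.
double percent_gain(double before, double after) {
	return (after - before) / before * 100.0;
}

// FPS samples copied from the console output listed above.
const std::vector<double> master_fps = {56.0, 56.0, 56.0, 56.0, 55.0, 55.0};
const std::vector<double> pr_fps = {68.0, 69.0, 66.0, 66.0, 68.0};
```

Calling percent_gain(mean_fps(master_fps), mean_fps(pr_fps)) gives the gain in percent.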

How to test:

The Animation_test.zip project is attached in the benchmark section of #92554. After opening it, the FPS is printed to the console.

What was done:

Animation:

  • For enums, the size was reduced from 4 to 1 byte.
  • Added the get_tracks method.

AnimationTree:

  • I made track_map a pointer in order not to copy maps.
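The idea can be sketched with standard containers. This is only an illustration using std::unordered_map and made-up struct names, not the actual AnimationTree code: holding a pointer refers to the existing map, while holding the map by value copies every entry.

```cpp
#include <string>
#include <unordered_map>

// Stand-in for a track map; the real key and value types differ.
using TrackMap = std::unordered_map<std::string, int>;

// Holding the map by value copies all entries each time the state is filled in.
struct StateByCopy {
	TrackMap track_map;
};

// Holding a pointer only refers to the existing map; nothing is copied.
// The pointed-to map must outlive the state that refers to it.
struct StateByPointer {
	const TrackMap *track_map;
};
```

The trade-off is lifetime: the pointer version is only valid while the original map is alive and not moved.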

AnimationMixer:

  • Used getptr instead of has + operator[], so each key is looked up once instead of twice.
  • Stored the track count in a local (int count = a->get_track_count();) so the loop bound is read once and can be kept in a register.
  • Used the get_tracks method added to animation.h to take the track array directly: const Vector<Animation::Track *> tracks = a->get_tracks();
  • In post_process_key_value, cached whether there is a GDVIRTUAL_CALL, so there is only one check per _blend_process call.
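The patterns in this list can be sketched outside the engine. The example below uses std::unordered_map in place of Godot's HashMap, with find() standing in for getptr, and a hypothetical Mixer struct whose has_override() stands in for the per-call check. None of these names come from the Godot source.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

using Cache = std::unordered_map<std::string, int>;

// Two hash lookups: one for the existence check, one for the access.
int read_two_lookups(Cache &cache, const std::string &key) {
	if (cache.count(key) == 0) {
		return -1;
	}
	return cache[key];
}

// One hash lookup: returns a pointer to the value, or nullptr if absent.
const int *get_ptr(const Cache &cache, const std::string &key) {
	auto it = cache.find(key);
	return it == cache.end() ? nullptr : &it->second;
}

int read_one_lookup(const Cache &cache, const std::string &key) {
	const int *value = get_ptr(cache, key);
	return value ? *value : -1;
}

// Hypothetical mixer: has_override() stands in for an expensive check.
struct Mixer {
	int check_count = 0;

	bool has_override() {
		++check_count;
		return false;
	}

	int process(const std::vector<int> &keys) {
		const bool overridden = has_override(); // evaluated once per call
		const int count = static_cast<int>(keys.size()); // loop bound read once
		int sum = 0;
		for (int i = 0; i < count; i++) {
			sum += overridden ? -keys[i] : keys[i];
		}
		return sum;
	}
};
```

Both read functions return the same values; the second one just avoids hashing the key twice, and process() performs its check once regardless of how many keys it handles.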

Probably closes: #92693

Nazarwadim avatar Jun 06 '24 12:06 Nazarwadim

Your test project is not very suitable for such a test because of the dynamic camera and the large number of meshes. I used the project from a recent discussion (#92724), which uses 301 skeletons, and it showed excellent results!

4.3 beta1 (master): FPS 47-48
This PR: FPS 56-58

I'll attach a project that is great for testing this PR: anm.zip

Let's see what @TokageItLab says about the code

JekSun97 avatar Jun 06 '24 14:06 JekSun97

I would say that by itself it won't be enough, but together with the PR you mentioned it might close it, since the combination reduces the cost a lot more (40%, as you stated).

Either way, this is a great job, and hopefully it gets merged quickly in 4.4 along with your other great optimization PRs.

Saul2022 avatar Jun 06 '24 17:06 Saul2022

Either way, this is a great job, and hopefully it gets merged quickly in 4.4 along with your other great optimization PRs.

There are no compatibility problems or major innovations here; this is just an optimization of what already exists. So I think this would work well for 4.3 as well.

JekSun97 avatar Jun 06 '24 18:06 JekSun97

Animation does not support multi-core

WOLFxxxxxx avatar Jun 29 '24 15:06 WOLFxxxxxx

@WOLFxxxxxx What you are talking about has nothing to do with this PR. This PR is an optimization for several allocations in the blending process and does not address multithreading. That will have to be done in another PR. Please confine the discussion here to what this PR implements.

TokageItLab avatar Jun 29 '24 15:06 TokageItLab

@reduz I think it's time to merge this for 4.4; no conflicts related to this PR have been found.

JekSun97 avatar Aug 26 '24 21:08 JekSun97

Thanks!

akien-mga avatar Sep 02 '24 10:09 akien-mga