godot icon indicating copy to clipboard operation
godot copied to clipboard

Add optimization for generating tangents in mikktspace.

Open fire opened this issue 2 years ago • 18 comments

genTangSpace was taking 70% of the time, we want to reduce it to 30% of the time.

Test Plan

Test a project with three copies of the same load. Compare before and after. Check correctness.

Production edit: closes godotengine/godot-roadmap#41

fire avatar Oct 19 '23 23:10 fire

This causes tangent generation to crash. Office_chair.glb.zip

Screenshot 2023-11-06 at 7 48 53 AM

fire avatar Nov 06 '23 15:11 fire

I pushed a rebase to master to see if that affects the mesh appearance.

Edited:

The flat areas now show correctly. Needs a performance retest.

fire avatar Nov 23 '23 23:11 fire

The flat areas now show correctly. Needs a performance retest.

The latest revision of this PR compared to master seems to be slower than previously, and still has visual differences compared to master:

Before After (previously) After (current)
59.5 ms 22.1 ms 28.9 ms
Before After (previously) After (current)
Screenshot_20231123_014843 webp Screenshot_20231123_014825 webp Screenshot_20231124_002528 webp

Calinou avatar Nov 23 '23 23:11 Calinou

Currently working on:

core/math/vector3.h:395:4: runtime error: division by zero
core/math/vector3.h:396:4: runtime error: division by zero
core/math/vector3.h:397:4: runtime error: division by zero
servers/rendering_server.cpp:649:24: runtime error: nan is outside the range of representable values of type 'short unsigned int'
thirdparty/misc/mikktspace.c:1889:21: runtime error: shift exponent 32 is too large for 32-bit type 'unsigned int'
thirdparty/misc/mikktspace.c:646:2: runtime error: -nan is outside the range of representable values of type 'int'
Segmentation fault (core dumped)
FATAL ERROR: Godot has been crashed.

fire avatar Nov 24 '23 10:11 fire

The handling of the w component in the SIMD code is a bit inconsistent, on SSE it should be ignored for all operations so denormal/NAN/INF should only cause performance regression. However, on NEON the w component is implicitly used by horizontal sum operations.

There seems to be a workaround for vadd() which zeroes the w component, but this feels a bit incomplete.

float32x4_t result = vaddq_f32(v1, v2);
return vsetq_lane_f32(0.0f, result, 3);

I'd remove that and make sure that w is always zero for all operations by:

  • Adding an assert in vload() to make sure that v->w == 0
  • vadd/vsub/vmul/vmin/vmax all preserve the zero without an issue
  • vscale/vdiv does not as they may introduce NAN values, so they should force the w component to zero with the above workaround

Alternatively you could let the w component be garbage and only make sure it's zero before passing the vector into vaddvq_f32().

Also I noticed that Normalize() is semantically different from the original in the new scalar and SSE code path: The original is specified as vscale(1 / Length(v), v) and currently scalar/SSE calculate it as v/Length(v). With the modifications above it could be changed back simply to:

static SVec3 Normalize( const SVec3 v )
{
    return vscale(1 / Length(v), v);
}

Regarding the visual issues, one way of verifying whether the issue is in vectorized code or somewhere else is to force ARCH_SSE to be 0 on an x64 platform, I personally don't have much time to work on this currently so that I could debug the issue. I think ARM would probably benefit from an abstracted define as well so it would be easy to toggle between the scalar and vector code path on that platform.

bqqbarbhg avatar Nov 24 '23 14:11 bqqbarbhg

Discussed splitting simd and callback fixes into two different pr.

fire avatar Dec 04 '23 21:12 fire

@akien-mga Is there a how-to generate the patch on third party code mikktspace?

fire avatar Dec 07 '23 16:12 fire

I have split the simd code to the mikktspace-optimization-simd branch.

fire avatar Dec 07 '23 16:12 fire

@akien-mga Is there a how-to generate the patch on third party code mikktspace?

Commit unmodified upstream code. Make changes. git diff > patches/name.patch Commit changes + patch (possibly as amend of the first commit if the unmodified upstream code doesn't make sense as an isolated commit).

akien-mga avatar Dec 08 '23 09:12 akien-mga

@Calinou I was trying to test performance, but I wasn't able to use test_generate_tangents_obj_import.zip because the normal debug mode is broken. Can you confirm and maybe run the performance test.

fire avatar Jan 17 '24 19:01 fire

On my end, normals have still regressed with the latest revision of this PR rebased on top of master https://github.com/godotengine/godot/commit/fa48a51183567934984b381ad8ec281cb24d66ba:

Before After
Screenshot_20231123_014843 webp image

Calinou avatar Jan 29 '24 13:01 Calinou

Thanks I wanted to confirm that normals are broken in this pr. I was suspecting something with the normals

fire avatar Jan 29 '24 21:01 fire

The normals should be fine now. Can you try if possible again?

fire avatar Feb 01 '24 17:02 fire

The normals should be fine now. Can you try if possible again?

Would it be possible to explain what change was made to the code to improve the normals? That would assist me in understanding the code. I'm having trouble locating the differences compared to the revision Calinou tested on.

lyuma avatar Feb 09 '24 14:02 lyuma

Here is the diff https://github.com/godotengine/godot/compare/5035d24d6f24d26d5c1d936deec8e56d7caf2628..7706099a14b2b52e17f04470a320d8e9bd99d59e

Note the part with the index.

fire avatar Feb 09 '24 15:02 fire

Here is the diff https://github.com/godotengine/godot/compare/5035d24d6f24d26d5c1d936deec8e56d7caf2628..7706099a14b2b52e17f04470a320d8e9bd99d59e

Note the part with the index.

@fire This diff appears to be backwards: The current version of the code seems to be on the left side of the diff for me. Can you confirm?

Also, these lines of code are incorrect:

	if (!(format & Mesh::ARRAY_FORMAT_NORMAL)) {
		index();
	}

The and should be against Mesh::ARRAY_FORMAT_INDEX, not ARRAY_FORMAT_NORMAL

lyuma avatar Feb 19 '24 03:02 lyuma

@lyuma I can't see any of the changes you're referring to in the current code.

fire avatar Feb 19 '24 08:02 fire

I am confused because Calinou stated that there was a regression. You then said to try again, despite no commits since then, and so I asked what had changed. I clicked the link you sent and saw the diff I commented on

but you claim that change you linked me evidently isn't there, so I am back to being confused about what changed and what fixed the normals. github does not show any commits since Calinou tested the normals.

Anyway none of that matters much at this point. What I really want to see is a good screenshot of the mesh Calinou is trying which you claim has been fixed. I think they attached it earlier.

and more broadly speaking, if it is true that the normals issue was fixed as you claimed in the earlier comment, what are the remaining blockers for getting this change merged?

lyuma avatar Feb 19 '24 11:02 lyuma

Is this ready?

adamscott avatar May 29 '24 14:05 adamscott

Let me run it through my test cases on master

fire avatar May 30 '24 02:05 fire

Ok, I am trying to follow this. I just want to confirm I should be working in the salvage/mikktspace-optimization branch from https://github.com/V-Sekai/godot/tree/salvage/mikktspace-optimization to salvage this change? For the change above, I am seeing the change to resource_importer.cpp is current. The change to surface_tool.cpp is not. Or, @lyuma comment holds true (the diff is backwards)?

Otherwise, I do see the changes to stuff in mikktspace. Oddly, it appears the normals look correct but I expected them to look wrong from the comments: image

Unsure what to make of this now.

Sluggernot avatar Jun 26 '24 13:06 Sluggernot

As far I know all the bugs are fixed, but I haven't been working on this for a while so it's salvagable and it's in the same areas.

fire avatar Jun 26 '24 16:06 fire