godot icon indicating copy to clipboard operation
godot copied to clipboard

Fix NaN populating ParticleProcessMaterial Transform

Open AtlaStar opened this issue 1 year ago • 4 comments

Use of normalize will cause transform matrix slices to be populated with NaN when the scale value was zero on the previous compute step as is shown in this screenshot from RenderDoc. This bug doesn't occur when particle_flag_disable_z is set regardless of other flags being set, and will not happen if particle_flag_rotate_y is set. So any use of normalize there shouldn't be fragile enough to break.

image

The above was before the compute step was ran, while after the buffer had those zero values replaced with NaN image

This shows that the only way to resolve this bug, is to not use normalize in specific areas. This PR resolves this bug by adding a normalize_or_else function. Parameters are _in, and _else, with the former being the argument to test to see if it is a zero vec3, and the latter the vec3 to use in the event of _in being the zero vec. Concept behind requiring a vec3 to be passed in is that some behaviors or flags may require a default other than what would be the corresponding slice of an identity matrix, although currently I am unaware if there is a necessity as the testing done suggested parity was retained with CPUParticles3D.

Just some visuals of one of the setups for the particles I was using image image image

closes #97680 closes #97621

AtlaStar avatar Oct 06 '24 08:10 AtlaStar

@firecatmagic figured that you'd be interested in an update, because if nothing else this presents a fix you can potentially start using now (all the stuff changed are things auto-gen'd into the particle process material shader)

AtlaStar avatar Oct 06 '24 08:10 AtlaStar

I had a GPUParticles3D which was one-shot, yet the "emitting" flag was like a dirty flag. So when the scene stopped, it was no longer emitting inside the godot editor, and I had to re-open the scene and enable the "emitting" flag, in order to launch the game with an emitting GPUParticles3D. I built this patch via https://docs.godotengine.org/en/stable/contributing/development/compiling/compiling_for_linuxbsd.html and it fixes this issue for me, despite not being completely the same problem as the OP's.

TheYellowArchitect avatar Oct 06 '24 22:10 TheYellowArchitect

Oh, nice @TheYellowArchitect! Did you have an issue open for that bug already? If so I can link that issue so it will get closed when this PR is merged.

AtlaStar avatar Oct 06 '24 22:10 AtlaStar

Nah, it was very similar to the linked issue #97680 (that is how I found this PR), so I didn't open an issue as it would be a duplicate. Here are the similarities of my GPUParticle3D compared to #97680 : Quad Mesh + one-shot + alpha color curve + scale curve

Nothing else related to that issue (no curve with initial value 0, or billboards)

Edit: If I wait for the one-shot to finish, so the flag "emitting" becomes false, it also becomes false in the editor, despite emitting in the editor view. So I will make a new issue for that and link it here :+1: Regardless, this issue fixed the bug where I stop the main scene with emitting GPUParticle3Ds and it stops emitting in the GPUParticles3D scene (dirty flag) With this PR, if I stop the main scene while emitting, GPUParticle3Ds continue emitting in their own scene properly.

TheYellowArchitect avatar Oct 06 '24 22:10 TheYellowArchitect

sigh, screwed up the rebase on this...gonna fix it...

AtlaStar avatar Oct 10 '24 23:10 AtlaStar

@FireCatMagic figured that you'd be interested in an update, because if nothing else this presents a fix you can potentially start using now (all the stuff changed are things auto-gen'd into the particle process material shader)

I didn't see this till now, this is great work!! Messing around with GPU particles this fixes a lot of what buggy behavior there was, thank you for this

FireCatMagic avatar Oct 13 '24 06:10 FireCatMagic

The physics interpolation has a way to avoid normalize you might want to investigate

fire avatar Oct 13 '24 21:10 fire

@fire I will look into it again then, but if it was the already existing safe_normalize testing showed that it didn't work and that setting the scaling params of the transform all to 0 would result in the transform never being set to a value other than 0 because in later steps the transform is not overwritten other than by the result of the normalization, and then used in a multiplicative operation. So the m11 minor being all 0s will result in it being 0 for the life of the particle, and never rendering.

The only solution I was able to find is specifically setting the transform values to the appropriate slice of an identity matrix, not a [0.0, 0.0, 0.0] vector.

AtlaStar avatar Oct 13 '24 23:10 AtlaStar

@fire yeah testing still shows safe_normalize will cause issues, at least as of Godot v4.4.dev (92e51fca7) image

This was also the result withbillboarding both on and off, so the keep scale setting doesn't even matter when using the safe_normalize function. Because the transform is being multiplied by the scale value in order to properly set the particle scale, setting things to zero when normalization would fail just results in the particles not drawing (unless you minimize and maximize, for some reason the particles in editor draws when restoring the screen? Maybe it is doing CPU drawing when minimized?)

AtlaStar avatar Oct 14 '24 00:10 AtlaStar

I don't see how the normalize function is to blame here.

But I don't understand this if statement in the code

if (length(final_velocity) > 0.0) {

Usually comparing floating points to precise values is not recommended because error can easily accumulate and make the check false.

So is it possible that it breaks for very small velocities like 1e-15? Would it be easier to solve this issue with a larger deadzone, like 0.00001 ?

lyuma avatar Oct 14 '24 03:10 lyuma

I don't see how the normalize function is to blame here.

But I don't understand this if statement in the code

if (length(final_velocity) > 0.0) {

Usually comparing floating points to precise values is not recommended because error can easily accumulate and make the check false.

So is it possible that it breaks for very small velocities like 1e-15? Would it be easier to solve this issue with a larger deadzone, like 0.00001 ?

Because the transform is filled with all 0's on the scale and rotation bits. Normalizing a vector is just dividing each term by it's length, so when the length is 0... you get NaN. If instead you use the safe version, you still have all 0's in there, which you then use as the basis to multiply the scaling values by, resulting in it forever remaining 0.

The issue this solves was only occurring if at some point in the particles lifetime scale was 0. Any other time, there is no issue.

AtlaStar avatar Oct 14 '24 03:10 AtlaStar

@lyuma also, to make it extra apparent in case it wasn't, the issue this causes is entirely caused by NaN being populated in the transform, and the two most common ways to produce NaN are adding/subtracting to infinity, or dividing 0 by 0. Normalize was basically doing (0 + 0 + 0)/0, hence NaN being produced by normalize.

The safe version when the row is 0 just keeps it 0, which in this specific case is actually wrong because of the later multiplication done in the compute shader.

AtlaStar avatar Oct 14 '24 03:10 AtlaStar

I believe at a certain point I had a fix for this that was max(scale, 0.00001). I wonder why it isn't working anymore.

QbieShay avatar Oct 17 '24 15:10 QbieShay

@QbieShay When the scaling factor is 0, the sign of it is also 0. Personally I think that having 0 sized particles should be possible, which is solved by replacing the transformation when it is all 0 to an identity prior to multiplying it by the scaling values at the end of the shader.

AtlaStar avatar Oct 17 '24 16:10 AtlaStar

I don't think there's a significant difference between 0.0 and 0.000001, and safe normalizing things costs performance

QbieShay avatar Oct 17 '24 17:10 QbieShay

@QbieShay the issue with that is again that this scenario pops up when using a curve texture primarily. So you go from having to do some form of safe normalize to having to clamp individual color channels from values that are zero to some minimal amount, meaning that you'd have to do the logic you mentioned at the site of the texture read...and the problem of figuring out the sign still persists.

If negative scaling values weren't supported, this would be very simple...but because they are you can't just multiply things by the sign of the value, you have to branch on the sign being 0 to replace it with a small but non-zero value...and if a branch is required some form of safe normalization makes more sense imo.

AtlaStar avatar Oct 17 '24 17:10 AtlaStar

Sorry, i should have taken the time to read your code properly before commeting. I had assumed you're calculating length but you aren't.

QbieShay avatar Oct 17 '24 18:10 QbieShay

It is all good. In all reality this wouldn't be an issue if glsl had a signbit operator because you could just replace the use of sign with signbit

AtlaStar avatar Oct 17 '24 18:10 AtlaStar

Ya know on saying the above out loud so to speak, I realize that writing a signbit operator probably would be more performant than this solution @QbieShay

AtlaStar avatar Oct 17 '24 18:10 AtlaStar

Thanks!

Repiteo avatar Oct 28 '25 17:10 Repiteo