godot icon indicating copy to clipboard operation
godot copied to clipboard

Fix crash on powerVR GPUs when a cached shader wasn't loaded in properly

Open MileyHollenberg opened this issue 1 year ago • 3 comments

This PR fixes the issue mentioned in #93406 and likely some other issues as well.

The root cause was a faulty shader cache on some very specific and low end GPUs, for me this was on my Oppo a57s with a PowerVR Rogue GE8320 GPU. This only seemed to occur when you had any Skeleton3D mesh on-screen and only on the second launch as then it would use the shader cache.

The fix is to simply call glLinkProgram() right after loading in the shader from the binary cache to force it to re-link everything and make glGetProgramiv be able to retrieve the actual linking status. This now catches the error before the shader is used and thus tells Godot to re-compile it rather than move forward and crash.

MileyHollenberg avatar Jul 27 '24 13:07 MileyHollenberg

Thanks for the contribution! The rendering team has been assigned to review.

For the record, your commit seems not to be linked to your GitHub account. See: Why are my commits linked to the wrong user? for more info.

akien-mga avatar Jul 27 '24 14:07 akien-mga

Ah, I'll have a look later today to see how to fix that, only just generated a new SSH key for this commit (I don't actually use Github for my personal projects but rather a self hosted git server)

MileyHollenberg avatar Jul 27 '24 14:07 MileyHollenberg

Fixed, thanks for the heads up :)

MileyHollenberg avatar Jul 27 '24 15:07 MileyHollenberg

Per the spec, glLinkProgram() and glProgramBinary() are mutually exclusive. On the one hand, if calling glLinkProgram() helps on some drivers, it's still good to do that, adding a comment about it. On the other hand, it'd be good to be sure this workaround is not breaking things on other drivers. An option would be to restrict the fix to certain GPUs or platforms.

RandomShaper avatar Jul 29 '24 10:07 RandomShaper

An option would be to restrict the fix to certain GPUs or platforms.

Example OpenGL - GPU-Check for Adreno 3xx workaround.

only GL_RENDERER = PowerVR Rogue GE8320 maybe

PlayStore Device Catalog - PowerVR GPU devices


GitHub keywords

@MileyHollenberg: please try in one line 😃

Fixes #93406

and not

This PR fixes the issue mentioned in #93406

After that, your PR should be linked correctly everywhere

image

Alex2782 avatar Jul 29 '24 13:07 Alex2782

Not quite sure what you mean about the oneliner change, do you just mean the first line of the description?

And if we're going to make it only for certain devices, would it make sense to turn this into a editor option similar to the shader cache IIRC (on mobile atm so I can't double check this but I know there's a different device specific option in the editor settings window somewhere). That way it will be easier for people to manually enable this for their device if it has the same issue rather than waiting for a long for a fix to be public.

I would definitely need some help on that though as I barely know my way around Godot's source code, this was literally my first look at it :)

MileyHollenberg avatar Jul 29 '24 15:07 MileyHollenberg

Looking at the 3.x source I can see that we do have a path where we recompile the shader from scratch (and evict it from the cache) if the link fails. I suspect on the GE8320 this path resulted in the shaders always recompiling from scratch. What I don't understand is why the link status was correct in 3.x, but is presumably incorrect here.

Right now we know that only the GE8320 is running into this particular issue. It is ovewhelmingly likely that this is a driver bug specific to that device only. So I wouldn't try to future proof a solution assuming that other GPU models will require the same solution.

I also tested this PR on my laptop and can confirm that the call to glLinkProgram() always fails so link_status is always GL_FALSE and the shader is always recompiled. Effectively, this change makes it so no device will benefit from the shader cache.

I can see two possible solutions:

  1. Disable shader caching on the GE8320. I suspect that it just doesn't work anyway.
  2. Try to identify which shaders are failing and only disable caching for those shaders on GE8320. It sounds like maybe only shaders with transform feedback are causing issues, so we could just disable caching shaders with transform feedback when using the GE8320.

clayjohn avatar Jul 29 '24 18:07 clayjohn

Fixes #93406: do you just mean the first line of the description?

Yes, I found a description here: github docs I don't know exactly how it works. But keyword fixes with words in between up to #number don't seem to work.


That way it will be easier for people to manually enable this for their device if it has the same issue rather than waiting for a long for a fix to be public.

I'm not sure if more configurations would be useful, after 1 year I still don't know all of them. If necessary, then somewhere in the project settings (screenshot), where I see a list of GPUs, for example

Screenshot

image

It's about faulty shader cache? There are already settings to disable the cache completely. Regular Godot users do not always have the ability to debug Android devices and search for crashes in the logcat.

Alex2782 avatar Jul 29 '24 18:07 Alex2782

Fair enough on the future proofing, I'll have a look at adding that specific GPU check tomorrow. Don't think it makes sense to do it just for the skeleton one as when someone creates a custom shader with the transform feedback as well it would likely have the same issue (unless there's an easy way to check for this)

And just updated the description, let me know if this is alright now.

MileyHollenberg avatar Jul 29 '24 18:07 MileyHollenberg

Looks good!

Screenshot

image

Alex2782 avatar Jul 29 '24 18:07 Alex2782

Fair enough on the future proofing, I'll have a look at adding that specific GPU check tomorrow. Don't think it makes sense to do it just for the skeleton one as when someone creates a custom shader with the transform feedback as well it would likely have the same issue (unless there's an easy way to check for this)

I have prototyped a quick fix here: https://github.com/clayjohn/godot/commit/3f6555397259990e92e5f8cc765a1e9bfb7a58d3

This should be all that is needed. I can't reproduce the issue using Firebase test labs, so please let me know if this works for you.

clayjohn avatar Jul 29 '24 19:07 clayjohn

@clayjohn Thanks, that works :). Updated the PR and squashed the commits so the addition and removal of the glLinkProgram aren't in the history (no need to keep that part around)

MileyHollenberg avatar Jul 30 '24 06:07 MileyHollenberg

~~Hmm, I may have made a little mistake when trying to rebase to master, fixing now~~

Should be all good now

MileyHollenberg avatar Jul 30 '24 06:07 MileyHollenberg

I don't know if this info would be helpful or not but some mesh that doesn't have no skeleton (or at least skeleton node) also crashes. I think it's called rigged mesh or something like that don't really know much but can provide a glb if needed

shahriarlabib000 avatar Jul 30 '24 13:07 shahriarlabib000

Interesting, please attach a GLB file yeah, I can have a look tonight

MileyHollenberg avatar Jul 30 '24 13:07 MileyHollenberg

Rekorder.zip

This uses blend shapes, which also uses transform feedback (its the same as skeleton essentially)

clayjohn avatar Jul 30 '24 16:07 clayjohn

Ohh thanks

shahriarlabib000 avatar Jul 30 '24 16:07 shahriarlabib000

@shahriarlabib000 Just tested your model and can confirm that it works without crashing :)

MileyHollenberg avatar Jul 30 '24 18:07 MileyHollenberg

Thanks! And congrats for your first merged Godot contribution :tada:

akien-mga avatar Aug 16 '24 08:08 akien-mga

Cherry-picked for 4.3.1.

akien-mga avatar Sep 16 '24 14:09 akien-mga