cesium-unreal
cesium-unreal copied to clipboard
Fix Chaos physics epsilon silliness by scaling meshes up and then down.
Fixes #1250
Lending further support to my theory that "anytime a software developer is typing the word 'epsilon', they're typing a bug"... (See my previous rant about epsilons in CesiumGS/cesium-native#914)
We've had trouble with Chaos physics ever since Unreal Engine switched to it by default. The first problem was that Chaos would frequently crash in debug builds, or hang and spam the log in release builds, when doing collisions against many of our meshes (#1084). It was caused by checking for (and logging) "degenerate triangles". So, we decided to filter out degenerate triangles from our meshes using the same test that Chaos itself uses (#1106).
That worked well in that it stopped the crashing and freezing. But we still occasionally heard reports (#1250) of failed lined traces against our meshes. We assumed this was because the triangles it was supposed to hit were quite small, and so Chaos is (incorrectly) flagging them as degenerate.
But with the San Francisco Ferry Building model, the triangles being missed really aren't all that small. What is going on here? I finally spent a chunk of time looking at it today.
My diagnosis is that the degenerate triangle test that Chaos is using is astoundingly bad. It looks like this:
const ParticleVecType& A = MParticles.X(Elements[FaceIdx][0]);
const ParticleVecType& B = MParticles.X(Elements[FaceIdx][1]);
const ParticleVecType& C = MParticles.X(Elements[FaceIdx][2]);
const ParticleVecType AB = B - A;
const ParticleVecType AC = C - A;
ParticleVecType Normal = ParticleVecType::CrossProduct(AB, AC);
if(Normal.SafeNormalize() < UE_SMALL_NUMBER)
{
UE_LOG(LogChaos, Warning, TEXT("Degenerate triangle %d: (%f %f %f) (%f %f %f) (%f %f %f)"), FaceIdx, A.X, A.Y, A.Z, B.X, B.Y, B.Z, C.X, C.Y, C.Z);
CHAOS_ENSURE(false);
return FVec3(0, 0, 1);
}
A, B, and C are the three vertex positions and UE_SMALL_NUMBER is 1e-8. And if you dig into that SafeNormalize, you'll see there's yet another epsilon test. If the squared magnitude of the vector is less then 1e-4, then SafeNormalize will return 0.0. Zero is definitely less than 1e-8, so this warning will be triggered.
This computation is done in model coordinates. Which means.... wait for it... the quantities here do not have any known units (it depends on how the model is scaled). So here we go again applying an absolute epsilon to quantities of unknown magnitude. 🤦 Which is guaranteed to end in heartbreak.
Let's say our model coordinates are meters, and we have a simple right triangle that is 9cm on a side. The Normal will have a magnitude of 0.09 * 0.09 = 0.0081 meters. So the magnitude squared of that vector will be 0.00006561 m^2. That is clearly less than 1e-4, therefore SafeNormalize will return 0.0 and Chaos will log a "degenerate triangle" warning about this triangle! For a 9cm per side right triangle!
Maybe the author of this code didn't realize that SafeNormalize has a 1e-4 epsilon check on the magnitude squared? Probably not, because I don't think the UE_SMALL_NUMBER check will ever make sense with that in place. I mean, to be fair, it's absolutely insane that SafeNormalize - a function working with double-precision values! - returns zero when the magnitude squared is less then 1e-4.
I guess maybe UE can get away with this when model coordinates match world coordinates, and so the units are centimeters instead of meters. A 9cm triangle being degenerate is a big problem, but a 0.09cm triangle being degenerate is I guess kinda sorta more acceptable.
But it's worth noting that there are a bunch of other peope complaining about this problem; it's not just us: https://forums.unrealengine.com/t/degenerate-triangle-and-tvector-ensure-crashes/544465
Ok, so what do we do about it? Short of changing UE itself, I can only think of one solution: change the scale of our mesh. So that's what this PR does. It multiplies every vertex position in our meshes by 1024.0 (chosen so that only the exponent of the floating point number is affected, not the mantissa... I think I did that right?). And then adjusts the UCesiumGltfPrimitiveComponent matrix to scale by 1 / 1024.0 to bring the model back down to the correct size. With the mesh scaled up by 1024, the silly absolute epsilon becomes less silly. It means our right triangle would have to be less 0.1mm to be considered degenerate. I tried to find a way to scale only the physics mesh, but it didn't seem possible.
This PR is a draft because it's not super well tested and needs some cleanup, but it seems to work very well with the basis tests I've done so far.
Very interesting find!
Seems to work well with the tilesets that were causing us collision problems in the past. Also seems to not have any performance repercussions as far as I as a user can see.
Seems to work well with the tilesets that were causing us collision problems in the past. Also seems to not have any performance repercussions as far as I as a user can see.
Good to hear. Thanks for trying it!
Something similar may be happening in Cesium for Unity as well: https://community.cesium.com/t/errors-in-playmode-after-updateing-to-unity-6/31892/7
This seems to be working well and is ready for review.
@csciguy8 can I bug you to review this one today, since Janine and Ashley are both out?
You got it!
Thanks for the review @csciguy8! I've removed the unused function and added a comment about the scale factor being a power of two so that it only affects the exponent.
I decided to leave in the unnecessary multiplications of zero by the scaling factor, though. I think the performance impact is irrelevant (especially because it's only in test code), and there's a bit of clarity to be gained by leaving it in. Particularly if someone in the future copies a test and modifies the vector value, there's less possibility of error if they don't also have to remember to add back in the multiplication by the scaling factor. Or, looking at it another way, the multiplication by the scaling factor in the tests is useful documentation that the scaling is needed in general, even if it happens to not be needed for some of the particular values used. But let me know if you feel strongly about it.
I'll merge this once CI passes.
But let me know if you feel strongly about it.
I have no strong feelings, all good.
I find a bug,in CesiumGltfComponent.cpp: instanceTransforms[i] = glm::translate( instanceTransforms[i], translation); should be: instanceTransforms[i] = glm::translate( instanceTransforms[i], translation * CesiumPrimitiveData::positionScaleFactor);
Thank you for bringing this to our attention @WxpGiser ! Would you be able to open a PR with the changes described? That would allow us to merge into main ASAP. :pray:
I can't create new branch ,you just do it.
@WxpGiser We definitely want to do it, but we are cautious about copying code due to legal reasons. If you're unable to contribute a PR, could you please sign a CLA so we can get this in ASAP?
Cesium Individual Contributor License Agreement ("Agreement") v1.0 Thank you so much! If you have a contribution pending review on GitHub, please let the reviewer know that you've signed the CLA.
Thanks @WxpGiser ! I just opened #1499 with this fix. :smile: