Celeste64
Celeste64 copied to clipboard
Camera FieldOfView can be NaN
Error Log (1/29/2024 11:15:25 PM)
Call Stack:
System.ArgumentOutOfRangeException: fieldOfView ('NaN') must be greater than '0'. (Parameter 'fieldOfView')
Actual value was NaN.
at System.ArgumentOutOfRangeException.ThrowLessEqual[T](T, T, String)
at Celeste64.Camera.get_Projection() in /home/noel/Projects/Celeste64/Source/Graphics/Camera.cs:line 128
at Celeste64.World.Render(Target) in /home/noel/Projects/Celeste64/Source/Scenes/World.cs:line 633
at Celeste64.Game.Render() in /home/noel/Projects/Celeste64/Source/Game.cs:line 288
at Foster.Framework.App.Tick()
at Foster.Framework.App.Run(String, Int32, Int32, Boolean )
at Foster.Framework.App.Run[T](String, Int32, Int32, Boolean )
at Celeste64.Program.Main(String[]) in /home/noel/Projects/Celeste64/Source/Program.cs:line 23
Game Output:
Celeste 64 v.1.0.1
Foster: v0.1.14
Platform: Microsoft Windows 10.0.19045 (X64)
Framework: .NET 8.0.1
SDL: v2.28.5
OpenGL: v3.3.13399 Core Profile Forward-Compatible Context 15.201.1151.1008, AMD Radeon HD 6450
FMOD Bindings: v20218
FMOD: v20207
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\Master.strings.bank
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\Master.bank
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\music.bank
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\sfx.bank
Loaded Assets in 1758ms
Strawb Count: 20
Loaded Map '1' in 268ms```
The only place FieldOfView is ever assigned is here in Player.cs
float targetFOV = Calc.ClampedMap(velocity.XY().Length(), MaxSpeed * 1.2f, 120, 1, 1.2f);
World.Camera.FOVMultiplier = Calc.Approach(World.Camera.FOVMultiplier, targetFOV, Time.Delta / 4);
The only way I could see NaN being assigned is if velocity is also somehow maybe NaN? Weird.
Gonna share some progress debugging this crash.
I managed to narrow down the NaN
appearance in one crash instance to this block of code in StNormalUpdate
:
https://github.com/ExOK/Celeste64/blob/48eebdadc8814451308eef2b96180edf6ac2b91d/Source/Actors/Player.cs#L990-L1120
velXY
is being set to NaN
somewhere here.
The person says they're crashing whenever they try to move around, but jumping works fine - so does moving the camera. Dashing causes the game to freeze, both grounded and airborne. They use keyboard, and not analog (I even completely commented out the AddLeftJoystick
and AddDPad
calls - it still crashes.)
Feels like Move.Value
is somehow NaN
?
Good news; Move.Value
does not contain NaN
. Building and running with the below patch does not log that Controls.Move.Value
has NaN
.
diff --git a/Source/Game.cs b/Source/Game.cs
index 89c0fe8..a5fb4f9 100644
--- a/Source/Game.cs
+++ b/Source/Game.cs
@@ -113,6 +113,9 @@ public class Game : Module
public override void Update()
{
+ if (VectorHelpers.HasNaN(Controls.Move.Value))
+ Log.Error($"{nameof(Controls.Move.Value)} contains NaN! Things are about to go very wrong! ({Controls.Move.Value})");
+
// update top scene
if (scenes.TryPeek(out var scene))
{
diff --git a/Source/Helpers/VectorHelpers.cs b/Source/Helpers/VectorHelpers.cs
new file mode 100644
index 0000000..173d09a
--- /dev/null
+++ b/Source/Helpers/VectorHelpers.cs
@@ -0,0 +1,9 @@
+namespace Celeste64;
+public static class VectorHelpers
+{
+ public static bool HasNaN(this in Vec2 vec2)
+ => float.IsNaN(vec2.X) || float.IsNaN(vec2.Y);
+
+ public static bool HasNaN(this in Vec3 vec3)
+ => float.IsNaN(vec3.X) || float.IsNaN(vec3.Y) || float.IsNaN(vec3.Z);
+}
Thanks for the investigation! This should narrow it down a lot ...
Found the source of the NaN
; apparently it's RelativeMoveInput
. (so I was almost there; I considered checking RelativeMoveInput
directly but chose against it, as I thought that since Controls.Move.Input
is fine, RelativeMoveInput
would also be fine.)
I added assertions to every parameter when calculating velXY
, logging if it contains NaN. input
is NaN
when grounded; RelativeMoveInput
(and subsequently accel
) is NaN
when airborne.
I'd take it home from there, but it's currently late; figured I'd share the development. I plan to make a PR in case I get to the bottom of this issue.
Somehow the NaN
surfaces from normalizing the camera's XY components?? This doesn't make any sense..
diff --git a/Source/Actors/Player.cs b/Source/Actors/Player.cs
index d0482d6..03950ae 100644
--- a/Source/Actors/Player.cs
+++ b/Source/Actors/Player.cs
@@ -611,7 +611,15 @@ public class Player : Actor, IHaveModels, IHaveSprites, IRidePlatforms, ICastPoi
if (Vec2.Dot(input, Vec2.UnitY) >= .985f)
input = Vec2.UnitY;
- return forward * input.Y + side * input.X;
+ Vec2 ret = forward * input.Y + side * input.X;
+ if (ret.HasNaN())
+ {
+ ret.AssertNotNaN($"{nameof(RelativeMoveInput)} contains NaN!");
+ Log.Error($"{nameof(World.Camera.Forward)} = {World.Camera.Forward}");
+ Log.Error($"{nameof(forward)} = {forward} (normalized)");
+ Log.Error($"{nameof(side)} = {side} (normalized)");
+ }
+ return ret;
}
}
Running this yields these logs:
RelativeMoveInput contains NaN! (<NaN, NaN>)
Forward = <-0.10875643, -0.887422, -0.44794446>
forward = <0.12164313, -Infinity> (normalized)
side = <Infinity, NaN> (normalized)
This is not a one-off, here's what happens when the camera is not moved:
RelativeMoveInput contains NaN! (<NaN, NaN>)
Forward = <-3.9004448E-08, 0.89231783, -0.45140773>
forward = <-4.3711385E-08, Infinity> (normalized)
side = <-Infinity, NaN> (normalized)
Hmm we do use a custom Normalize method, but all it does is check for x/y being 0 and return 0 in that case. Maybe we need to use an epsilon?
public static Vector2 Normalized(this Vector2 vector)
{
if (MathF.Abs(vector.X) <= float.Epsilon && MathF.Abs(vector.Y) <= float.Epsilon)
return Vector2.Zero;
return Vector2.Normalize(vector);
}
I'm not sure what else would be the cause... but that doesn't entirely make sense because the input values aren't both near zero (only X in the example above).
Getting infinite/NaN values is definitely expected for very small values (Vector2.Normalize(new(float.Epsilon, 0))
produces <∞, NaN>
for instance), but the values Snip's test subject is getting this for are definitely not in a range where that should happen (and it doesn't if I test with those values).
Note that even when this is behaving normally, float.Epsilon
is definitely too small as a guard to fully avoid this. In my testing this starts happening when the vector components are somewhere in the 1e-23 range (for reference, float.Epsilon is 1e-45).
Could it be processor-dependent? The person is running on a relatively old CPU (AMD Phenom II X4 955), and Vector2
operations' assembly leverages AVX instructions; this sounds kind of crazy but it's the only thing floating on my mind.
It might make sense as I've only seen two people experience this crash in the Celeste server.
Got bored and decided to make a visualization;
Yellow is input, green is expected and red is actual. (red vector's angle with the -Y axis is not to scale for demonstration sake)
if we do our own Normal calculation does that work for them? what the hell
return vector / MathF.Sqrt(vector.X * vector.X + vector.Y * vector.Y)
Will try tomorrow. :+1:
For those who are following this issue:
It turns out my theory wasn't in fact so crazy. A .NET Runtime bug causes the JIT to emit wrong ASM instructions when calling Vector2.Normalize(Vector2)
on CPUs which don't support AVX, making the normalization result incorrect. For example, on an AMD Phenom II X4 955, this causes the normalized vector's Y component to be +/- Infinity.
The solution here is to do a manual normalization.
I've forked Foster and made it fall back to manual normalization, if any component of the normalized vector is not finite.
I've also included some camera debug metrics in the top left if Celeste64 is built in debug mode.
I sent the built version to my tester (thank you Tanya!) and they confirmed it worked for them. I'll soon make a release on my fork including the fix.
EDIT: The release is out. I hope this'll help those affected until the .NET bug is fixed.
Wow, thanks for looking into this and figuring it out!
I guess at this point it's worth submitting findings to .NET runtime? We could add a hack to use manual normalization if it fails but I don't really want to do that on our end if it's something that can be resolved upstream... Unless .NET team decide it's not worth fixing for some reason.
@Popax21 said he'd make an issue soon, but not right now as he's currently pretty busy. I'd rather let him do the talking as I'm not exactly very qualified to be talking about this stuff.
It would probably be a good idea to figure out the minimum .NET version this bug was introduced in while I am still out of action. I'll try to prepare a writeup for a potential .NET bug report once I have the time for it.
This might be related to issues seen on ARM64 Release Mode: https://github.com/ExOK/Celeste64/issues/70#issuecomment-1933066532
The fork fell a bit out of date, so I pulled the changes and published the 1.1.1 release + a few extra commits; remembering to publish new platforms as well. (unfortunately I don't have any machines to test the releases on, so fingers crossed)
So this is CPU dependent? huh interesting(i have a phenom 2😭)
[...] For example, on an AMD Phenom X4 955, this causes the normalized vector's Y component to be +/- Infinity.
Whoops, I meant to say "AMD Phenom II X4 955", but just noticed I forgot to put the "II" in. Regardless, my Celeste64 fork should still make the game playable for you.
After spending a few hours trying to chase down this bug in the JIT source code, I found out that it's already known & fixed: https://github.com/dotnet/runtime/issues/96939.
Not sure if the fix is in the latest .NET 8 runtime, but it would definitely be worth a try to see if updating fixes this.
Yeah I think at this point it's worth it for me to just update everything and rebuild and see where we're at. I'll do that soon.