Use nanosleep for non-Windows platforms
- Supersedes https://github.com/ppy/osu-framework/pull/6391
This PR lets Unix platforms use nanosleep instead of Thread.Sleep to improve on frame pacing. I will test on Android after some time, but I don't have any Apple devices to test this on.
I put UnixNativeSleep in Platform/Linux/Native, but I don't think it's the best fit... Do I need to create a new folder, or is it good as is?
Testing:
- [x] Windows
- [ ] macOS
- [x] Linux
- [ ] iOS
- [ ] Android
So do we just try using SDL_DelayNS again? I would prefer using SDL_DelayNS for Unix platforms as targeting different libc/arch in C# doesn't sound good for me, but I also agree that using an SDL function in Clock doesn't match well either (might be good enough with the comment as Susko3 mentioned).
Where's the testing/benchmarking at? I'd want to see some kind of proof this is beneficial before blindly implementing.
I'd hope this is done by the implementor as to not take time away from the core team.
You can also see that update thread has (almost) constant 1.2ms frame time on current release, while this implementation keeps it at 1ms. These were taken on Linux, and YMMV as not every distro has same configuration, but it works pretty well on my environment.
| Current release | This PR |
|---|---|
So do we just try using SDL_DelayNS again?
If this diff can't work then I'd rather have nothing than sdl in threading code paths.
You can also see that update thread has (almost) constant 1.2ms frame time on current release, while this implementation keeps it at 1ms. These were taken on Linux, and YMMV as not every distro has same configuration, but it works pretty well on my environment.
Current release This PR
![]()
I would prefer if this was benchmarked properly using BenchmarkDotNet to provide reproducible empirical results if there's any benefits. These screenshots won't do.
0.5ms duration:
| Method | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|
| TestThreadSleep | 629.9 ns | 7.74 ns | 6.86 ns | - |
| TestNativeSleep | 503,293.3 ns | 26.55 ns | 23.54 ns | - |
1ms duration:
| Method | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|
| TestThreadSleep | 1.054 ms | 0.0001 ms | 0.0001 ms | - |
| TestNativeSleep | 1.004 ms | 0.0000 ms | 0.0000 ms | - |
1.5ms duration:
| Method | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|
| TestThreadSleep | 1.054 ms | 0.0002 ms | 0.0002 ms | - |
| TestNativeSleep | 1.503 ms | 0.0001 ms | 0.0001 ms | - |
I think Thread.Sleep is limited to millisecond precision even if TimeSpan is given.
Also, I assumed screenshots were enough to show near 0.00ms frame time stddev, but I agree that this is more clear.
Additional frametime graphs
- This PR is essentially https://github.com/ppy/osu-framework/pull/5501 but for all other systems. The benefits listed there should apply to this PR.
- I scaled
visible_ms_rangedown inFrameStatisticsDisplay.csto highlight the decrease in jitter as a smoother graph.
Windows 11
| Current release | This PR |
|---|---|
Test refactored Windows logic. Change is within margin of error.
Linux (Fedora 41)
| Current release | This PR |
|---|---|
The threads are sleeping for a more accurate amount of time leading to less gaps in the frametime graph and a smoother graph. The Thread.Sleep() method currently used is limited to integer millisecond level sleeping (as shown in the benchmark above) and a 1kHz loop requires sub millisecond timing. I observed no CPU utilization difference for this PR.
macOS
| Current release | This PR |
|---|---|
I had a limited amount of time with a MacBook so that's why these graphs are more rough looking than the other two. Even though the graphs look different, the graph for this PR looks smoother overall.
iOS
| Current release | This PR |
|---|---|
Seems to work as expected in lowering the jitter.
Is there a reason why there's (what looks to be) gaps in the old graphs?
Is there a reason why there's (what looks to be) gaps in the old graphs?
The gaps are an artifact of trying to sleep for less than a millisecond returning a tiny value. Applying this patch makes very small sleep times yellow and increases the height to be more visible.
diff --git a/osu.Framework/Graphics/Performance/FrameStatisticsDisplay.cs b/osu.Framework/Graphics/Performance/FrameStatisticsDisplay.cs
index f752c88c5..403138d70 100644
--- a/osu.Framework/Graphics/Performance/FrameStatisticsDisplay.cs
+++ b/osu.Framework/Graphics/Performance/FrameStatisticsDisplay.cs
@@ -38,7 +38,7 @@ internal partial class FrameStatisticsDisplay : Container, IStateful<FrameStatis
private const int amount_count_steps = 5;
private const int amount_ms_steps = 5;
- private const float visible_ms_range = 20;
+ private const float visible_ms_range = 0.1f;
private const float scale = HEIGHT / visible_ms_range;
private const float alpha_when_active = 0.75f;
@@ -490,6 +490,12 @@ private int addArea(FrameStatistics frame, PerformanceCollectionType? frameTimeT
Color4 col = frameTimeType.HasValue ? getColour(frameTimeType.Value) : new Color4(0.1f, 0.1f, 0.1f, 1);
+ if (frameTimeType.GetValueOrDefault() == PerformanceCollectionType.Sleep && drawHeight < 10)
+ {
+ col = Color4.Yellow;
+ drawHeight += HEIGHT / 2;
+ }
+
for (int i = currentHeight - 1; i >= 0; --i)
{
if (drawHeight-- == 0) break;
Linux
master (with above patch applied) |
This PR (with above patch applied) |
|---|---|
Right, so it's showing precisely what it should be based on how it was sleeping. All good.
Interesting. Consider reporting to https://github.com/dotnet/runtime/issues to improve the performance of Thread.Sleep. You may also get feedback around reliability/compatibility there.
I finally got around to writing the issue in dotnet/runtime, but it seems that there was already a similar discussion going on there. I guess we can keep our implementation for now?
Due to all of that, this isn't something I can see getting prioritized for .NET 10. However, there is nothing preventing a 3rd party library from being created that provides the functionality in the meantime.
