osu icon indicating copy to clipboard operation
osu copied to clipboard

Use span instead of string splitting in beatmap parsing

Open huoyaoyuan opened this issue 1 year ago • 7 comments

I've partially wrote a span-based version back in 2019. During the years, C# and BCL have improved and the syntax is simplified a lot. This PR eliminates most allocations from string.Split. There's still a lot of string allocations from StreamReader. I've also evaluated a prototype to manipulate the text in UTF-8 directly without converting the whole file into UTF-16. That would require a massive change around consuming. Since there are still some UTF-8 facilities like IUtf8SpanParsable coming in .NET 7 and 8, I think it's better to investigate the UTF-8 approach later. The work in this PR won't be abandoned since many span-related adjusts can still be used.

Performance Numbers

The improvement ratio of this PR is surprisingly hard to measure. Parsing on continuous memory is sensitive to memory alignment. Repeatedly running the same benchmark can results in 5% error or even more. I tried some approaches like memory randomization, but still can't get a more stable result. Thus, I abandoned some optimization like inline tuning since I can't see a consistent benefit. The following numbers are achieved on BenchmarkDotNet 0.13.5. Its results differs a lot with 0.13.4 (The current one in use).

BenchmarkDotNet=v0.13.5, OS=Windows 11 (10.0.22621.1555/22H2/2022Update/SunValley2)
13th Gen Intel Core i9-13900K, 1 CPU, 32 logical and 24 physical cores
.NET SDK=8.0.100-preview.3.23178.7
  [Host]     : .NET 6.0.16 (6.0.1623.17311), X64 RyuJIT AVX2
  Job-DAWWIJ : .NET 6.0.16 (6.0.1623.17311), X64 RyuJIT AVX2
  Job-KJBRBN : .NET 7.0.5 (7.0.523.17405), X64 RyuJIT AVX2
  Job-VRWABD : .NET 8.0.0 (8.0.23.17408), X64 RyuJIT AVX2

PR:

Method Runtime Mean Error StdDev Ratio RatioSD Gen0 Gen1 Allocated Alloc Ratio
BenchmarkBundledBeatmap .NET 6.0 1.859 ms 0.0202 ms 0.0189 ms 1.00 0.00 166.0156 56.6406 2.98 MB 1.00
BenchmarkBundledBeatmap .NET 7.0 1.612 ms 0.0177 ms 0.0148 ms 0.87 0.01 166.0156 152.3438 2.98 MB 1.00
BenchmarkBundledBeatmap .NET 8.0 1.553 ms 0.0246 ms 0.0230 ms 0.84 0.02 164.0625 146.4844 2.97 MB 1.00

master:

Method Runtime Mean Error StdDev Ratio Gen0 Gen1 Allocated Alloc Ratio
BenchmarkBundledBeatmap .NET 6.0 2.073 ms 0.0139 ms 0.0123 ms 1.00 207.0313 101.5625 3.78 MB 1.00
BenchmarkBundledBeatmap .NET 7.0 1.794 ms 0.0179 ms 0.0159 ms 0.87 208.9844 179.6875 3.78 MB 1.00
BenchmarkBundledBeatmap .NET 8.0 1.759 ms 0.0240 ms 0.0224 ms 0.85 208.9844 173.8281 3.77 MB 1.00

I also included later frameworks in benchmark to ensure the numbers aren't invalidated by framework optimizations. The improvement is 10%+ faster in speed (I've seen 17% improvement in certain round of benchmark) and 21% less allocation.

There are still other room of improvement, which I'll visit in other PRs. Here's a glance of allocation trace after this PR: image string are getting much less allocation ratio (previously top 2 of allocated types).

huoyaoyuan avatar Apr 22 '23 13:04 huoyaoyuan