osu
osu copied to clipboard
Use span instead of string splitting in beatmap parsing
I've partially wrote a span-based version back in 2019. During the years, C# and BCL have improved and the syntax is simplified a lot.
This PR eliminates most allocations from string.Split
. There's still a lot of string allocations from StreamReader
. I've also evaluated a prototype to manipulate the text in UTF-8 directly without converting the whole file into UTF-16. That would require a massive change around consuming. Since there are still some UTF-8 facilities like IUtf8SpanParsable
coming in .NET 7 and 8, I think it's better to investigate the UTF-8 approach later. The work in this PR won't be abandoned since many span-related adjusts can still be used.
Performance Numbers
The improvement ratio of this PR is surprisingly hard to measure. Parsing on continuous memory is sensitive to memory alignment. Repeatedly running the same benchmark can results in 5% error or even more. I tried some approaches like memory randomization, but still can't get a more stable result. Thus, I abandoned some optimization like inline tuning since I can't see a consistent benefit. The following numbers are achieved on BenchmarkDotNet 0.13.5. Its results differs a lot with 0.13.4 (The current one in use).
BenchmarkDotNet=v0.13.5, OS=Windows 11 (10.0.22621.1555/22H2/2022Update/SunValley2)
13th Gen Intel Core i9-13900K, 1 CPU, 32 logical and 24 physical cores
.NET SDK=8.0.100-preview.3.23178.7
[Host] : .NET 6.0.16 (6.0.1623.17311), X64 RyuJIT AVX2
Job-DAWWIJ : .NET 6.0.16 (6.0.1623.17311), X64 RyuJIT AVX2
Job-KJBRBN : .NET 7.0.5 (7.0.523.17405), X64 RyuJIT AVX2
Job-VRWABD : .NET 8.0.0 (8.0.23.17408), X64 RyuJIT AVX2
PR:
Method | Runtime | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|
BenchmarkBundledBeatmap | .NET 6.0 | 1.859 ms | 0.0202 ms | 0.0189 ms | 1.00 | 0.00 | 166.0156 | 56.6406 | 2.98 MB | 1.00 |
BenchmarkBundledBeatmap | .NET 7.0 | 1.612 ms | 0.0177 ms | 0.0148 ms | 0.87 | 0.01 | 166.0156 | 152.3438 | 2.98 MB | 1.00 |
BenchmarkBundledBeatmap | .NET 8.0 | 1.553 ms | 0.0246 ms | 0.0230 ms | 0.84 | 0.02 | 164.0625 | 146.4844 | 2.97 MB | 1.00 |
master:
Method | Runtime | Mean | Error | StdDev | Ratio | Gen0 | Gen1 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
BenchmarkBundledBeatmap | .NET 6.0 | 2.073 ms | 0.0139 ms | 0.0123 ms | 1.00 | 207.0313 | 101.5625 | 3.78 MB | 1.00 |
BenchmarkBundledBeatmap | .NET 7.0 | 1.794 ms | 0.0179 ms | 0.0159 ms | 0.87 | 208.9844 | 179.6875 | 3.78 MB | 1.00 |
BenchmarkBundledBeatmap | .NET 8.0 | 1.759 ms | 0.0240 ms | 0.0224 ms | 0.85 | 208.9844 | 173.8281 | 3.77 MB | 1.00 |
I also included later frameworks in benchmark to ensure the numbers aren't invalidated by framework optimizations. The improvement is 10%+ faster in speed (I've seen 17% improvement in certain round of benchmark) and 21% less allocation.
There are still other room of improvement, which I'll visit in other PRs. Here's a glance of allocation trace after this PR:
string
are getting much less allocation ratio (previously top 2 of allocated types).
It looks like InspectCode version is too low for C# 11, which is necessary.
More importantly decoding tests are failing.
OK it's a well known pitfall - do not compare span and string with ==
.
Could you please commit the benchmark?
Should I update InspectCode in this PR? The update to BenchmarkDotNet is just package version, should it be included?
Please do the inspectcode update in a separate pull unless you are sure (and have tested) that no new inspections will fall out.
Also there are new merge conflicts here.
Glad to see .NET 8 being adopted.
The parsing can be simplified more with .NET 8 and C# 12 feature. I'll revisit this, probably this month.