osu-framework
osu-framework copied to clipboard
Add option to neglect non-space characters such as diacritics to SearchContainer
This pull request addresses issue https://github.com/ppy/osu/issues/19591.
With this implementation diacritics are neglected altogether if NonSpaceCharacterMatching is set to false.
I think almost always diacritics should be neglected but maybe there are cases when it matters.
I do wonder what kind of a performance impact this (presumably) more expensive comparison has.
As a safer alternative IgnoreNonSpace can be by default set to false and changed to true only if needed.
I think benchmarking would be a good idea. See the existing benchmarks in the osu.Framework.Benchmarks project.
I have some difficulties making SearchContainer work in benchmark environment.
So instead I copy pasted checkTerm function and worked with it for now.
Searching 1000 random 20 letter strings in 1000 letter string gave results:
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
Intel Core i7-9750H CPU 2.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET SDK=6.0.400
[Host] : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT DEBUG
DefaultJob : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT
| Method | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|
| SearchIgnoreNonSpace | 62.268 ms | 2.0146 ms | 5.8126 ms | 60 B |
| SearchWithNonSpace | 5.590 ms | 0.1116 ms | 0.2736 ms | 4 B |
| SearchNonContiguousIgnoreNonSpace | 83.602 ms | 2.0619 ms | 5.9819 ms | 80 B |
| SearchNonContiguousWithNonSpace | 5.254 ms | 0.1006 ms | 0.2100 ms | 4 B |
Difference is rather high but the question is, is it acceptable difference.
I can try making SearchContainer work but I don't know if it is worth.
I think it would be beneficial to have it benchmarked in context, using SearchContainer. Using GameBenchmark and benchmarking RunSingleFrame() should work (look at existing benchmarks).
You can use Params to set IgnoreNonSpace, etc.
If you're unable to make SearchContainer work, then make checkTerm internal and use that in the benchmark instead of copying code. (Commit the benchmark to this PR as well.)
Thanks for suggestion, I guess I was on wrong track last time.
Now I get these results:
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
Intel Core i7-9750H CPU 2.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET SDK=6.0.400
[Host] : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT DEBUG
DefaultJob : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT
| Method | AllowNonContiguous | IgnoreNonSpace | Mean | Error | StdDev | Median | Allocated |
|---|---|---|---|---|---|---|---|
| SearchRandomStrings | False | False | 84.42 μs | 4.811 μs | 13.648 μs | 80.80 μs | 2 KB |
| SearchRandomStrings | False | True | 92.62 μs | 7.832 μs | 22.470 μs | 82.90 μs | 2 KB |
| SearchRandomStrings | True | False | 81.29 μs | 3.510 μs | 9.900 μs | 77.50 μs | 2 KB |
| SearchRandomStrings | True | True | 83.45 μs | 4.696 μs | 13.474 μs | 78.30 μs | 2 KB |
These results were strange to me but I checked whether Update() and checkTerm() are being used and they were.
However, critical code checking is needed if I did something wrong.