crystal icon indicating copy to clipboard operation
crystal copied to clipboard

Use simpler faster Rabin-Karp-like search for short needle

Open funny-falcon opened this issue 2 years ago • 13 comments

funny-falcon avatar Sep 20 '23 00:09 funny-falcon

I see some overlap with the implementations of #index(Char)/#byte_index(Char) which both already handle bytesizes smaller than 4. I figure those use cases could be merged with these optimized algorithms?

straight-shoota avatar Sep 20 '23 10:09 straight-shoota

This and #13819 could definitely share the same internal method, now that that PR is merged. The only caveat is that here the 4 bytes might represent more than 1 character

HertzDevil avatar Sep 23 '23 13:09 HertzDevil

I doubd it is feasible to refactor "fast RK" into separate method:

  • character position is counted in index(String), and it is not needed to be counted in byteindex(String)
  • index(Char) already uses per-char iteration which doesn't use Rabin-Karp. Does it worth to change it to RK?
  • at most byteindex(Char) and byteindex(String) could be unified for short strings, I believe.

funny-falcon avatar Oct 01 '23 10:10 funny-falcon

I suppose it's best to merge this as is then. We can try to refactor and deduplicate once it's in.

straight-shoota avatar Oct 05 '23 09:10 straight-shoota

Do you have any benchmarks to indicate the performance impact of this change?

Why did you chose UInt32 for the short needle? I figure we could use UInt64 (or UInt128) for search strings with up to 8 (16) bytes? There should be no performance degradation at least for 64-bit arithmetics on 64-bit architectures.

Bigger search strings would need SIMD support though (ref #3057).

straight-shoota avatar Oct 05 '23 09:10 straight-shoota

@straight-shoota good question about integer size.

If 32bit platforms are not first-class targets for Crystal, then I'd like to use UInt64 certainly.

UInt128 doubtfully would be good cause it is not native.

For benchmarks: I used to bench RK in C, and saw impact of double multiplication. But ok, I will do benchmark for this PR.

funny-falcon avatar Oct 06 '23 06:10 funny-falcon

32-bit targets are certainly getting less important. This is an optimization and I expect even with a 64-bit hash performance would still be reasonable on 32-bit architectures. So I think there's no reason not to go for 64-bits.

straight-shoota avatar Oct 06 '23 08:10 straight-shoota

I've pushed refactoring commit, and benchmarked.

I use UInt32 for chars and UInt64 for strings in "fast rabin-karp".

It is not fastest variant: using UInt32 for small strings and UInt64 for 5-8 charracter strings were faster, but then performance for 2-4 byte string and 5-8 byte string were the same. I couldn't explain it, so I just drop UInt32 for 2-4 byte strings.

Really, it was nightmare benchmarking: simple changes could affect result upto 16% without any meaningful reason :cry:

Benchmark results on Ryzen 7 5825U at 2GHz with this scripts

was idx now idx diff idx was bidx now bidx diff bidx
small 1 0.068663 0.061373 -10.62 0.071392 0.008054 -88.72
medium 1 0.503822 0.334602 -33.59 0.545443 0.010590 -98.06
big 1 4.697361 2.865704 -38.99 5.058999 0.027877 -99.45
small 2 0.069861 0.053189 -23.86 0.073254 0.034132 -53.41
medium 2 0.509033 0.336184 -33.96 0.548318 0.235357 -57.08
big 2 4.730817 3.117069 -34.11 5.062775 2.049732 -59.51
small 4 0.077225 0.058108 -24.75 0.079227 0.036833 -53.51
medium 4 0.510799 0.340637 -33.31 0.554923 0.242836 -56.24
big 4 4.702089 3.096297 -34.15 5.069510 2.052939 -59.50
small 6 0.078827 0.057426 -27.15 0.080145 0.038325 -52.18
medium 6 0.511529 0.339824 -33.57 0.559667 0.241253 -56.89
big 6 4.700362 3.093604 -34.18 5.076374 2.053248 -59.55
small 10 0.088149 0.077928 -11.60 0.084642 0.070408 -16.82
medium 10 0.517613 0.511492 -1.18 0.568815 0.555601 -2.32
big 10 4.699969 4.695120 -0.10 5.085927 5.074980 -0.22
small 14 0.089366 0.080371 -10.07 0.089447 0.072904 -18.49
medium 14 0.523864 0.511200 -2.42 0.575362 0.556926 -3.20
big 14 4.706731 4.701650 -0.11 5.092217 5.071227 -0.41
small 18 0.095913 0.082964 -13.50 0.098844 0.078447 -20.64
medium 18 0.549657 0.531382 -3.32 0.585257 0.561600 -4.04
big 18 4.744925 4.725939 -0.40 5.100282 5.076517 -0.47

funny-falcon avatar Oct 08 '23 14:10 funny-falcon

UInt128 didn't show difference with full Rabin-Karp, so I don't include it into final variant.

funny-falcon avatar Oct 08 '23 14:10 funny-falcon

May I rebase and squash commits?

funny-falcon avatar Oct 14 '23 07:10 funny-falcon

No, merge as usual

HertzDevil avatar Oct 14 '23 07:10 HertzDevil

@HertzDevil it’s pity: a lot of small commits without much sense. Personally I prefer cleaner history, so I tend to rearrange commits after review before merge. This way it goes at work, and in FOSS projects which lives outside of github.

But, ok. I calm down.

funny-falcon avatar Oct 14 '23 08:10 funny-falcon

Sorry for long delay.

Merged master into branch. Added commit with review fixes.

funny-falcon avatar Apr 18 '24 16:04 funny-falcon