IntervalSets.jl
rev=true in searchsorted_interval
this also simplifies findall
no new tests added: rely on the extensive findall testsuite
Codecov Report
Patch coverage: 100.00% and project coverage change: +0.29% :tada:
Comparison is base (b3f0d6e) 98.81% compared to head (fa4cdbe) 99.10%. Report is 23 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #111 +/- ##
==========================================
+ Coverage 98.81% 99.10% +0.29%
==========================================
Files 3 4 +1
Lines 253 224 -29
==========================================
- Hits 250 222 -28
+ Misses 3 2 -1
| Files Changed | Coverage Δ |
|---|---|
| src/findall.jl | 100.00% <100.00%> (ø) |
Hm, that's... interesting.
Basically, the issue is that without lt = <, Base searchsorted functions use isless, which doesn't consider ±0 to be equal:
julia> searchsorted([-1, -0.0, 0.0, 1], 0)
3:3
julia> searchsorted([-1, -0.0, 0.0, 1], 0; lt=<)
2:3
As IntervalSets uses < for all comparisons, searchsorted_interval should be consistent with that. And indeed, with lt = < the results are always correct.
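The difference between the two orderings shows up directly on signed zeros; a quick REPL check (the outputs follow IEEE semantics):

```julia
julia> -0.0 < 0.0        # IEEE comparison, as used by IntervalSets: ±0 compare equal
false

julia> isless(-0.0, 0.0) # total order, the default for sort/searchsorted: -0.0 sorts before 0.0
true
```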
Performance for arrays stays the same with or without lt = <, but not for ranges: they special-case the default ordering (without lt = ...) and simply compute the resulting indices mathematically in O(1). With lt = <, however, the searchsorted functions fall back to the AbstractArray implementations, which take O(log N) steps.
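To illustrate the O(1) special case: for a unit range the answer follows from arithmetic alone. A rough sketch, not Base's actual code (searchsortedfirst_range is a hypothetical name):

```julia
# Hypothetical simplified sketch of the O(1) path for unit ranges:
# the first index i with r[i] >= x follows directly from arithmetic,
# no binary search needed.
searchsortedfirst_range(r::UnitRange{Int}, x::Int) =
    clamp(x - first(r) + 1, 1, length(r) + 1)

searchsortedfirst_range(2:8, 5)   # == 4, same as searchsortedfirst(2:8, 5)
```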
Not sure what's the best solution here.
To be consistent with in for intervals, searchsorted_interval has to use lt = < or do some postprocessing of the searchsorted results. It wasn't correct for ±0 before this PR. At the same time, this change somewhat degrades its performance for ranges.
For example, I can roll back the old findall implementation, while keeping searchsorted_interval as in this PR. Then, findall would behave exactly as before, and searchsorted_interval would become (i) correct and (ii) somewhat slower for ranges.
The updated implementation is both correct (findall tests pass, and they do check for zeros of both signs), and keeps the performance for ranges (Base searchsorted funcs are called without lt=< argument).
This PR also fixes https://github.com/JuliaMath/IntervalSets.jl/issues/100 btw.
Great! Could you add the following tests in test/findall.jl?
@testset "zero-step ranges" begin
    r = range(1, 1, 10)
    @test findall(in(1..3), r) == 1:10
    @test findall(in(2..3), r) |> isempty
end
Basically, the issue is that without lt = <, Base searchsorted functions use isless, which doesn't consider ±0 to be equal:
Thanks! I didn't know that property of searchsorted.
and keeps the performance for ranges (Base searchsorted funcs are called without lt=< argument).
Hmm, the performance is now much better than in the first benchmark, but there still seem to be some performance regressions:
On the current master
julia> using IntervalSets, BenchmarkTools
julia> @benchmark findall(∈(1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 0.020 ns … 0.031 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 0.020 ns ┊ GC (median): 0.00%
Time (mean ± σ): 0.021 ns ± 0.004 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
█▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▂
0.02 ns Histogram: frequency by time 0.03 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark findall(∈($(1..3)), $(2:8))
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 11.222 ns … 18.574 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 11.332 ns ┊ GC (median): 0.00%
Time (mean ± σ): 11.383 ns ± 0.229 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▅▃█ ▄
▂▂▁▂▁▂▁▂▂▂▂▁▄▁▇▁▃███▁█▁▂▁▂▂▁▁▁▁▁▁▁▁▂▁▃▁▅▂▂▆▁▅▁▃▁▂▂▂▂▁▂▁▅▁▄▃ ▃
11.2 ns Histogram: frequency by time 11.6 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
With this PR
julia> using IntervalSets, BenchmarkTools
julia> @benchmark findall(∈(1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.442 ns … 5.931 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.472 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.471 ns ± 0.097 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▄ █
██▂▃█▃▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▂▁▂▁▁▁▁▁▂▂▁▂▂▁▂▂▁▂▂▁▁▂▁▂▂▁▂▂▂ ▂
1.44 ns Histogram: frequency by time 1.82 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark findall(∈($(1..3)), $(2:8))
BenchmarkTools.Trial: 10000 samples with 927 evaluations.
Range (min … max): 112.177 ns … 1.880 μs ┊ GC (min … max): 0.00% … 93.46%
Time (median): 119.698 ns ┊ GC (median): 0.00%
Time (mean ± σ): 125.097 ns ± 87.030 ns ┊ GC (mean ± σ): 3.47% ± 4.66%
▂▄▅▅▅▇▇▇▇█▇▄▃▂▂▁ ▁ ▂
▆██████████████████▆▇▆▅▆▅▄▆▅▄▃▃▄▁▄▁▄▄▆▅▆▆▇▆▅▆▆▆▆▇███████▇▅▆▇ █
112 ns Histogram: log(frequency) by time 157 ns <
Memory estimate: 128 bytes, allocs estimate: 4.
Sorry, I have not taken the time to review. I have not yet checked the reasons for the performance regression.
Maybe the time difference is some benchmarking artifact. At least the timings definitely look suspicious, in both cases:
- findall(∈(1..3), 2:8) changing from 0.02 ns to 1.47 ns. The first timing is totally implausible, definitely an artifact. 1.5 ns is close to the fastest computation possible, even in principle.
- findall(∈($(1..3)), $(2:8)) being significantly slower than without dollars, for both IntervalSets versions. This is actually the first time I see $ making a benchmark slower. But it occurs with very simple Base code as well: @btime 1:2:1000 gives 2 ns, @btime 1:$(2):1000 gives 10 ns. Also, very surprising that allocations appear with dollars in the benchmark. I don't understand this at all.
Running both (old and new) versions manually in a loop takes 1.5 s for 10^9 iterations, so 1.5 ns per call - this agrees with btime without dollars for the new version. No allocations, of course.
Added the zero-step tests.
Anything else remains to be done here?..
I checked the benchmarks again, and it seems there are still some performance regressions with findall.
Before this PR
julia> using IntervalSets, BenchmarkTools
julia> i = 1..3
1..3
julia> r = 2:8
2:8
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 11.332 ns … 18.383 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 11.443 ns ┊ GC (median): 0.00%
Time (mean ± σ): 11.496 ns ± 0.203 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▃ ▃ ▂ ▇ █ ▂ ▂ ▁ ▁ ▂ ▂ ▆ ▅▂▅ ▃ ▂
▇▆▁▅▁██▁▃▁▁█▁█▁██▁█▁▆█▁█▁▆█▁█▁█▅▇█▁█▅█▇▁████▁▆▁▃▄▁▇▁▆▇▁▅▁▇█ █
11.3 ns Histogram: log(frequency) by time 11.7 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = 2.0:8.0
2.0:1.0:8.0
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 980 evaluations.
Range (min … max): 62.996 ns … 186.444 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 64.929 ns ┊ GC (median): 0.00%
Time (mean ± σ): 65.645 ns ± 5.061 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▆▇█▆▁▁▁▁▁ ▂ ▂
███████████▇▅▅▄▄▄▄▄▃▃▁▃▁▁▃▁▁▃▁▃▆▄█▁▄▃▁▃▄▄▁▁▄▄▃▆▅▄▅▃▃▄▃▅▁▅▄▅▅ █
63 ns Histogram: log(frequency) by time 99.6 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
After this PR
julia> using IntervalSets, BenchmarkTools
[ Info: Precompiling IntervalSets [8197267c-284f-5f27-9208-e0e47529a953]
julia> using IntervalSets, BenchmarkTools
julia> i = 1..3
1..3
julia> r = 2:8
2:8
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
Range (min … max): 64.330 ns … 1.845 μs ┊ GC (min … max): 0.00% … 94.39%
Time (median): 71.279 ns ┊ GC (median): 0.00%
Time (mean ± σ): 74.554 ns ± 56.606 ns ┊ GC (mean ± σ): 2.72% ± 3.43%
▂▆▇██▇▆▄▂▁▁▁▂▁ ▁▂▂▁▁ ▂
▂▃▃▂▂▂▃▂▂▇███████████████▆▅▅▄▅▅▄▅▂▅▄▆▆▆▄▅▅▆███████▇▆▅▆▅▆▆██ █
64.3 ns Histogram: log(frequency) by time 92.8 ns <
Memory estimate: 64 bytes, allocs estimate: 2.
julia> r = 2.0:8.0
2.0:1.0:8.0
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 994 evaluations.
Range (min … max): 32.466 ns … 62.311 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 32.909 ns ┊ GC (median): 0.00%
Time (mean ± σ): 33.192 ns ± 1.534 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▆█▄▆ ▁ ▂
█████▅▅▄▄▆▇▆▄▆▆▇▆▆▆▁▅▅▅▆▇▆▇▅▅▅▄▄▃▃▃▄▃▃▁▃▃▁▁▃▁▃▃▃▃▁▁▃▁▁▄▁▃▅█ █
32.5 ns Histogram: log(frequency) by time 42.6 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
If r is a range with integers, then the current master branch wins, but if r is a range with floats, then this PR wins.
findall(∈($(1..3)), $(2:8)) being significantly slower than without dollars, for both IntervalSets versions. This is actually the first time I see $ making a benchmark slower. But it occurs with very simple Base code as well: @btime 1:2:1000 gives 2 ns, @btime 1:$(2):1000 gives 10 ns. Also, very surprising that allocations appear with dollars in the benchmark. I don't understand this at all.
I also don't understand the behavior well, but I thought it would be better to use variables such as i and r to clarify the difference in the performances.
BenchmarkTools.jl documentation
Maybe it seems better to split this PR into
- Add a rev keyword argument to the searchsorted_interval function.
- Simplify the findall function.
to merge this PR quickly.
The searchsorted functions take O(log(n)) time, but the original findall function takes O(1) time.
Before this PR
julia> using IntervalSets, BenchmarkTools
julia> i = 1..3
1..3
julia> r = 2:8
2:8
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 11.332 ns … 37.027 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 11.442 ns ┊ GC (median): 0.00%
Time (mean ± σ): 11.741 ns ± 1.556 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█▅ ▁ ▁ ▁
██▄▄▄▃▃▁▄▅▄▃▄▄▅▄▆▅▃██▅▄▁▅▃▁▁▄▃▁▁▃▄▁▃▃▁▃▁▄▃▁▃▁▁▁▃▁▁▁▁▃▁▁▃▁▆█ █
11.3 ns Histogram: log(frequency) by time 21.7 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = -2000000:80000000
-2000000:80000000
julia> @benchmark findall(∈($i), $r) # Benchmark result remains mostly the same.
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 12.475 ns … 22.756 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 12.567 ns ┊ GC (median): 0.00%
Time (mean ± σ): 12.619 ns ± 0.280 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▅ ▅▁ ▄ █▂▄ ▂ ▃ ▅ ▆▄ ▁ ▃ ▂
▇█▁██▁▅▆▇▁▁▇█▁███▅▁██▁▇█▁█▁▁▁▁▁▁▁▃▁▄█▁██▁▇▇▄▇▁▄█▁█▁▁▃▁▃▃▁▁▄ █
12.5 ns Histogram: log(frequency) by time 12.8 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
After this PR
julia> using IntervalSets, BenchmarkTools
julia> i = 1..3
1..3
julia> r = 2:8
2:8
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 973 evaluations.
Range (min … max): 72.635 ns … 1.769 μs ┊ GC (min … max): 0.00% … 93.18%
Time (median): 74.240 ns ┊ GC (median): 0.00%
Time (mean ± σ): 77.692 ns ± 58.474 ns ┊ GC (mean ± σ): 2.70% ± 3.43%
▄██▇▇▆▄▃▂▁▁▁▁▁▁ ▁▁▁▁▁ ▁ ▂
▆███████████████▇▆▅▅▅▅▅▅▅▅▄▄▅▅▄▁▁▅▅▃▃▃▄█████████▇▇▇▆▆▅▆▆▇▇█ █
72.6 ns Histogram: log(frequency) by time 96.3 ns <
Memory estimate: 64 bytes, allocs estimate: 2.
julia> r = -2000000:80000000
-2000000:80000000
julia> @benchmark findall(∈($i), $r) # Takes more time than with small r.
BenchmarkTools.Trial: 10000 samples with 967 evaluations.
Range (min … max): 81.032 ns … 2.237 μs ┊ GC (min … max): 0.00% … 95.78%
Time (median): 84.948 ns ┊ GC (median): 0.00%
Time (mean ± σ): 90.104 ns ± 88.516 ns ┊ GC (mean ± σ): 4.38% ± 4.27%
▄▇▇█████▇▆▄▃▂▂▁ ▁ ▁ ▃
▆█████████████████▇▆▇▆▆▇▇▆▅▆▃▁▅▅▁▁▃▃▁▃▅▁▅▆▆████████▇▆▇▇▇██▇ █
81 ns Histogram: log(frequency) by time 116 ns <
Memory estimate: 96 bytes, allocs estimate: 4.
Maybe it seems better to split this PR into (i) adding a rev keyword argument to the searchsorted_interval function and (ii) simplifying the findall function, to merge this PR quickly.
"Quickly" is loooong gone already :)
And it makes sense to have a performant searchsorted_interval function, there's no real reason why it should be slower than the current findall. To my understanding, there is no slowdown currently in this PR - but if there is, performance should be fixed for both findall and searchsorted_interval.
The searchsorted functions take O(log(n)) time, but the original findall function takes O(1) time.
Not sure if I understand you here... Both searchsorted and searchsorted_interval definitely take O(1) time for ranges:
julia> @btime searchsorted_interval(r, i) setup=(r=-10:10; i=0..100)
2.206 ns (0 allocations: 0 bytes)
11:21
julia> @btime searchsorted_interval(r, i) setup=(r=-10^9:10^9; i=0..100)
2.206 ns (0 allocations: 0 bytes)
1000000001:1000000101
I checked the benchmarks again, and it seems there still be some performance degressions with findall.
As I said before, it does look like a BenchmarkTools artifact. For example, @btime findall(∈(i), r) setup=(i = 1..3; r = 2:8) takes 2 ns for me, either for ints or floats, either in current master or with this PR.
"Quickly" is loooong gone already :)
Sorry for that delay :smile:
Not sure if I understand you here... Both searchsorted and searchsorted_interval definitely take O(1) time for ranges:
The searchsorted family of functions uses binary search, so the computing time increases with the length of r:
julia> r = -2:8
-2:8
julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 4.919 ns … 14.187 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 4.959 ns ┊ GC (median): 0.00%
Time (mean ± σ): 4.963 ns ± 0.156 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▂▁▁▁▁▁▁▁▁▅▄▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▂▂▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▇▆▁▁▁▁▁▁▁▁▃ ▂
4.92 ns Histogram: frequency by time 4.98 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = -20:80
-20:80
julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 10.721 ns … 27.479 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 10.801 ns ┊ GC (median): 0.00%
Time (mean ± σ): 10.849 ns ± 0.465 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▆ █
▃█▁▇▂▁▂▂▂▂▁▆█▁▃▂▂▂▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▂▂▄▁▃▃▁▁▁▂▂▁▃▃▁▄▂ ▂
10.7 ns Histogram: frequency by time 11.1 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = -200:800
-200:800
julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
Range (min … max): 17.026 ns … 45.697 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 17.348 ns ┊ GC (median): 0.00%
Time (mean ± σ): 17.412 ns ± 0.465 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▄ ▂ █ ▆▆▃▁▁ ▂
█▁██▁▁▅███████▁▃▄▃▄▅▁▁▃▁▁▁▃▃▃▃▃▁▅▁▁▄▁▄▁▄▁▄▅▄▅▁▅▅▃▃▄▃▅▁▁▃▃▄▅ █
17 ns Histogram: log(frequency) by time 19.5 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = -2000:8000
-2000:8000
julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 996 evaluations.
Range (min … max): 24.282 ns … 48.344 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 24.474 ns ┊ GC (median): 0.00%
Time (mean ± σ): 24.548 ns ± 0.618 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▃▁▄█▂▅▂▂▂▂▂▁▁▂▂▂▂▁▂▁▁▂▂▁▁▂▂▁▁▂▂▂▂▂▂▁▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂ ▂
24.3 ns Histogram: frequency by time 27.7 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
I'm not sure how the compiler or BenchmarkTools.jl optimizes searchsorted_interval, but ideally, the performance should depend on the length of r.
The searchsorted family of functions uses binary search
searchsorted definitely doesn't use binary search for ranges by default, see around https://github.com/JuliaLang/julia/blob/24cb92d1e94367c5ef758545b69133ded4559763/base/sort.jl#L230. That's why I expect searchsorted_interval to be as performant as the current findall.
searchsorted definitely doesn't use binary search for ranges by default
Oh, that was my mistake. Thanks for the clarification.
(This was already explained in this comment https://github.com/JuliaMath/IntervalSets.jl/pull/111#issuecomment-1202197404, but I overlooked it.)
Well I don't think you should in the current state :) I only needed this functionality in one place, just monkeypatched it there - so no hurry.
The current explicit comparison with zero here is a hack. I think searchsorted_interval should just use searchsorted(lt=<) as it did in one of the commits. The only issue was slower performance on ranges (O(log(N)) instead of O(1)), and it'll be fixed in https://github.com/JuliaLang/julia/pull/50365.
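For reference, a minimal sketch of what delegating to searchsorted with lt = < could look like for a closed interval a..b (a hypothetical simplified helper, ignoring open endpoints and the rev keyword - not the PR's actual code):

```julia
# Hypothetical simplified sketch: indices i with a <= x[i] <= b,
# using lt = < so that ±0 are treated as equal.
searchsorted_closed(x, a, b) =
    searchsortedfirst(x, a; lt = <):searchsortedlast(x, b; lt = <)

searchsorted_closed([-1, -0.0, 0.0, 1], 0, 1)   # == 2:4, including -0.0
```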
Restored it to use lt=<. I guess what remains is to wait for that Julia PR, and then add
if VERSION < ...
    findall = old implementation
else
    findall = new implementation
end
To avoid the performance regression for ranges.
I don't really feel like crafting separate tests for all edge cases of searchsorted_interval(rev=false/true) - findall tests cover them already, and using searchsorted_interval in findall only makes sense.
To avoid the performance regression for ranges.
I personally feel we can ignore performance regressions in old Julia versions (if the regression is not that large), so it would be okay to merge this PR without the if VERSION < ... branch, to not increase the maintenance cost.
I don't really feel like crafting separate tests for all edge cases of searchsorted_interval(rev=false/true) - findall tests cover them already, and using searchsorted_interval in findall only makes sense.
No problem with this direction!
I will merge this PR after https://github.com/JuliaLang/julia/pull/50365 is merged.
The PR https://github.com/JuliaLang/julia/pull/50365 was merged, so I have merged this PR.
@aplavin Thank you for the PR and for your patience while waiting for the review & merge!
Thanks @hyrodium! Would be nice to release it as well. This PR also fixes a bug in interacting with Unitful:
julia> findall(∈(0..1), (-0.1:0.1:0.1)u"°")
ERROR: BoundsError # without this PR
2:3 # with this PR
Fundamentally, the issue is with Unitful (https://github.com/PainterQubits/Unitful.jl/issues/686), but we get the fix effectively as a bonus here :)