Combinatorics.jl
Combinatorics.jl copied to clipboard
Improve performance of `derangement`/`subfactorial` with iterative implementation
This PR changes the implementation of derangement
, hence also subfactorial
,
to use the recursive formula !n = (n-1) * (!(n-1) + !(n-2))
presented here.
For values such as subfactorial(100)
I get a huge speed-up, like ~10x.
Codecov Report
Attention: 1 lines
in your changes are missing coverage. Please review.
Comparison is base (
e6888be
) 96.97% compared to head (afc071d
) 96.87%.
Files | Patch % | Lines |
---|---|---|
src/factorials.jl | 91.66% | 1 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #146 +/- ##
==========================================
- Coverage 96.97% 96.87% -0.10%
==========================================
Files 7 7
Lines 728 737 +9
==========================================
+ Hits 706 714 +8
- Misses 22 23 +1
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
With this change I pass from
julia> @benchmark subfactorial(100)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 185.527 μs … 47.124 ms ┊ GC (min … max): 0.00% … 60.54%
Time (median): 189.635 μs ┊ GC (median): 0.00%
Time (mean ± σ): 267.191 μs ± 1.711 ms ┊ GC (mean ± σ): 14.89% ± 2.31%
▆█▇▆▆▅▄▃▂▃▃▃▃▂▁▁▁▂▁▁▁ ▂
█████████████████████▇▇▇▅▆▇▅▄▆▆█▇█▇▇▇▇▇▆▆▅▅▃▅▄▄▃▄▄▃▁▄▁▁▄▃▃▁▃ █
186 μs Histogram: log(frequency) by time 273 μs <
Memory estimate: 81.06 KiB, allocs estimate: 2931.
to
julia> @benchmark subfactorial(100)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 12.113 μs … 49.734 ms ┊ GC (min … max): 0.00% … 60.39%
Time (median): 15.437 μs ┊ GC (median): 0.00%
Time (mean ± σ): 29.177 μs ± 762.966 μs ┊ GC (mean ± σ): 25.19% ± 0.96%
▂▅▆█▂ ▂▄ ▄▅▁
▁▁▁▁▁▁▂▄█████▆▅███▅▅▃▅███▆▅▅▄▃▃▂▂▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
12.1 μs Histogram: frequency by time 24.2 μs <
Memory estimate: 14.09 KiB, allocs estimate: 499.
Changes
- I replaced the recursive formula with the simpler
!n = n * !(n-1) + (-1)^n
. - I replaced all operations on
BigInt
s with in-place functions fromBase.GMP.MPZ
to reduce allocations.
Consequences The performance improved by another factor ~10x:
julia> @benchmark subfactorial(100)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
Range (min … max): 1.427 μs … 137.132 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.480 μs ┊ GC (median): 0.00%
Time (mean ± σ): 1.538 μs ± 1.549 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▃█▆▆▄▂ ▂
██████▆▃▃▃▅██▇▆▆▅▃▄▄▄▁▃▄▄▃▁▄▃▃▁▄▄▁▁▄▁▁▃▁▁▃▄▁▃▁▁▁▁▃▃▁▁▃▁▁▁▃▅ █
1.43 μs Histogram: log(frequency) by time 3.05 μs <
Memory estimate: 112 bytes, allocs estimate: 11.