AbstractAlgebra.jl icon indicating copy to clipboard operation
AbstractAlgebra.jl copied to clipboard

Performance of universal polynomials / MPolys exponent_vectors

Open karlwessel opened this issue 9 months ago • 16 comments

The method exponent_vectors is called at several places:

  • comparison of polynomials
  • hashing of polynomials
  • collecting variables occurring in a polynomial
  • others?

Each time it allocates a new array of dynamic size. This adds quickly up in runtime when code uses a lot of simple comparision. Especially for UnivPoly which can have a lot of variables.

Is it possible to improve the performance somehow?

Some ideas that all have disadvantages:

  1. Use Memory instead of Array. This is faster when allocating unitialized memory. Needs Julia 1.11
  2. Use StaticArrays. This is only possible if there is a statically known reasonable maximum number of elements an exponent vector can have.
  3. Cache results of exponent_vectors. There needs to be a robust mechanism which invalidates the cache when the exponents change. Improves performance only if there are actual cache hits.
  4. Reduce calls to exponent_vectors. E.g. comparing whether two generators are the same might be easier than comparing if two generic polynomials are the same. But not sure if there are improvements possible.

karlwessel avatar Mar 05 '25 13:03 karlwessel

To clarify: In my use case I only want to hash, compare or list generators. So if there are improvements possible for those special cases of polynomials it would be helpful for me.

karlwessel avatar Mar 05 '25 14:03 karlwessel

Another possible performance improvement would be to only "cache" the memory allocated for the exponent vector in the struct of the polynomial. This way the cache only needs to be invalidated when size of the exponent vector changes. This however is harder to be made threadsafe.

karlwessel avatar Mar 05 '25 14:03 karlwessel

Can you specifiy which polynomials we are talking about? Equality of multivariate polynomials in Nemo usually calls directly into flint and there is no exponent_vectors call.

thofma avatar Mar 05 '25 14:03 thofma

UnivPoly, although that one is part of AbstractAlgebra, it does use the MPoly from this library.

karlwessel avatar Mar 05 '25 14:03 karlwessel

Can you please post a full example to reproduce the slow equality test for MPolys?

thofma avatar Mar 05 '25 15:03 thofma

Here is one for the equality test of generators. It compares the speed for generators x from a UnivPoly to the speed for generators y from an MPoly. The benchmark for comparing two symbols serves as the lower bound on how long it could take.

julia> using Nemo

Welcome to Nemo version 0.49.0

Nemo comes with absolutely no warranty whatsoever

julia> using BenchmarkTools

julia> R10, ys10 = polynomial_ring(QQ, [Symbol("y$i") for i in 1:10])
(Multivariate polynomial ring in 10 variables over QQ, QQMPolyRingElem[y1, y2, y3, y4, y5, y6, y7, y8, y9, y10])

julia> R20, ys20 = polynomial_ring(QQ, [Symbol("y$i") for i in 1:20])
(Multivariate polynomial ring in 20 variables over QQ, QQMPolyRingElem[y1, y2, y3, y4, y5, y6, y7, y8, y9, y10, y11, y12, y13, y14, y15, y16, y17, y18, y19, y20])

julia> Ru = universal_polynomial_ring(QQ)
Universal Polynomial Ring over Rational field

julia> xs10 = [gen(Ru, Symbol("x$i")) for i in 1:10]
10-element Vector{AbstractAlgebra.Generic.UnivPoly{QQFieldElem}}:
 x1
 x2
 x3
 x4
 x5
 x6
 x7
 x8
 x9
 x10

julia> x, y = first.([xs10, ys10])
2-element Vector{RingElem}:
 x1
 y1

julia> @benchmark :x == :x
BenchmarkTools.Trial: 10000 samples with 1000 evaluations per sample.
 Range (min … max):  1.951 ns … 33.042 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.040 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.125 ns ±  0.627 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▃           █                           ▃                  
  ██▂▁▁▁▁▁▁▁▁▁▅█▅▂▁▁▂▁▁▁▂▂▂▂▆▅▂▁▂▁▁▁▁▁▁▁▁▁▆█▃▂▁▂▁▁▁▁▁▂▂▂▁▂▅▆ ▃
  1.95 ns        Histogram: frequency by time        2.32 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark y == y
BenchmarkTools.Trial: 10000 samples with 989 evaluations per sample.
 Range (min … max):  33.834 ns … 568.265 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     35.327 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   38.154 ns ±  13.567 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅▇█▄▃▁▃▂▂▁▂▁▁▁  ▁  ▁           ▁                             ▂
  █████████████████████▇▇▇▆▇██▇████▇▆▆▆▆▇▇▇▆▆▆▄▁▅▄▄▄▅▄▅▄▄▁▄▁▄▄ █
  33.8 ns       Histogram: log(frequency) by time      76.1 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark x == x
BenchmarkTools.Trial: 10000 samples with 482 evaluations per sample.
 Range (min … max):  218.834 ns … 933.672 μs  ┊ GC (min … max):  0.00% … 65.59%
 Time  (median):     225.392 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   449.474 ns ±  11.744 μs  ┊ GC (mean ± σ):  27.40% ±  1.12%

  █▅▄▄▃▂▂▁ ▁       ▁▂▂▁▁                                        ▁
  ████████████▆▇▆▆███████▆▇▅▄▅▆▆▅▆▇▇▆▆█▇▇▆▇▆▆▆▆▅▅▅▅▅▅▃▄▅▃▃▃▄▅▄▃ █
  219 ns        Histogram: log(frequency) by time        642 ns <

 Memory estimate: 192 bytes, allocs estimate: 6.

julia> xs20 = [gen(Ru, Symbol("x$i")) for i in 1:20]
20-element Vector{AbstractAlgebra.Generic.UnivPoly{QQFieldElem}}:
 x1
 x2
 x3
 x4
 x5
 x6
 x7
 x8
 x9
 x10
 x11
 x12
 x13
 x14
 x15
 x16
 x17
 x18
 x19
 x20

julia> x, y = first.([xs20, ys20])
2-element Vector{RingElem}:
 x1
 y1

julia> @benchmark y == y
BenchmarkTools.Trial: 10000 samples with 989 evaluations per sample.
 Range (min … max):  33.832 ns … 590.576 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     35.686 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   39.926 ns ±  17.521 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅█▆▅▃▃▃▂▂▁▁▂▁▁  ▁     ▁  ▁▁                                  ▁
  █████████████████▇██▇███████▇▇█▆▇▆▇▆▆▆▆▆▅▇▆▆▅▅▄▅▅▄▄▅▄▅▄▅▄▃▄▅ █
  33.8 ns       Histogram: log(frequency) by time      85.1 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark x == x
BenchmarkTools.Trial: 10000 samples with 337 evaluations per sample.
 Range (min … max):  248.947 ns … 264.536 μs  ┊ GC (min … max):  0.00% … 59.56%
 Time  (median):     274.015 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   552.424 ns ±   6.393 μs  ┊ GC (mean ± σ):  26.31% ±  2.35%

  ▇██▇▅▅▄▂▂▁ ▁ ▁▁▃▃▃▂▂▁▁▂▂▂▃▃▂▁▁                                ▂
  ████████████▇██████████████████▇▇▇▇▇▇▆▆▆▆▅▆▃▅▅▅▅▆▅▅▅▄▄▅▅▅▃▅▅▄ █
  249 ns        Histogram: log(frequency) by time        809 ns <

 Memory estimate: 352 bytes, allocs estimate: 6.

Comparing two generators x from a UnivPoly is more than 5 times slower than comparing two generators y from a normal MPoly. ~~And it might be that the time needed for UnivPoly grows with the number of generators.~~

Comparing two generators from an MPoly is 10 times slower than comparing two symbols. While that might not be the fairest comparison it gives a lower bound on how fast that kind of operation could be.

karlwessel avatar Mar 06 '25 09:03 karlwessel

It does not get slower for Polynomial Rings with more generators:

julia> R100, ys100 = polynomial_ring(QQ, [Symbol("y$i") for i in 1:100])
(Multivariate polynomial ring in 100 variables over QQ, QQMPolyRingElem[y1, y2, y3, y4, y5, y6, y7, y8, y9, y10  …  y91, y92, y93, y94, y95, y96, y97, y98, y99, y100])

julia> xs100 = [gen(Ru, Symbol("x$i")) for i in 1:100]
100-element Vector{AbstractAlgebra.Generic.UnivPoly{QQFieldElem}}:
 x1
 x2
 x3
 x4
 x5
 x6
 x7
 x8
 x9
 x10
 x11
 x12
 x13
 x14
 x15
 x16
 x17
 x18
 x19
 x20
 ⋮
 x82
 x83
 x84
 x85
 x86
 x87
 x88
 x89
 x90
 x91
 x92
 x93
 x94
 x95
 x96
 x97
 x98
 x99
 x100

julia> x, y = first.([xs100, ys100])
2-element Vector{RingElem}:
 x1
 y1

julia> @benchmark y == y
BenchmarkTools.Trial: 10000 samples with 988 evaluations per sample.
 Range (min … max):  33.852 ns … 210.818 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     35.328 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   37.085 ns ±   6.570 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▄▆▇█▄▃▂▁▁▂▂▁▁                                                ▁
  █████████████▇▇▇█▇▇██▇▆▇▇▆▆▇▇█▆▄▆▅▇▆▄▅▆▆▇▆▄▆▆▇▆█▇▆▄▂▄▄▄▄▅▅▅▅ █
  33.9 ns       Histogram: log(frequency) by time      62.1 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark x == x
BenchmarkTools.Trial: 10000 samples with 244 evaluations per sample.
 Range (min … max):  277.463 ns … 492.389 μs  ┊ GC (min … max):  0.00% … 72.41%
 Time  (median):     524.262 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   789.157 ns ±  10.656 μs  ┊ GC (mean ± σ):  26.88% ±  2.02%

   █▃               ▆▁ ▂                                         
  ▄███▄▃▂▂▂▁▂▂▂▁▁▁▂█████▄▄▄▅▄▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  277 ns           Histogram: frequency by time          1.1 μs <

 Memory estimate: 512 bytes, allocs estimate: 6.

karlwessel avatar Mar 06 '25 09:03 karlwessel

Thanks for the clarification. The universal polynomial comparison is just missing a

if parent(data(a)) === parent(data(b))
  return data(a) == data(b)
end

right at the top. That should make it as fast as the MPoly version in your examples.

(One could also think about calling univ_promote before, but that would be a bit more intrusive.)

thofma avatar Mar 06 '25 09:03 thofma

But there are cases when two generators from the same UnivPoly can be the same, while having different parent, right? At least that's what I already observed I think.

karlwessel avatar Mar 06 '25 09:03 karlwessel

Here is the case where the generators are the same, but ~~it can't compare their data against each other~~ the parents of their data differ:

julia> using Nemo

Welcome to Nemo version 0.49.0

Nemo comes with absolutely no warranty whatsoever

julia> R = universal_polynomial_ring(QQ)
Universal Polynomial Ring over Rational field

julia> x = gen(R, :x)
x

julia> x2 = (x-1)+1
x

julia> data(x2) == data(x)
true

julia> gen(R, :y)
y

julia> x = gen(R, :x)
x

julia> data(x2) == data(x)
ERROR: parents do not match
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] check_parent
   @ ~/.julia/packages/AbstractAlgebra/lYBCD/src/AbstractAlgebra.jl:221 [inlined]
 [3] check_parent
   @ ~/.julia/packages/AbstractAlgebra/lYBCD/src/AbstractAlgebra.jl:220 [inlined]
 [4] ==(a::QQMPolyRingElem, b::QQMPolyRingElem)
   @ Nemo ~/.julia/packages/Nemo/VL5Dv/src/flint/fmpq_mpoly.jl:415
 [5] top-level scope
   @ REPL[10]:1

julia> x2 == x
true

julia> parent(x2) == parent(x)
true

karlwessel avatar Mar 06 '25 10:03 karlwessel

Here is a shorter MWE (sorry for the spam):

julia> using Nemo

Welcome to Nemo version 0.49.0

Nemo comes with absolutely no warranty whatsoever

julia> R = universal_polynomial_ring(QQ)
Universal Polynomial Ring over Rational field

julia> xold = gen(R, :x)
x

julia> y = gen(R, :y)
y

julia> xnew = gen(R, :x)
x

julia> xold == xnew
true

julia> data(xold) == data(xnew)
ERROR: parents do not match
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] check_parent
   @ ~/.julia/packages/AbstractAlgebra/lYBCD/src/AbstractAlgebra.jl:221 [inlined]
 [3] check_parent
   @ ~/.julia/packages/AbstractAlgebra/lYBCD/src/AbstractAlgebra.jl:220 [inlined]
 [4] ==(a::QQMPolyRingElem, b::QQMPolyRingElem)
   @ Nemo ~/.julia/packages/Nemo/VL5Dv/src/flint/fmpq_mpoly.jl:415
 [5] top-level scope
   @ REPL[9]:1

karlwessel avatar Mar 06 '25 10:03 karlwessel

We could optimize the comparison of those special elements (the "generators"), but for arbitrary elements it will still be slow. Would that help?

thofma avatar Mar 07 '25 14:03 thofma

Yes, that would definetly help for my use case. But dont' feel pressed to optimize it, I only used it for some experimental code and not for any really necessary calculations.

This is more about documenting the possible bottleneck, in case there is an easy fix or more people need it for performance reasons.

karlwessel avatar Mar 07 '25 14:03 karlwessel

What is your use case? Often adding new variables? What is slow in the current implementation is when a new variable is added and then "new elements" are compared with "old element". Old and new refers to "before new variable" and "after new variable".

thofma avatar Mar 07 '25 14:03 thofma

I used it as a backend for a "Polyform" based CAS System.

A more realistic use case might be replacing DynamicPolynomials.jl as backend for SymbolicUtils Polyform to implement missing features for Polynomials in Symbolics.jl.

In both cases non-polynomial terms are represented as new variables in the Polynomial and therefor the number of variables quickly grows and changes when terms are substituted or terms reordered. And for differentiation you need to check which terms depend on the variable to differentiate for, which are typically generators of the polynomial.

karlwessel avatar Mar 07 '25 15:03 karlwessel

I added some of the improvements over at https://github.com/Nemocas/AbstractAlgebra.jl/pull/2028. Should make comparison of "generators" much quicker.

thofma avatar Mar 08 '25 09:03 thofma