gap icon indicating copy to clipboard operation
gap copied to clipboard

Performance problem with creating group actions on cosets

Open ThomasBreuer opened this issue 3 years ago • 3 comments

@frankluebeck and I have observed that computing the permutation action of a group on the right cosets of a subgroup of large index is very slow in GAP, compared with Magma.

As an example, take the sporadic simple Rudvalis group as a permutation group on 4060 points, and compute the action on the cosets of a maximal subgroup of the structure L2(25).2^2 (order 31200, index 4677120). In GAP, a natural way to formulate this is as follows.

gap> g:= AtlasGroup( "Ru" );
<permutation group of size 145926144000 with 2 generators>
gap> NrMovedPoints( g );
4060
gap> u:= AtlasSubgroup( g, 7 );
<permutation group of size 31200 with 2 generators>
gap> act:= FactorCosetAction( g, u );;
gap> Runtime();  # in milliseconds
7460501

The same in Magma looks as follows. (Yes, the numbering of maximal subgroups in GAP and Magma differs. The 7th class in GAP --ordered by decreasing subgroup order-- corresponds to the 8th class in Magma.)

Magma V2.27-3     Fri Sep  9 2022 10:57:59 on schedir  [Seed = 3083850902]
Type ? for help.  Type <Ctrl>-D to quit.
> info:= AtlasGroup("Ru");
> G:= PermutationGroup(info);
> G;
Permutation group G acting on a set of cardinality 4060
Order = 2^14 * 3^3 * 5^3 * 7 * 13 * 29
> mx:= MaximalSubgroups(G);
> U:= mx[8]`subgroup;
> act:= CosetAction(G, U);
> act;
Mapping from: GrpPerm: G to GrpPerm: $, Degree 4677120, Order 2^14 * 3^3 * 5^3 * 7 * 13 * 29
> quit;

Total time: 12.580 seconds, Total memory usage: 710.16MB

We see that Magma is several orders of magnitude faster than GAP.

ThomasBreuer avatar Sep 09 '22 11:09 ThomasBreuer

This is not so much a problem with actions, but in creating the transversal (this is probably a bad example for the algorithm used there with little reduction to shorter orbits being possible.)

t:=RightTransversal(g,u);

takes most of the time, then

act:=ActionHomomorphism(g,t,OnRight,"surjective");;
Image(act,g);

(which is all that FactorCosetAction does) is significantly quicker. (Not as fast as I would like, its probably also the lookup in the transversal). If someone in interested in updating the code -- it is both an old algorithm and an old (25 years+) implementation.

hulpke avatar Sep 09 '22 15:09 hulpke

Thanks @hulpke. I think we have (at least) two performance problems. The first is in RightTransversal, the second is in computing the permutation action of a generator from the transversal data. Each of these steps needs more time than the whole computation in Magma.

Concerning RightTransversal, if I interpret the profiling output correctly then most of the time is spent in AddCosetInfoStabChain, and inside this function, most of the time is spent in ForAll calls. (An obvious improvement would be to avoid creating a new function for each of the millions of calls, but this yields only a few percents of speedup.)

ThomasBreuer avatar Sep 15 '22 14:09 ThomasBreuer

@ThomasBreuer yes, though it should be noted that these performance problems are likely from the fact that the method used does not work well for this case (maximal subgroup of large index, acting transitively)

hulpke avatar Sep 16 '22 13:09 hulpke

Just to say that I think other factors are at play, too, e.g. the lack of fast hash tables in core GAP (we do have them in datastructures and orb, but not used in the GAP core; perhaps they need to migrate there?).

To elaborate on that, let me point out that the subgroup u here happens to be the stabilizer of the set pt := [ 1992, 3607 ]:

gap> g:= AtlasGroup( "Ru" );;
gap> u:= AtlasSubgroup( g, 7 );;
gap> pt := [ 1992, 3607 ];;
gap> s:=Stabilizer(g, pt, OnSets); time;
<permutation group of size 31200 with 7 generators>
64
gap> s = u;
true

Thus computing the orbit of this set under g plus a transversal would also essentially give us the cosets of u: I am not proposing this as a solution, but rather I am conducting an experiment: the time to compute this orbit should be a kind of lower bound to computing the transversal. Well, when I ask GAP to compute this orbit, it takes >16 minutes:

gap> OrbitLength(g, pt, OnSets);
971525

This is slow because it essentially uses AddSet(allpoints, newpoints) to keep track of the elements of the orbit so far. We can thus speed things up considerably by specifying a domain:

gap> dom:=Combinations([1..NrMovedPoints(g)],2);; time;
2671
gap> MakeImmutable(dom);; IsSet(dom);
true
gap> OrbitLength(g, dom, pt, OnSets); time;
4677120
31383

So that's a total of ~34 seconds.

Now let's compare with this trivial orbit algorithm implementation in Julia, using a hash set:

function bahn(pt::T, generators, action) where T
    orb = [pt]
    dict = Dict(pt => 1)
    for b in orb
        for g in generators
            c = action(b, g)::T
            x = get(dict, c, nothing)
            if x === nothing
                push!(orb, c)
                dict[c] = length(orb)
            end
        end
    end
    return orb
end

Then I get:

julia> g = atlas_group("Ru")
<permutation group of size 145926144000 with 2 generators>

julia> u = atlas_subgroup(g, 7)[1]
<permutation group of size 31200 with 2 generators>

julia> @time length(bahn([ 1992, 3607 ], gens(g), on_sets))
  3.887185 seconds (25.71 M allocations: 1.240 GiB)
4677120

Note that this is internally using GAP permutations and all. I can get this to a bit under 3 seconds by using "native" Julia permutations.

So, I think in this example perhaps really basic things are also playing a factor, besides Magma perhaps using a fundamentally better algorithm for this particular problem?

fingolfin avatar Sep 30 '22 13:09 fingolfin

Currently GAP does not hash integers lists under permutation actions -- this is clearly an oversight. With this added, the OrbitLength example takes (M1 Pro processor) 17 seconds with no need to indicate a domain. This probably will be faster by another factor of 2, if hashing code was moved in the kernel. (And yes, it would be worth to implement transversals based on orbits if the subgroup happens to be a nice stabilizer.)

hulpke avatar Sep 30 '22 14:09 hulpke

Update after the release of GAP 4.13.0: There were several commits about performance improvements, and as far as I can see, these changes made it into GAP 4.13. With GAP 4.13.0, I get (on the computer hamal in Aachen) the following.

gap> g:= AtlasGroup( "Ru" );
<permutation group of size 145926144000 with 2 generators>
gap> NrMovedPoints( g );
4060
gap> u:= AtlasSubgroup( g, 7 );
<permutation group of size 31200 with 2 generators>
gap> act:= FactorCosetAction( g, u );;
gap> Runtime();  # in milliseconds
5360647

The mentioned (alternative) orbit computations on pairs are apparently faster in 4.13.0 (again, on hamal) than they were in the tests mentioned in September 2022.

gap> g:= AtlasGroup( "Ru" );;
gap> u:= AtlasSubgroup( g, 7 );;
gap> pt := [ 1992, 3607 ];;
gap> s:=Stabilizer(g, pt, OnSets); time;
<permutation group of size 31200 with 7 generators>
20
gap> s = u;
true
gap> OrbitLength(g, pt, OnSets); time;
4677120
18828
gap> dom:=Combinations([1..NrMovedPoints(g)],2);; time;
2828
gap> MakeImmutable(dom);; IsSet(dom);
true
gap> OrbitLength(g, dom, pt, OnSets); time;
4677120
18685

ThomasBreuer avatar Mar 22 '24 09:03 ThomasBreuer

gap> Runtime();  # in milliseconds
5360647

I have a change (in my work repository) that speeds this up by a factor of over 2 compared to the 4.13 release -- proportionally one would expect a time of 2442941.

hulpke avatar Mar 26 '24 19:03 hulpke

With a further change it now takes (equivalent of) 30 seconds.

hulpke avatar Mar 29 '24 01:03 hulpke

Very nice: with @hulpke's PR #5689 the timings on a specific test server I see these timings:

  • Magma 2.26-6 (results vary a lot with the random seed)
    • Total time: 21.399 seconds, Total memory usage: 567.41MB
    • Total time: 32.710 seconds, Total memory usage: 638.78MB
    • Total time: 25.550 seconds, Total memory usage: 781.53MB
    • Total time: 30.690 seconds, Total memory usage: 852.91MB
  • GAP with the improved branch (with two different RNG seeds):
    • 44.722 seconds, 1687.00MB were allocated
    • 45.33 seconds, 1684.77MB were allocated

So that's great, it's on the same order of magnitude.

fingolfin avatar Apr 15 '24 11:04 fingolfin