gap icon indicating copy to clipboard operation
gap copied to clipboard

Performance regression in `LtPlist` due to recursion check

Open fingolfin opened this issue 3 years ago • 2 comments

I discovered this while investigating issue #4368:

We have a test file which compares performance for PositionSorted on mutable vs. immutable lists. While looking at that, I discovered that PositionSorted underwent a substantial regression in performance between GAP 4.8 and 4.9.

I used the following test code:


time1 := [];;
time2 := [];;
GASMAN("collect");#CollectGarbage(true);
for j in [1..20] do
  l:=List([1..100000],i->[i]);
  GASMAN("partial");#CollectGarbage(false);
  t1:=Runtime(); for i in [1..100000] do a := PositionSorted(l,[i]); od; t2:=Runtime();
  Add(time1, (t2-t1));

  l := Immutable( List([1..100000],i->[i]) );
  GASMAN("partial");#CollectGarbage(false);
  t1:=Runtime(); for i in [1..100000] do a := PositionSorted(l,[i]); od; t2:=Runtime();
  Add(time2, (t2-t1));
od;
Print("total time1: ", Sum(time1), " = Sum(", time1, "), median = ", Median(time1),"\n");
Print("total time2: ", Sum(time2), " = Sum(", time2, "), median = ", Median(time2),"\n");
if Median(time1) >= 2*Median(time2) then # time1 and time2 should be about the same
  Print("Bad timings for bugfix 2006/08/28 (FL): ", time1, " >= 2*", time2, "\n"); 
fi;
QUIT;

These are the results (note that GAP 4.5-4.8 all perform similar, as do GAP 4.9-master):

$ gap-4.8 -b misc/PositionSorted-bench.g
total time1: 666 = Sum([ 40, 41, 34, 34, 31, 36, 31, 31, 36, 30, 33, 31, 32, 36, 31, 33, 30, 33, 33, 30 ]), median = 33
total time2: 654 = Sum([ 40, 32, 33, 30, 32, 37, 30, 33, 31, 34, 36, 30, 33, 30, 32, 35, 29, 34, 31, 32 ]), median = 32
$ gap-4.9 -b misc/PositionSorted-bench.g
total time1: 945 = Sum([ 52, 46, 46, 46, 47, 47, 51, 46, 46, 46, 47, 50, 47, 46, 46, 46, 46, 50, 47, 47 ]), median = 46
total time2: 942 = Sum([ 46, 47, 48, 47, 46, 48, 47, 47, 46, 46, 48, 49, 46, 47, 46, 47, 49, 48, 47, 47 ]), median = 47

Looking at the code PositionSorted itself stayed unchanged, and dispatches to POSITION_SORTED_LIST:

InstallGlobalFunction( PositionSorted, function(arg)
  if IsPlistRep(arg[1]) then
    if Length(arg) = 3 then
      return CallFuncList(POSITION_SORTED_LIST_COMP, arg);
    else
      return CallFuncList(POSITION_SORTED_LIST, arg);
    fi;
  else
    return CallFuncList(PositionSortedOp, arg);
  fi;
end);

So here it will end up calling POSITION_SORTED_LIST; let's re-run my test file but with POSITION_SORTED_LIST instead of PositionSorted:

$ gap-4.8 -b misc/PositionSorted-bench.g
total time1: 484 = Sum([ 29, 26, 28, 26, 21, 25, 26, 22, 24, 23, 24, 25, 24, 22, 23, 25, 25, 21, 24, 21 ]), median = 24
total time2: 490 = Sum([ 26, 29, 31, 22, 22, 28, 23, 24, 22, 25, 27, 24, 24, 21, 24, 27, 23, 23, 22, 23 ]), median = 24
$ gap-4.9 -b misc/PositionSorted-bench.g
total time1: 785 = Sum([ 43, 36, 37, 36, 36, 36, 37, 36, 39, 39, 36, 40, 37, 38, 47, 50, 43, 43, 39, 37 ]), median = 37
total time2: 810 = Sum([ 37, 37, 36, 37, 37, 44, 37, 38, 41, 37, 37, 46, 37, 43, 50, 49, 49, 40, 39, 39 ]), median = 38

This shows two things:

  1. the time difference for the total time is about the same as before (~300 milliseconds slower), so it looks indeed as if the issue is in POSITION_SORTED_LIST. I do not (yet) know what causes the slowdown. The relevant kernel functions (POSITION_SORTED_LIST / PositionSortedDensePlist / FuncPOSITION_SORTED_LIST) appear to be identical between GAP 4.8 and 4.9. So I am guessing it is something deeper. I guess it's time to dig out a C level profiler... Or perhaps @ChrisJefferson has a good idea?

  2. Timings just got substantially faster! But why? Well, I believe that this is caused by the seemingly innocent IsPlistRep check: it actually introduces method dispatch overhead! I reported this as issue #4423 No, this was wrong, it's probably just the overhead of calling a GAP function 100000 times, even if it is a mostly trivial one.

fingolfin avatar Apr 21 '21 11:04 fingolfin