gap
gap copied to clipboard
Performance regression in `LtPlist` due to recursion check
I discovered this while investigating issue #4368:
We have a test file which compares performance for PositionSorted
on mutable vs. immutable lists. While looking at that, I discovered that PositionSorted
underwent a substantial regression in performance between GAP 4.8 and 4.9.
I used the following test code:
time1 := [];;
time2 := [];;
GASMAN("collect");#CollectGarbage(true);
for j in [1..20] do
l:=List([1..100000],i->[i]);
GASMAN("partial");#CollectGarbage(false);
t1:=Runtime(); for i in [1..100000] do a := PositionSorted(l,[i]); od; t2:=Runtime();
Add(time1, (t2-t1));
l := Immutable( List([1..100000],i->[i]) );
GASMAN("partial");#CollectGarbage(false);
t1:=Runtime(); for i in [1..100000] do a := PositionSorted(l,[i]); od; t2:=Runtime();
Add(time2, (t2-t1));
od;
Print("total time1: ", Sum(time1), " = Sum(", time1, "), median = ", Median(time1),"\n");
Print("total time2: ", Sum(time2), " = Sum(", time2, "), median = ", Median(time2),"\n");
if Median(time1) >= 2*Median(time2) then # time1 and time2 should be about the same
Print("Bad timings for bugfix 2006/08/28 (FL): ", time1, " >= 2*", time2, "\n");
fi;
QUIT;
These are the results (note that GAP 4.5-4.8 all perform similar, as do GAP 4.9-master):
$ gap-4.8 -b misc/PositionSorted-bench.g
total time1: 666 = Sum([ 40, 41, 34, 34, 31, 36, 31, 31, 36, 30, 33, 31, 32, 36, 31, 33, 30, 33, 33, 30 ]), median = 33
total time2: 654 = Sum([ 40, 32, 33, 30, 32, 37, 30, 33, 31, 34, 36, 30, 33, 30, 32, 35, 29, 34, 31, 32 ]), median = 32
$ gap-4.9 -b misc/PositionSorted-bench.g
total time1: 945 = Sum([ 52, 46, 46, 46, 47, 47, 51, 46, 46, 46, 47, 50, 47, 46, 46, 46, 46, 50, 47, 47 ]), median = 46
total time2: 942 = Sum([ 46, 47, 48, 47, 46, 48, 47, 47, 46, 46, 48, 49, 46, 47, 46, 47, 49, 48, 47, 47 ]), median = 47
Looking at the code PositionSorted
itself stayed unchanged, and dispatches to POSITION_SORTED_LIST
:
InstallGlobalFunction( PositionSorted, function(arg)
if IsPlistRep(arg[1]) then
if Length(arg) = 3 then
return CallFuncList(POSITION_SORTED_LIST_COMP, arg);
else
return CallFuncList(POSITION_SORTED_LIST, arg);
fi;
else
return CallFuncList(PositionSortedOp, arg);
fi;
end);
So here it will end up calling POSITION_SORTED_LIST
; let's re-run my test file but with POSITION_SORTED_LIST
instead of PositionSorted
:
$ gap-4.8 -b misc/PositionSorted-bench.g
total time1: 484 = Sum([ 29, 26, 28, 26, 21, 25, 26, 22, 24, 23, 24, 25, 24, 22, 23, 25, 25, 21, 24, 21 ]), median = 24
total time2: 490 = Sum([ 26, 29, 31, 22, 22, 28, 23, 24, 22, 25, 27, 24, 24, 21, 24, 27, 23, 23, 22, 23 ]), median = 24
$ gap-4.9 -b misc/PositionSorted-bench.g
total time1: 785 = Sum([ 43, 36, 37, 36, 36, 36, 37, 36, 39, 39, 36, 40, 37, 38, 47, 50, 43, 43, 39, 37 ]), median = 37
total time2: 810 = Sum([ 37, 37, 36, 37, 37, 44, 37, 38, 41, 37, 37, 46, 37, 43, 50, 49, 49, 40, 39, 39 ]), median = 38
This shows two things:
-
the time difference for the total time is about the same as before (~300 milliseconds slower), so it looks indeed as if the issue is in
POSITION_SORTED_LIST
. I do not (yet) know what causes the slowdown. The relevant kernel functions (POSITION_SORTED_LIST
/PositionSortedDensePlist
/FuncPOSITION_SORTED_LIST
) appear to be identical between GAP 4.8 and 4.9. So I am guessing it is something deeper. I guess it's time to dig out a C level profiler... Or perhaps @ChrisJefferson has a good idea? -
Timings just got substantially faster! But why?
Well, I believe that this is caused by the seemingly innocentNo, this was wrong, it's probably just the overhead of calling a GAP function 100000 times, even if it is a mostly trivial one.IsPlistRep
check: it actually introduces method dispatch overhead! I reported this as issue #4423