gap icon indicating copy to clipboard operation
gap copied to clipboard

Add custom SetIsSSortListProp for internal lists

Open fingolfin opened this issue 4 years ago • 11 comments

Resolves #4459

I wrote this very quickly, it's likely I missed something; besides, there are zero tests in there, which is why I am marking this as a draft.

(Anybody feel free to add tests, for plain lists, ranges, blists, and perhaps also an external list type, if we have any...)

fingolfin avatar May 05 '21 20:05 fingolfin

I had tried to provide some tests for the various kinds of internal lists. But the failing tests show that it is challenging to construct meaningful examples. For example, the problem with Filtered can be reproduced as follows.

gap> C:= [1..10];;  res := C{[]};;  res[1]:= 1;;  res[2]:= 2;;
gap> SetIsSSortedList( res, true );
gap> HasIsSSortedList( res );
false
gap> SetIsSSortedList( res, true );
gap> HasIsSSortedList( res );
true

The point seems to be that calling HasIsSSortedList before SetIsSSortedList changes the situation. (At least the TNUM_OBJ value of res gets changed by HasIsSSortedList( res ).)

In this sense, the following tests are too simpleminded.

gap> candidates:= [
>      [ 1, 2 ],                        # homogeneous plist, strictly sorted
>      [ 2, 1 ],                        # homogeneous plist, not strictly sorted
>      [ 1, Z(2) ],                     # plist, strictly sorted
>      [ Z(2), 1 ],                     # plist, not strictly sorted
>      BlistList( [ 1, 2 ], [ 1 ] ),    # blist, strictly sorted
>      BlistList( [ 1, 2 ], [ 2 ] ),    # blist, not strictly sorted
>      "abc",                           # string, strictly sorted
>      "cba",                           # string, not strictly sorted
>      Enumerator( SymmetricGroup(2) ), # external list, strictly sorted
>      Enumerator( SymmetricGroup(3) ), # external list, not strictly sorted
>    ];;
gap> for list in candidates do
>      if HasIsSSortedList( list ) then
>        Print( "#E  ", list, " knows already whether it is sorted?\n" );
>      fi;
>      val:= IsSSortedList( ShallowCopy( list ) );
>      SetIsSSortedList( list, val );
>      if ForAny( list, IsMutable ) then 
>        if HasIsSSortedList( list ) then
>          Print( "#E  ", list, " must not store whether it is sorted!\n" ); 
>        fi;
>      elif not HasIsSSortedList( list ) then
>        Print( "#E  ", list, " should now store whether it is sorted!\n" );
>      elif IsSSortedList( list ) <> val then
>        Print( "#E  ", list, " stores a wrong sortedness value!\n" );
>      fi;
>    od;

Currently [ Z(2)^0, 1 ] and "cba" do not want to store that they are not strictly sorted. (Internal ranges need not be tested because they store sortedness info when they are created --well, this fact can be tested. And lists containing mutable entries are not allowed to store sortedness info --this can be tested, too.)

ThomasBreuer avatar May 06 '21 12:05 ThomasBreuer

GAP doesn't store sortedness for inhomogeneous lists, which is why it doesn't store it for [ Z(2)^0, 1 ]. But for strings, we do store sortedness, so I am not sure why it fails for "cba"

fingolfin avatar May 07 '21 14:05 fingolfin

@fingolfin I do not understand the statement

GAP doesn't store sortedness for inhomogeneous lists

We have

gap> l:= [ 1, Z(2) ];;
gap> HasIsSSortedList( l );
false
gap> IsSSortedList( l );
true
gap> HasIsSSortedList( l );
true

in GAP 4.11.1. Thus sortedness of inhomogeneous lists can be stored. The question is about storing non-sortedness.

ThomasBreuer avatar May 07 '21 14:05 ThomasBreuer

Ah sorry, I was simply remembering it wrong, and confused it with non-dense lists.

We can in fact store both sortedness and non-sortedness in the TNUM for strings, inhomogeneous dense lists etc.. That it doesn't work in the failing test was caused by an empty list being generated via C{[]} that then did not have the TNUM T_PLIST_EMPTY. I've added a commit to fix that. I'll look into the rest

fingolfin avatar May 07 '21 15:05 fingolfin

For [ Z(2)^0, 1 ] the problem is that GAP did not figure out this list is dense. So this happens:

gap> l:=[ Z(2)^0, 1 ];;
gap> TNAM_OBJ(l);
"plain list"
gap> SetIsSSortedList(l,true);
gap> TNAM_OBJ(l);
"plain list"
gap> HasIsSSortedList(l);
false
gap> TNAM_OBJ(l);
"dense non-homogeneous plain list"
gap> SetIsSSortedList(l,true);
gap> TNAM_OBJ(l);
"dense non-homogeneous strictly-sorted plain list"
gap> HasIsSSortedList(l);
true

Specifically, when we invoke SetIsSSortedList(l,true) all the kernel knows about the list is that it is a plain lists -- it does not know if it is dense or not.

But HasIsSSortedList triggers method dispatch, and so the type of l is recomputed, and GAP determines it is dense. Afterwards, we can store the sortedness.

One way to fix the issue thus is to add an explicit test for density at the start of the custom setter. This is of course some overhead if it is not yet known, but in general is still much cheaper than full method dispatch.

In addition, we could improve the interpreter so that dense list literals are marked as such. In fact things look quite weird right now:

gap> TNAM_OBJ([Z(2),1]);
"plain list"
gap> TNAM_OBJ([1,Z(2)]);
"dense plain list"
gap> TNAM_OBJ(({}->[Z(2),1])());
"dense plain list"
gap> TNAM_OBJ(({}->[1,Z(2)])());
"dense plain list"

After some deliberation, I think the problem is this: the immediate mode builds the list incrementally: it starts with an empty plain list; then it assigns the first entry, the second entry, etc., all via the kernel function ASS_LIST, which tries to update the TNUM. So when the first entry Z(2) is assigned, it should turn the list

gap> l:=[Z(2)];;
gap> TNAM_OBJ(l);
"plain list"
gap> l; # forces type recomputations
[ Z(2)^0 ]
gap> TNAM_OBJ(l);
"plain list of small finite field elements"

Looking at the code, the bug was clear and I just pushed a fix for this last issue, but not yet one that computes density in general.

fingolfin avatar May 07 '21 15:05 fingolfin

I can't reproduce any issues with "cba"; I tried the following:

gap> s:="cba";;
gap> TNAM_OBJ(s);
"list (string)"
gap> SetIsSSortedList(s,true);
gap> TNAM_OBJ(s);
"list (string,ssort)"
gap> HasIsSSortedList(s);
true

and

gap> s:="cba";;
gap> SetIsSSortedList(s,false);
gap> TNAM_OBJ(s);
"list (string,nsort)"
gap> HasIsSSortedList(s);
false

Could you elaborate how to reproduce the problem you were seeing?

Nevermind, of course my second code snippet shows it 😂

fingolfin avatar May 07 '21 16:05 fingolfin

Note to self: all those fix commits should also get a test case (and then they can be merged independently of the core of this PR)

fingolfin avatar May 10 '21 08:05 fingolfin

Rebased, now things work a bit better. Trying @ThomasBreuer test, we still get

#E  [ Z(2)^0, 1 ] should now store whether it is sorted!

The problem is that this list has type T_PLIST_DENSE but there are no T_PLIST_DENSE_NSORT and T_PLIST_DENSE_SSORT.

So together with what I mentioned before, in order to store sortedness reliably (?), we'd have the list for being dense, and for being (non)homogeneous. But we don't do that right now for this mutable list:

gap> list:=[1,[2,3]];
[ 1, [ 2, 3 ] ]
gap> IsSSortedList(list);
true
gap> TNAM_OBJ(list);
"dense plain list"
gap> IsHomogeneousList(list);
false
gap> TNAM_OBJ(list);
"dense plain list"
gap> MakeImmutable(list);
[ 1, [ 2, 3 ] ]
gap> TNAM_OBJ(list);
"immutable dense non-homogeneous plain list"

But actually, for this list, we could legally store that it is non-homogenous (as it doesn't matter what the entries of the second list are).

But of course while that would help with one particular case, it wouldn't in general, e.g. here:

gap> list:=[[1],[2,3]];
[ [ 1 ], [ 2, 3 ] ]
gap> TNAM_OBJ(list);
"dense plain list"
gap> MakeImmutable(list);
[ [ 1 ], [ 2, 3 ] ]
gap> TNAM_OBJ(list);
"immutable plain list (table)"

The kernel cannot deduce anything in the mutable case here, as this is possible:

gap> list:=[[1],[2,3]];
[ [ 1 ], [ 2, 3 ] ]
gap> IsHomogeneousList(list);
true
gap> list[1][1]:=Z(2);
Z(2)^0
gap> IsHomogeneousList(list);
false

So while we can patch up a few more cases, I don't think we can make this work in general, at least not within the current framework.

I guess we could introduce T_PLIST_DENSE_NSORT and T_PLIST_DENSE_SSORT. But would that be a great idea?

And taking a step back, I wonder if this PR is a step in the right direction or not... do we really want to allow users to override sortedness like this for mutable lists? And if so, what about being dense/(non-)homogeneous/cyclotomic/table/rect table/... ?

fingolfin avatar May 11 '21 11:05 fingolfin

@fingolfin Thanks for the changes.

My idea was: For an (internal) list that has HasIsSSortedList set after a call of IsSSortedList, HasIsSSortedList should also be set after SetIsSSortedList (where the second argument is equal to the result of IsSSortedList). In the case of [ Z(2)^0, 1 ] (dense, inhomogeneous, not strictly sorted), HasIsSSortedList still returns false after the call of IsSSortedList, thus we have to leave it out from my test loop.

And I do not ask for storing more knowledge in internal lists than we can do now.

Concerning the observations about IsDenseList and IsHomogeneousList, the knowledge about their values is (logically) always available when it is needed (when one asks for the type of the list, such as in the method selection) --this is one of the performance problems with GAP's method selection and type system. Thus the GAP kernel has to work hard for this feature. I had not been aware of the fact that users can see part of these difficulties, as in this example (from GAP 4.11.1).

gap> C:= [1..10];;  res := C{[]};;  res[1]:= 1;;  res[2]:= 2;;
gap> TNAM_OBJ( res );
"plain list"
gap> IsDenseList( res );
true
gap> TNAM_OBJ( res );   
"dense plain list"

ThomasBreuer avatar May 11 '21 15:05 ThomasBreuer

So, how should we deal with cases like [ Z(2)^0, 1 ] ? Right now, SetIsSSortedList does nothing with it. We could

  • ... leave it at that;
    • a consequence of that would be that explaining whether or not SetIsSSortedList works on a given mutable plain lists is somewhat complex
  • ... say that SetIsSSortedList should always test for being dense & (in)homogeneous;
    • but as discussed, this still won't help in all cases
    • and of course it incurs overhead
  • ... only let SetIsSSortedList do something for immutable plain lists;
    • easy to explain, but somewhat arbitrary (and external list types would not necessarily follow this rule at alll)
  • ... do something else?

fingolfin avatar May 14 '21 23:05 fingolfin

I've added a test now based on @ThomasBreuer original bug report. While this is perhaps not perfect and more could be done, I think it is safe to merge: it fixes a minor issue, and is unlikely to cause problems. So instead of letting this rot uselessly for even more time, perhaps we can just get it merged?

fingolfin avatar Aug 03 '22 13:08 fingolfin