
Specify order used for collection keys

Open jberdine opened this issue 1 year ago • 2 comments

The user-facing behavior of the EagerCollection operations slice, slices, and take depends on the order relation between keys. This order is not currently specified. The implementation uses the skiplang default implementation of Orderable for the CJSON type. This order may or may not coincide with what users expect from TypeScript.

jberdine avatar Dec 17 '24 15:12 jberdine

It's not clear what users would expect from TypeScript:

  • the less-than operator (<) converts its operands to primitives; it compares two strings lexicographically and otherwise converts both operands to numeric values;
  • Array.sort, when called without a comparator, converts elements to strings for comparison, so it does not even use <...
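To make the two behaviors concrete, here is a small sketch in plain TypeScript (it does not touch the Skip API; the variable names are only illustrative):

```typescript
// Default Array.prototype.sort compares elements as strings,
// so numbers are not ordered numerically.
const byDefaultSort: number[] = [10, 9, 1].sort();

// With an explicit comparator the order is numeric.
const byNumericSort: number[] = [10, 9, 1].sort((a, b) => a - b);

// `<` on two strings is lexicographic; on two numbers it is numeric.
const s1: string = "10";
const s2: string = "9";
const n1: number = 10;
const n2: number = 9;
const stringsLess: boolean = s1 < s2; // true: "1" sorts before "9"
const numbersLess: boolean = n1 < n2; // false

console.log(byDefaultSort);  // [ 1, 10, 9 ]
console.log(byNumericSort);  // [ 1, 9, 10 ]
console.log(stringsLess, numbersLess);
```

So even within JS/TS there are at least two "natural" orders a user might have in mind for the same keys.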

(1) Ideally we'd want users to be able to specify the ordering on their collections' keys and values when calling slice(s) or take. We don't want plain lambdas, so they'd have to use classes again. Heavy, but why not. On the Skip side, though, neither slices nor limit can deal with a custom comparison function, so we would have to collect all the key-value pairs and then do the filtering manually in a mapper using the custom comparison.

(2) Another possibility would be for users to provide the key and value comparison functions when creating a new collection, e.g. as methods of the mapper class. Then:

  • either we do the same as in (1);
  • or we could wrap CJSON values to be able to use a custom comparison function, e.g. adding a constructor CJCustomCompare(inner: CJSON, cmp: function_handle) but
    • it would make values a bit heavier,
    • we should ensure we don't forget wrapping/unwrapping.
  • or we should change SKStore to be able to use a custom comparison by not relying on types to implement Orderable.

(3) Or we could start by ensuring the order on CJSON is specified and makes sense, though it's not clear to me what the generic comparison should be.
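For reference, the fallback described in (1), collecting every key-value pair and filtering with the custom comparison, could look roughly like the sketch below. This is plain TypeScript over an in-memory array, not the Skip mapper API; sliceWith, takeWith, and Compare are hypothetical names, and the inclusive bounds on the slice are an assumption.

```typescript
// Hypothetical comparison type: negative if a < b, zero if equal, positive if a > b.
type Compare<K> = (a: K, b: K) => number;

// Keep the entries whose key lies in [lo, hi] (inclusive bounds assumed),
// ordered by the custom comparison. Every entry is visited.
function sliceWith<K, V>(entries: [K, V][], lo: K, hi: K, cmp: Compare<K>): [K, V][] {
  return entries
    .filter(([k]) => cmp(lo, k) <= 0 && cmp(k, hi) <= 0)
    .sort(([a], [b]) => cmp(a, b));
}

// Keep the first n entries under the custom comparison.
// The input is copied so the caller's array is not reordered.
function takeWith<K, V>(entries: [K, V][], n: number, cmp: Compare<K>): [K, V][] {
  return [...entries].sort(([a], [b]) => cmp(a, b)).slice(0, n);
}

const numeric: Compare<number> = (a, b) => a - b;
const entries: [number, string][] = [[10, "a"], [9, "b"], [1, "c"], [20, "d"]];

const sliced = sliceWith(entries, 2, 10, numeric);
const taken = takeWith(entries, 2, numeric);
console.log(sliced); // [ [ 9, 'b' ], [ 10, 'a' ] ]
console.log(taken);  // [ [ 1, 'c' ], [ 9, 'b' ] ]
```

The cost is the point: both operations touch the whole collection, which is what a natively ordered slice or limit avoids.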

mbouaziz avatar Feb 05 '25 16:02 mbouaziz

AFAIU for slice and take, only the order on keys is relevant. Not that it simplifies things, just noting.

A generic comparison is not going to be able to assume that the structure of its arguments is compatible, only that they are both "Json". So that already means that no semantically very meaningful definition is possible, and converting to strings or using the default implementation of Orderable are equally fine. Agreeing with e.g. Array.sort for JS clients could be argued to be more canonical.

Another option would be to define values for ascending and descending for each primitive type in Json, and allow user code to provide Json-structured values with such asc/desc values at the leaves. This would essentially be a combinator library for defining comparison functions that could be interpreted Skip-side without jumping back and forth across the FFI boundary a bazillion times. The interpretation would have to decide what to do with Json values that don't match the structure of the comparison. Probably something like comparing them as strings, and making them all less than any well-structured value, would work. It would put the ill-typed values at the bottom of the order, where they are more likely to be returned, which helps early debugging.
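A rough sketch of what such a spec and its interpreter might look like, written in TypeScript only to illustrate the idea (the real interpreter would live Skip-side). OrderSpec, matches, and compareWith are hypothetical names, and the leaf rules (numbers numerically, strings by code unit, anything else via its JSON text) are assumptions, not a proposal for the actual order.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
// A spec mirrors the shape of the key, with a direction at each leaf.
type OrderSpec = "asc" | "desc" | OrderSpec[] | { [key: string]: OrderSpec };

function cmpStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Assumed leaf order: numbers numerically, strings by code unit, otherwise by JSON text.
function cmpLeaf(a: Json, b: Json): number {
  if (typeof a === "number" && typeof b === "number") return a < b ? -1 : a > b ? 1 : 0;
  if (typeof a === "string" && typeof b === "string") return cmpStrings(a, b);
  return cmpStrings(JSON.stringify(a), JSON.stringify(b));
}

// Does the value have the shape the spec describes?
function matches(spec: OrderSpec, v: Json): boolean {
  if (spec === "asc" || spec === "desc") return v === null || typeof v !== "object";
  if (Array.isArray(spec)) {
    if (!Array.isArray(v) || v.length !== spec.length) return false;
    for (let i = 0; i < spec.length; i++) if (!matches(spec[i], v[i])) return false;
    return true;
  }
  if (v === null || typeof v !== "object" || Array.isArray(v)) return false;
  for (const k of Object.keys(spec)) if (!(k in v) || !matches(spec[k], v[k])) return false;
  return true;
}

// Lexicographic comparison of two values already known to match the spec.
// Object fields are compared in the order they appear in the spec.
function compareStructured(spec: OrderSpec, a: Json, b: Json): number {
  if (spec === "asc") return cmpLeaf(a, b);
  if (spec === "desc") return cmpLeaf(b, a);
  if (Array.isArray(spec)) {
    const xs = a as Json[], ys = b as Json[];
    for (let i = 0; i < spec.length; i++) {
      const c = compareStructured(spec[i], xs[i], ys[i]);
      if (c !== 0) return c;
    }
    return 0;
  }
  const xo = a as { [key: string]: Json }, yo = b as { [key: string]: Json };
  for (const k of Object.keys(spec)) {
    const c = compareStructured(spec[k], xo[k], yo[k]);
    if (c !== 0) return c;
  }
  return 0;
}

// Ill-structured values sort below every well-structured one,
// and are compared with each other as strings.
function compareWith(spec: OrderSpec, a: Json, b: Json): number {
  const okA = matches(spec, a), okB = matches(spec, b);
  if (okA && okB) return compareStructured(spec, a, b);
  if (okA) return 1;
  if (okB) return -1;
  return cmpStrings(JSON.stringify(a), JSON.stringify(b));
}

const spec: OrderSpec = ["asc", "desc"];
const keys: Json[] = [[1, "a"], [1, "b"], "oops", [0, "z"]];
const sorted = [...keys].sort((a, b) => compareWith(spec, a, b));
console.log(JSON.stringify(sorted)); // ["oops",[0,"z"],[1,"b"],[1,"a"]]
```

Since the spec is itself a Json value, it can cross the FFI boundary once per slice or take call, rather than once per comparison.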

It seems better to me to only have user-provided comparisons where strictly needed, as they will be slow, and so just for take and slice rather than for every collection at construction time.

jberdine avatar Feb 05 '25 17:02 jberdine