DataStructures.jl icon indicating copy to clipboard operation
DataStructures.jl copied to clipboard

Should SortedSet convert the key in the `in` function?

Open oxinabox opened this issue 4 years ago • 2 comments

I have been working with ShortStrings.jl a bit lately. It seems like it should not be required for in(::ShortString, ::SortedSet(String) to internally convert the ShortString to a String. cmp is defined already between those types.

cc @StephenVavasis

oxinabox avatar Oct 30 '20 14:10 oxinabox

Yes, this is a good idea. There is perhaps a larger issue: The sort ordering is built on top of Base.Ordering, but perhaps it is better to build the sort ordering more conveniently for the user on top of Base.<, Base.cmp, etc. I'm not sure cmp was available in 0.3, when I first wrote the code. I (or someone else) should overhaul the sorted-container code anyway to de-emphasize semitokens and emphasize tokens now that tokens are efficient as of 1.5. Unfortunately, I won't be able to work on an overhaul before January.

StephenVavasis avatar Oct 30 '20 15:10 StephenVavasis

I think Base.Ordering is still right. We overhauled Heaps recently towards using it. I think it eventually calls cmp anyway. And optimises away to be as if it calls that directly. I could be wrong though.

I agree re:semi-tokens.

I don't think there is a huge rush. We're not likely to complete v0.19 before January on any means. And my be benchmarking shows that for my data I M looking at today it is faster to use a plain Set. Though that surprises me for ShortStrings

oxinabox avatar Oct 30 '20 22:10 oxinabox