DataStructures.jl
DataStructures.jl copied to clipboard
Check dict hashing issue
It's likely that OrderedDict
s, as implemented here, are susceptible to JuliaLang/julia#15077. This needs to be tested and handled.
Related: #221
Does #221 make this worse? From the discussion, it seems like the fix will be a more clever rehash! and the testing solution is set and get a ton of keys (rather than trying to deduce the steps that would cause the problem). I'd be happy to contribute the tests ...
Well, #221 might have to be simplified in order to address this.
I think we can trigger the bug by adding enough keys to get the slot array to the right size (256), and then adding the sequence of keys used in the test for the Julia issue. Or rather, the total number of keys needs to be at least 171 (2/3 of 256).
I encountered a problem that suggests OrderedDict
and/or OrderedSet
are somehow broken. https://github.com/bicycle1885/Automa.jl/pull/1
To clarify it, I encountered a problem that doesn't happen when using Base.Dict
and Base.Set
on Julia 0.5 but happens when using DataStructures.OrderedDict
and DataStructures.OrderedSet
. Tests fail sometimes, not always, say 10-30% of probability. Automa.jl creates many dictionaries and sets that have object keys like NFANode
and DFANode
. There is no randomness in it except hash values of objects, which is taken from object_id
. If ordered dicts and sets were drop-in replacement of usual dicts and sets, this problem won't happen. So I think this is due to DataStructures.jl.
Don't know if this is the same thing, but I'm encountering missing keys accessed directly from the SortedDict. I have a custom concrete struct (Virus{UInt
} that is immutable, and isbits
. I don't know how to paste the exact data in for a bug report. But it doesn't happen when I repackage in a standard Base.Dict. Looks like this
julia> lt.root_nodes
SortedDict{Virus{UInt64}, UInt64, Base.Order.ForwardOrdering} with 11 entries:
...
julia> (v1,l1) = first(lt.root_nodes)
Virus{UInt64}(0x0000000000000002, 0x00000000000097dd, 0x00000000000097e0, 976.4969064069165, 977.7852658472708) => 0x0000000000000004
julia> haskey(lt.root_nodes,v1)
false #But you just gave this to me!
julia> dtest = Dict(lt.root_nodes...) # repackage in Dict
Dict{Virus{UInt64}, UInt64} with 11 entries:
julia> dtest[v1]
0x0000000000000004 # now it works great
This is a different issue. Could you repost it as a new issue in DataStructures.jl. And, if possible, post as much information as you have about what is triggering the bug. Ideally you could post a self-contained code that triggers the bug.
Note that SortedDict will not work properly unless your custom ordering satisfies the usual rules for an ordering: for any two keys a,b, exactly one of the relations a<b, a=b, b<a holds, and the relation < must be transitive.
ok figured. Sorry for clogging up this delicate deep hashing issue. P.S. I'm extending Base.isless, but will try Base:< Base:>, Base:== instead.
Edit: Yes my custom Base.isless was only partially ordered which lead to missing keys. Works fine when I resolved ties. Thanks @StephenVavasis! Leaving this in case it helps some other clumsy person.
I understand that this isn't a problem anymore (https://github.com/JuliaCollections/OrderedCollections.jl/commit/d3facfd32c4a74c51196d79f863e617f3ecce74e)
Oh, looking at this a second time I think we can't yet close this.