InternedStrings.jl icon indicating copy to clipboard operation
InternedStrings.jl copied to clipboard

Consider adding back a wrapper type to use with Dicts

Open oxinabox opened this issue 7 years ago • 1 comments

I've not run benchmarks. But I would assume it is much faster to hash the pointer than to hash the string. So it might be worth adding back in a wrapper type, just to make dict lookups faster.

Dumping from Slack:

Lyndon White [11:43 AM] Anyone have any idea on the easiest way to construct a Dict with a custom hash function? (edited) My current thoughts are to create a wrapper type for my key's type. Give it a convert (Dicts automatically call convert on their keys during get index iirc). And give that wrapper type the custom hash. but I think this would be allocating, which kinda defeats the purpose of using a custom hash, to be faster.

Eric Davies [11:47 AM] I don't think that would have a multiple of extra allocations

Lyndon White [11:48 AM] wouldn't it result in 1 allocation per getindex? I guess that isn't too bad, getindex already does a few allocations iirc

Eric Davies [11:49 AM] If you have the key type be a parametric type I don't think it should? because then the caller function can expect a value of that type and allocate on the stack

Lyndon White [11:50 AM] Cool, I should benchmark it (edited)

Eric Davies [11:54 AM] This is what I was thinking, not sure if you thought the same:

struct KeyType{K}
    val::K
end

struct MyDict{K, V} <: AbstractDict{K, V}
    inner::Dict{KeyType{K}, V}
end

@inline Base.hash(kt::KeyType, h::UInt) = myhash(kt.val, h::UInt)

# rest as you described

Lyndon White [11:55 AM] I was thinking just using a normal Dict, as I don't want to have to implement the whole interface (or will inheritting AbstractDict do that for me?) (No it won't it doesn't know about inner) If it works, I should add an AssumeInterned type back into InternedStrings.jl. Which would make hash(s::AssumeInterned) = hash(pointer(s.val)) and isequals(a::AssumeInterned, b::AssumeInterned) = isequal(pointer(a.val), pointer(b.val))

Eric Davies [11:59 AM] Hmm that should work you're right

oxinabox avatar Jun 06 '18 04:06 oxinabox

and for fast equality checks

oxinabox avatar Jun 23 '18 04:06 oxinabox