Associative Map Example
This is pretty neat, I think. Closes #332. It has a list-backed dictionary and a hash-based dictionary.
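For anyone skimming, here is a rough conceptual sketch of the two designs in Julia (illustrative only, with made-up names; the actual Dex code in this PR is structured differently): the list dict is a flat list of key/value pairs searched linearly, and the hash dict hashes the key into one of a fixed set of buckets and searches only that bucket.

```julia
# Conceptual sketch, not the PR's Dex implementation.

# List-backed dict: a flat vector of (key => value) pairs, looked up by linear scan.
struct ListDict{K,V}
    pairs::Vector{Pair{K,V}}
end
store!(d::ListDict, k, v) = push!(d.pairs, k => v)
function tryget(d::ListDict, k)
    for (key, val) in d.pairs
        key == k && return val          # first match wins
    end
    return nothing
end

# Hash-backed dict: a fixed number of buckets, each a small list of pairs.
struct HashDict{K,V}
    buckets::Vector{Vector{Pair{K,V}}}  # bucket count fixed at construction (no rebucketing)
end
HashDict{K,V}(nbuckets::Integer) where {K,V} =
    HashDict{K,V}([Pair{K,V}[] for _ in 1:nbuckets])
bucket(d::HashDict, k) = 1 + Int(hash(k) % UInt(length(d.buckets)))
store!(d::HashDict, k, v) = push!(d.buckets[bucket(d, k)], k => v)
function tryget(d::HashDict, k)
    for (key, val) in d.buckets[bucket(d, k)]   # scan only the key's bucket
        key == k && return val
    end
    return nothing
end
```

`store!`/`tryget` here play the same role as `storeListDict`/`tryGetListDict` and `storeHashDict`/`tryGetHashDict` below, just in mutable-vector form rather than Dex's functional style.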
TODO
- [ ] Put both under a common interface (I need to work out why it is giving so many errors)
- [x] Use the new `interface`/`instance` stuff, not `@instance`
- [x] Implement a fast hashing algorithm
- [x] Split Hashable into separate type classes for `isequal` and `hash`, make Hashable use Eq, and don't make Float hashable
Maybe TODO
- [ ] Deleting
- [ ] Overwriting (right now it basically leaks memory if you overwrite anything)
Not TODO:
- [ ] Rebucketing when too full (that seems like it would complicate things a bit too much for this PR)
All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment in this pull request that contains only `@googlebot I consent`.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s) and set the `cla` label to `yes` (if enabled on your project).

ℹ️ Googlers: Go here for more info.
And I made the bot sad again because I committed on top of the wrong branch. 😂
Benchmarks
- Dex listdict: 6 us
- Dex hashdict: 9 us
- Julia hashdict: 0.07 us
- Julia littledict: 0.33 us
- Dex hash: 8 us, that's the bottleneck
- Julia hash: 0.06 us
Dex
full_data = for i:(Fin 256). ([10 * ordinal i, ordinal i], i)
big_listdict = fold (emptyListDict _ _) \i state. storeListDict state (fst full_data.i) (snd full_data.i)
big_hashdict = fold (emptyHashDict 128 _ _) \i state. storeHashDict state (fst full_data.i) (snd full_data.i)
%time
tryGetHashDict big_hashdict [1280, 128]
(Just (128@Fin 256))
Compile time: 150.174 ms
Run time: 9.000 us
%time
thehash [1280, 128]
1720020324
Compile time: 74.541 ms
Run time: 8.000 us
%time
tryGetListDict big_listdict [1280, 128]
(Just (128@Fin 256))
Compile time: 130.772 ms
Run time: 6.000 us
Julia
The OrderedCollections.LittleDict is not quite the list dict implemented here, but it is close. It's what I wanted to implement but couldn't get to work: one list of keys, one list of values. (And I don't know what the Julia Dict uses for collisions, but I am sure it isn't a LittleDict.)
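For concreteness, a minimal sketch of that keys-list-plus-values-list layout (illustrative Julia, not the OrderedCollections code):

```julia
# Two parallel vectors: vals[i] is the value stored for keys[i].
struct ParallelListDict{K,V}
    keys::Vector{K}
    vals::Vector{V}
end

function tryget(d::ParallelListDict, k)
    i = findfirst(isequal(k), d.keys)   # linear scan over the keys only
    return i === nothing ? nothing : d.vals[i]
end
```

Scanning a dense vector of keys touches less memory than walking a list of pairs, which is presumably part of why LittleDict does well at this size.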
julia> using OrderedCollections, BenchmarkTools
julia> const full_data = [([10ii, ii]=>ii) for ii in 1:256];
julia> const littledict = LittleDict(full_data);
julia> const hashdict = Dict(full_data);
julia> @btime littledict[$[1280, 128]];
330.862 ns (0 allocations: 0 bytes)
julia> @btime hashdict[$[1280, 128]];
73.187 ns (0 allocations: 0 bytes)
julia> @btime hash($([1280, 128]))
64.405 ns (0 allocations: 0 bytes)
0x71191e98f759495d
If the hash is the performance bottleneck, should we use a cheaper one instead of the PRNG-grade threefry?
A quick Google suggests people commonly use the FNV hash algorithm, which seems like it would be easy enough to implement for Int32.
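For reference, a minimal sketch of 32-bit FNV-1a over one Int32 (illustrative Julia; the constants are the standard FNV-1a offset basis and prime, and `fnv1a32` is a made-up name, not the `fnvhash` timed below):

```julia
const FNV32_OFFSET = 0x811c9dc5   # standard 32-bit FNV offset basis
const FNV32_PRIME  = 0x01000193   # standard 32-bit FNV prime

# FNV-1a over the four bytes of an Int32, least significant byte first.
function fnv1a32(x::Int32)
    h = FNV32_OFFSET
    u = reinterpret(UInt32, x)
    for shift in (0, 8, 16, 24)
        byte = (u >> shift) & 0xff
        h = xor(h, byte) * FNV32_PRIME   # xor then multiply; UInt32 arithmetic wraps
    end
    return h
end
```

A multi-element key like `[1280, 128]` would just feed each element's bytes through the same loop, carrying `h` across elements.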
The `dev` branch has been merged into `main`, now that plotting has been revamped. The `dev` branch will be deleted now; we can continue development on `main`. I'll change the base branch of this PR to `main` now!
The FNV-based hash doesn't seem much faster. Hard to say, though, since I can't use `%bench` (#359). I suspect the time to do the FFI call is dominating.
%time
threefryHash 0 1
-841280227
Compile time: 322.792 ms
Run time: 114.000 us
%time
threefryHash 0 1
-841280227
Compile time: 864.773 ms
Run time: 8.000 us
%time
threefryHash 0 1
-841280227
Compile time: 224.515 ms
Run time: 5.000 us
vs
%time
fnvhash 0 1
-1717489731
Compile time: 36.299 ms
Run time: 9.000 us
%time
fnvhash 0 1
-1717489731
Compile time: 22.220 ms
Run time: 6.000 us
%time
fnvhash 0 1
-1717489731
Compile time: 22.274 ms
Run time: 5.000 us