dex-lang icon indicating copy to clipboard operation
dex-lang copied to clipboard

Associative Map Example

Open oxinabox opened this issue 4 years ago • 7 comments

This is pretty neat I think Closes #332 It has a List backed dictionary and a hash based dictionary

TODO

  • [ ] Put under common interface (i need to workout why it is giving so many errors)
  • [x] Use new interface/instance stuff, not @instance
  • [x] Implement a fast hashing algorithm
  • [x] Split Hashable into seperate type classes for isequal and hash make Hashable use Eq, and don't make Float hashable

Maybe TODO

  • [ ] Deleting
  • [ ] Overwriting (right now it basically leaks memory if you overwrite anything)

Not TODO:

  • [ ] Rebucketting when too full (that seems like it would complicate things a bit too much for this PR)

oxinabox avatar Dec 12 '20 16:12 oxinabox

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Dec 12 '20 16:12 google-cla[bot]

And i made the bot sad again because i committed based on top of the wrong branch. 😂

oxinabox avatar Dec 12 '20 16:12 oxinabox

Benchmarks

  • Dex listdict: 6us

  • Dex hashdict: 9us

  • Julia hashdict: 0.07 us

  • Julia littledict: 0.33 us

  • Dex hash: 8us, that's the bottleneck

  • Julia hash: 0.06 us

Dex

full_data = for i:(Fin 256). ([10 * ordinal i, ordinal i], i)
big_listdict = fold (emptyListDict _ _)  \i state. storeListDict state (fst full_data.i) (snd full_data.i)
big_hashdict = fold (emptyHashDict 128 _ _) \i state. storeHashDict state (fst full_data.i) (snd full_data.i)
%time
tryGetHashDict big_hashdict [1280, 128]
(Just (128@Fin 256))

Compile time: 150.174 ms
Run time:     9.000 us 
%time
thehash [1280, 128]
1720020324

Compile time: 74.541 ms
Run time:     8.000 us 
%time
tryGetListDict big_listdict [1280, 128]
(Just (128@Fin 256))

Compile time: 130.772 ms
Run time:     6.000 us 

Julia

The OrderedCollections.LittleDict is not quite the list dict implemented here, but it is close. Its what I wanted to implement but couldn't get to work, 1 list of keys, one list of values. (and I don't know what the Julia Dict uses for collisions, but i am sure it isn't a LittleDict


julia> using OrderedCollections, BenchmarkTools

julia> const full_data = [([10ii, ii]=>ii) for ii in 1:256];

julia> const littledict = LittleDict(full_data);

julia> const hashdict = Dict(full_data);

julia> @btime littledict[$[1280, 128]];
  330.862 ns (0 allocations: 0 bytes)

julia> @btime hashdict[$[1280, 128]];
  73.187 ns (0 allocations: 0 bytes)

julia> @btime hash($([1280, 128]))
  64.405 ns (0 allocations: 0 bytes)
0x71191e98f759495d

oxinabox avatar Dec 13 '20 17:12 oxinabox

If the hash is the performance bottleneck, should we use a cheaper one instead of the PRNG-grade threefry?

dougalm avatar Dec 14 '20 00:12 dougalm

If the hash is the performance bottleneck, should we use a cheaper one instead of the PRNG-grade threefry?

A queck google suggest people commonly use the FNV hash algorithm which seems like it would be easy enough to implement for Int32

oxinabox avatar Dec 14 '20 16:12 oxinabox

dev branch has been merged into main, now that plotting has been revamped. dev branch will be deleted now, we can continue development on main.

I'll change the base branch of this PR to main now!

dan-zheng avatar Dec 18 '20 22:12 dan-zheng

FNV based hash doesn't seem much faster. Had to say though since i can't use %bench (#359) I suspect the time to do the FFI call is dominating perhaps

%time
threefryHash 0 1
-841280227

Compile time: 322.792 ms
Run time:     114.000 us 
%time
threefryHash 0 1
-841280227

Compile time: 864.773 ms
Run time:     8.000 us 
%time
threefryHash 0 1
-841280227

Compile time: 224.515 ms
Run time:     5.000 us 

vs

%time
fnvhash 0 1
-1717489731

Compile time: 36.299 ms
Run time:     9.000 us 
%time
fnvhash 0 1
-1717489731

Compile time: 22.220 ms
Run time:     6.000 us 
%time
fnvhash 0 1
-1717489731

Compile time: 22.274 ms
Run time:     5.000 us 

oxinabox avatar Dec 19 '20 16:12 oxinabox