redex ZUNIONTOP/ZUNIONREVTOP do not union correctly

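@levyfan, trying out the commands yields duplicate and missing entries: in the session below, ZUNIONTOP returns bar twice and drops baz, and ZUNIONREVTOP returns baz twice and drops bar. Do you want to resolve this?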

127.0.0.1:6379> ZADD z1 1 foo 2 bar 3 baz
(integer) 3
127.0.0.1:6379> ZADD z2 2 bar 3 baz 4 qaz
(integer) 3
127.0.0.1:6379> ZUNIONTOP 3 2 z1 z2 WITHSCORES
1) "foo"
2) "1"
3) "bar"
4) "2"
5) "bar"
6) "2"
127.0.0.1:6379> ZUNIONREVTOP 3 2 z1 z2 WITHSCORES
1) "qaz"
2) "4"
3) "baz"
4) "3"
5) "baz"
6) "3"

May 26 '16 15:05 itamarhaber

Sure.

If the same member has different scores in different zsets, I think we can take the minimum score for ZUNIONTOP and the maximum score for ZUNIONREVTOP.

For example,

ZADD z1 1 foo 2 bar 3 baz
ZADD z2 1.5 bar 3.5 baz 4 qaz

ZUNIONTOP 3 2 z1 z2 WITHSCORES
"foo"
"1"
"bar"
"1.5"
"baz"
"3"

ZUNIONREVTOP 3 2 z1 z2 WITHSCORES
"qaz"
"4"
"baz"
"3.5"
"bar"
"2"

May 27 '16 06:05 levyfan

Exactly. A union must not return duplicates. The final score should be the weighted min (or max if REV).
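
To make the intended semantics concrete, here is a minimal sketch in plain C. It is not redex's code: the Entry type and the union_top function are hypothetical names used only for illustration, weights are omitted (every set is treated as weight 1), and duplicates are found with a linear scan rather than a hash table to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type for illustration; not redex's actual structure. */
typedef struct { const char *member; double score; } Entry;

/* Ascending by score, ties broken by member name. */
static int cmp_asc(const void *a, const void *b) {
    const Entry *x = a, *y = b;
    int c = (x->score > y->score) - (x->score < y->score);
    return c != 0 ? c : strcmp(x->member, y->member);
}

/* Descending order is the ascending comparison reversed. */
static int cmp_desc(const void *a, const void *b) {
    return cmp_asc(b, a);
}

/*
 * Union nsets input sets into out, keeping exactly one entry per member.
 * When a member appears more than once, keep its minimum score (rev == 0)
 * or its maximum score (rev != 0). out must have room for the total number
 * of input entries. Returns how many leading entries of out form the
 * result: min(k, number of distinct members).
 */
static size_t union_top(const Entry *sets[], const size_t lens[], size_t nsets,
                        size_t k, int rev, Entry *out) {
    size_t n = 0;
    assert(out != NULL);
    for (size_t s = 0; s < nsets; s++) {
        for (size_t i = 0; i < lens[s]; i++) {
            Entry e = sets[s][i];
            size_t j = 0;
            /* Linear scan for an existing entry with the same member. */
            while (j < n && strcmp(out[j].member, e.member) != 0) j++;
            if (j == n) {
                out[n++] = e;                      /* first time seen */
            } else if (rev ? e.score > out[j].score
                           : e.score < out[j].score) {
                out[j].score = e.score;            /* better score wins */
            }
        }
    }
    qsort(out, n, sizeof *out, rev ? cmp_desc : cmp_asc);
    return n < k ? n : k;
}
```

Called with the z1 and z2 data from the example above and k = 3, this should return foo, bar (1.5), baz (3) for rev = 0 and qaz, baz (3.5), bar (2) for rev = 1. The linear scan makes the merge quadratic in the number of distinct members, which is why a hash table is the natural next step for large inputs.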

May 27 '16 15:05 itamarhaber

I think I need a hash set to remember the results. Can I use dict in redis? Or implement something like std::unordered_set/map in C?
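
For reference, a hand-rolled table for this purpose can be quite small. The following is only a sketch under stated assumptions: ScoreMap, scoremap_init, scoremap_put and scoremap_free are hypothetical names, the capacity is fixed up front (no resizing), keys are borrowed rather than copied, and hashing is FNV-1a with linear probing. It is not Redis's dict and not khash.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration. key == NULL marks an empty slot. */
typedef struct { const char *key; double val; } Slot;
typedef struct { Slot *slots; size_t cap; size_t len; } ScoreMap;

/* 64-bit FNV-1a string hash. */
static uint64_t fnv1a(const char *s) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* cap must be a power of two. Returns 0 on success, -1 on allocation failure. */
static int scoremap_init(ScoreMap *m, size_t cap) {
    assert(cap != 0 && (cap & (cap - 1)) == 0);
    m->slots = malloc(cap * sizeof *m->slots);
    m->cap = cap;
    m->len = 0;
    if (m->slots == NULL) return -1;
    for (size_t i = 0; i < cap; i++) m->slots[i].key = NULL;
    return 0;
}

/*
 * Find the slot for key, inserting it with value initial if absent.
 * *is_new is set to 1 on insert and 0 if the key already existed.
 * The key pointer is stored as-is, so it must outlive the map.
 * Returns NULL only when the table is full and the key is absent.
 */
static Slot *scoremap_put(ScoreMap *m, const char *key, double initial,
                          int *is_new) {
    size_t mask = m->cap - 1;
    size_t i = (size_t)fnv1a(key) & mask;
    for (size_t probes = 0; probes < m->cap; probes++, i = (i + 1) & mask) {
        Slot *s = &m->slots[i];
        if (s->key == NULL) {              /* empty slot: insert here */
            s->key = key;
            s->val = initial;
            m->len++;
            *is_new = 1;
            return s;
        }
        if (strcmp(s->key, key) == 0) {    /* already present */
            *is_new = 0;
            return s;
        }
    }
    return NULL;
}

static void scoremap_free(ScoreMap *m) {
    free(m->slots);
    m->slots = NULL;
    m->cap = m->len = 0;
}
```

For the union, each member would go through scoremap_put; when is_new comes back 0, the caller compares the new score with the stored val and keeps the min (or the max for REV). Sizing the table to a power of two comfortably above the expected number of distinct members keeps probe chains short.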

May 31 '16 09:05 levyfan

HashGet and HashSet can certainly work, but they'll probably be slower than a native hash table implementation. Do you need this to be persistent, or ephemeral for the context of the query?

In my module I'm using a little hash table library called khash for some ephemeral aggregations. It's not very friendly, but it's super fast.

May 31 '16 10:05 dvirsky

Great, I need ephemeral. khash is good (but full of macros); I will figure it out.

May 31 '16 13:05 levyfan

khash is weird, but once you get the hang of it, it's really simple. You can lift the code from my module. I'm using it to group occurrences of the same word at different locations in a document.

May 31 '16 14:05 dvirsky