GEORADIUSBYMEMBER command
Implementation of the GEORADIUSBYMEMBER command
- [x] Basic functionality to query all members within the radius of a given member of a key
- [x] Add support for WITHDIST, WITHHASH, WITHCOORD, COUNT parameters
- [x] Replaced dependency
github.com/mmcloughlin/geohashwith a redis-like implementation of geohash to make results comparable - [x] Add support for STORE, STOREDIST parameters
- [x] Add unit and integration tests
- [x] Benchmark and potentially optimize the code
Resolves #1252.
Hi @benbarten. I have pushed validation logics for parsing optional parameters, fixed bugs in COUNT logic and updated response structure when WITHHASH/WITHCOORD/WITHDIST are used. Please review the changes.
And geohash area filter is returning elements that are outside the specified radius as well. Below example with a 100 km radius filter. But member at a 134 km distance is also being returned. Can you please look into it?
- Dice
127.0.0.1:7379> GEOADD Sicily 13.583333 37.316667 "Agrigento"
(integer) 1
127.0.0.1:7379> GEOADD Sicily 13.361389 38.115556 "Palermo" 15.087269 37.502669 "Catania"
(integer) 2
127.0.0.1:7379> GEORADIUSBYMEMBER Sicily Agrigento 100 km WITHDIST
1) 1) "Agrigento"
2) (integer) 0
2) 1) "Palermo"
2) "90.97804072951635"
3) 1) "Catania"
2) "134.46977809328365"
- Redis
redis> GEOADD Sicily 13.583333 37.316667 "Agrigento"
(integer) 1
redis> GEOADD Sicily 13.361389 38.115556 "Palermo" 15.087269 37.502669 "Catania"
(integer) 2
redis> GEORADIUSBYMEMBER Sicily Agrigento 100 km WITHDIST
1) 1) "Agrigento"
2) "0.0000"
2) 1) "Palermo"
2) "90.9778"
Thanks.
Hey @c-harish,
Thank you for pointing out those issues. I should have tested this more thoroughly beforehand. The implementation has a couple of different issues which I resolved in my last commit:
- The github.com/mmcloughlin/geohash library we used calculates geohashes in a latitude range from -90 to 90 while redis does not accept values closes to the poles. This resulted in different hashes and made it not only hard to compare between redis and dicedb. I replaced this library with a redis-like implementation of geohashes.
- Geohashes are not linear, so range queries on a sorted set can return values which are higher than the given radius. I added a post-filter step as in redis to filter out any points which exceed the radius, but match the geohash area.
- I updated the
GetMemberScoresInRangemethod of the sorted set to a half-open interval as in redis.
Please, feel free to test again. How should we split up the remaining tasks?
@c-harish I would continue working on the STORE and STOREDIST options.
Tremendous job, @benbarten. 🚀 Looks solid to me, at first glance.
@c-harish I looked into the implementation of STORE and STOREDIST this morning and I will need a few more days to finalize it.
STORE and STOREDIST store the result of the GEORADIUSBYMEMBER command for a specified key. Hence, they turn the GEORADIUSBYMEMBER command into a MultiShard command that needs to be decomposed. Since this command can be a single or a multi-shard command depending on the given options, there is no out-of-the-box way to handle this in the code.
My idea is to decompose the command into a GEORADIUSBYMEMBER command as it is right now and a GEOADD command. This will be similar to how the COPY command is implemented. The crux here is that only if those args are given we want to do a preprocessing step. I need to think about a nice way to implement this without convoluting the existing logic. I'll get back here once I figure that out.
Hi @benbarten,
I don't think we need to decompose the command or preprocess responses here, since GEOADD is a singleshard command and GEORADIUS commands also deals with a single key (so all data we need is also essentially from a single shard).
We just store the response again in a single shard, if STORE options are set.
@AshwinKul28, Can you please confirm?
As far as I understand, the shard is determined by the key.
The STORE and STOREDIST options take the result of GEORADIUSBYMEMBER and store it for a different key. Hence, we would need to access a different shard than the one that executes GEORADIUSBYMEMBER. e.g. the COPY command is similar in this regard since it takes two keys and moves data between them.
Maybe my understanding is wrong here, but I tested locally to store data for a different key directly using the Store object in evalGEORADIUSBYMEMBER and I wasn't able to then retrieve the stored set by using the new key.
Added the unit tests this morning. Will do benchmarking tomorrow and then we can decide whether to merge like this or add the STORE and STOREDIST options as part of this PR.
Benchmarks
Base command
Running tool: /opt/homebrew/bin/go test -benchmem -run=^$ -bench ^BenchmarkGEORADIUSBYMEMBER$ github.com/dicedb/dice/internal/eval
goos: darwin
goarch: arm64
pkg: github.com/dicedb/dice/internal/eval
cpu: Apple M2
=== RUN BenchmarkGEORADIUSBYMEMBER
BenchmarkGEORADIUSBYMEMBER
=== RUN BenchmarkGEORADIUSBYMEMBER/Small
BenchmarkGEORADIUSBYMEMBER/Small
BenchmarkGEORADIUSBYMEMBER/Small-8 462044 2393 ns/op 1048 B/op 39 allocs/op
=== RUN BenchmarkGEORADIUSBYMEMBER/Medium
BenchmarkGEORADIUSBYMEMBER/Medium
BenchmarkGEORADIUSBYMEMBER/Medium-8 106651 27096 ns/op 4920 B/op 100 allocs/op
=== RUN BenchmarkGEORADIUSBYMEMBER/Large
BenchmarkGEORADIUSBYMEMBER/Large
BenchmarkGEORADIUSBYMEMBER/Large-8 26185 214080 ns/op 27480 B/op 352 allocs/op
PASS
ok github.com/dicedb/dice/internal/eval 10.258s
Different Radii
Running tool: /opt/homebrew/bin/go test -benchmem -run=^$ -bench ^BenchmarkGEORADIUSBYMEMBER_DifferentRadii$ github.com/dicedb/dice/internal/eval
goos: darwin
goarch: arm64
pkg: github.com/dicedb/dice/internal/eval
cpu: Apple M2
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentRadii
BenchmarkGEORADIUSBYMEMBER_DifferentRadii
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentRadii/SmallRadius
BenchmarkGEORADIUSBYMEMBER_DifferentRadii/SmallRadius
BenchmarkGEORADIUSBYMEMBER_DifferentRadii/SmallRadius-8 38484 30515 ns/op 1288 B/op 47 allocs/op
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentRadii/MediumRadius
BenchmarkGEORADIUSBYMEMBER_DifferentRadii/MediumRadius
BenchmarkGEORADIUSBYMEMBER_DifferentRadii/MediumRadius-8 41575 28804 ns/op 13992 B/op 197 allocs/op
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentRadii/LargeRadius
BenchmarkGEORADIUSBYMEMBER_DifferentRadii/LargeRadius
BenchmarkGEORADIUSBYMEMBER_DifferentRadii/LargeRadius-8 15754 76014 ns/op 95817 B/op 1470 allocs/op
PASS
ok github.com/dicedb/dice/internal/eval 5.211s
Different options
Running tool: /opt/homebrew/bin/go test -benchmem -run=^$ -bench ^BenchmarkGEORADIUSBYMEMBER_DifferentOptions$ github.com/dicedb/dice/internal/eval
goos: darwin
goarch: arm64
pkg: github.com/dicedb/dice/internal/eval
cpu: Apple M2
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentOptions
BenchmarkGEORADIUSBYMEMBER_DifferentOptions
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentOptions/NoOptions
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/NoOptions
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/NoOptions-8 259896 3899 ns/op 3848 B/op 76 allocs/op
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentOptions/WithCoord
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/WithCoord
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/WithCoord-8 303004 3948 ns/op 4080 B/op 80 allocs/op
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentOptions/WithDist
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/WithDist
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/WithDist-8 297511 4064 ns/op 4056 B/op 79 allocs/op
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentOptions/WithHash
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/WithHash
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/WithHash-8 299065 3982 ns/op 4064 B/op 80 allocs/op
=== RUN BenchmarkGEORADIUSBYMEMBER_DifferentOptions/AllOptions
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/AllOptions
BenchmarkGEORADIUSBYMEMBER_DifferentOptions/AllOptions-8 294674 4034 ns/op 4152 B/op 82 allocs/op
PASS
ok github.com/dicedb/dice/internal/eval 6.275s
Hi, @apoorvyadav1111. Can you please trigger CI check and review? Thanks.
Hey @apoorvyadav1111,
any update on the review? I'm keeping it up-to-date with the latest changes, but let us know if we should request another reviewer :) We could unblock @c-harish and also add the STORE/STOREDIST options.
Hi @benbarten! Thanks for all the work on this feature, looks like its close to completion. Apologies for the delayed reviews, the team is now returning from vacation and we've started reviewing the PRs again.
In the meantime could you please resolve the merge conflicts that seem to be happening?
Hey @benbarten , are you working on it actively? Asking it since read only part of command is blocked on this PR.
Hey @JyotinderSingh ,
sorry for the late response here as well. I got caught up at work after new year and missed the notification. I'm picking the work on this up again by resolving the merge conflicts. Afterwards, I'll create a follow up issue to implement this enable the STORE and STOREDIST options by making this a multi shard command.
Hey @JyotinderSingh,
I have resolved the merge conflicts. As described above, I still need to implement the STORE and STOREDIST options. Based on my understanding, they will require to make this command a multi-shard command, since both options take the command's output and store it in a new sorted set with a new key.
I will implement this in the upcoming days. In the meantime, feel free to merge this part already, as it will unblock @iamskp11 on his work.
Hey @JyotinderSingh,
I have resolved the merge conflicts. As described above, I still need to implement the
STOREandSTOREDISToptions. Based on my understanding, they will require to make this command a multi-shard command, since both options take the command's output and store it in a new sorted set with a new key.I will implement this in the upcoming days. In the meantime, feel free to merge this part already, as it will unblock @iamskp11 on his work.
Hi @benbarten, thanks a lot for picking this up again! We're in the middle of another large refactor, which has changed the architecture of the database quite a bit (and makes it a lot faster and memory efficient). You may want to catch up on somw of those changes. Thanks again!
Hey @JyotinderSingh,
I already read about the ironhawk engine on discord, but haven't had time to look at how to migrate existing commands.
I'm close to finishing the STORE/STOREDIST options in presumably the old way. Once I'm done, I'll have a look at the new engine and try to migrate the current implementation.
Hey @JyotinderSingh,
I already read about the ironhawk engine on discord, but haven't had time to look at how to migrate existing commands.
I'm close to finishing the STORE/STOREDIST options in presumably the old way. Once I'm done, I'll have a look at the new engine and try to migrate the current implementation.
awesome, thank you!
Hey @JyotinderSingh,
I have now implemented the STORE/STOREDIST options with the old engine. This made the GEORADIUSBYMEMBER command a multi-shard command, but also caused a couple of new merge conflicts due to the recent refactor. I had a quick look at the new ironhawk engine and how to port commands to it. It seems there is no way to support multi-shard commands yet.
Do you want me to implement it as part of this PR?
Hey @JyotinderSingh,
I have now implemented the
STORE/STOREDISToptions with the old engine. This made theGEORADIUSBYMEMBERcommand a multi-shard command, but also caused a couple of new merge conflicts due to the recent refactor. I had a quick look at the new ironhawk engine and how to port commands to it. It seems there is no way to support multi-shard commands yet.Do you want me to implement it as part of this PR?
Thanks for the update @benbarten. We're in the process of implementing a multi-shard strategy. I would recommend making the port once it's in place. I'll ping on this PR once it's ready.
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.
:white_check_mark: benbarten
:x: harish-sa
You have signed the CLA already but the status is still pending? Let us recheck it.