dynamodb-geo.js

Where next?

Open robhogan opened this issue 7 years ago • 4 comments

I'm curious whether anyone values the current compatibility and API similarity with the 'official' Java library (https://github.com/awslabs/dynamodb-geo)?

I think in a few ways that's holding this lib back, so I'm considering breaking data compatibility:

  • Using the first n digits of the geohash as a partition key is crude. It'd be better to use a lower-level (wider-area) cell ID as the hash key instead. That matches the way we construct coverings, so it would offer better performance and efficiency, as well as being a bit easier to reason about.
  • We could offer the option of using global secondary indices for geo queries rather than using the table's main hash key - that'd allow you to use whatever unique ID you wanted at the table level, rather than having to know the coordinates to fetch an item.
  • Building on that, we could offer the option of having multiple GSIs at different S2 levels, so that you could perform 1000km searches against one index and 10m searches against another (potentially selected automatically). That'd be a huge improvement on RCU efficiency for anyone who has those needs.

Let me know if you have any thoughts, or if there's anything else you'd like to see.

robhogan, Jun 05 '17 18:06
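
To make the multi-GSI idea concrete, here is a minimal sketch of how an index could be selected automatically from the search radius. It is not part of the library: `GeoIndex`, `pickIndex`, the index names and the "cell edge at least as wide as the search diameter" rule are all assumptions made for illustration, and the cell-edge figure is only a rough average (about 7,842 km at S2 level 0, halving with each level).

```typescript
// Hypothetical descriptor for one GSI keyed on S2 cells of a given level.
interface GeoIndex {
  indexName: string;
  s2Level: number;
}

// Rough average S2 cell edge at level 0, in metres (approximation only).
const LEVEL_0_EDGE_METERS = 7842000;

// Approximate edge length of a cell at the given level: halves per level.
function approxCellEdgeMeters(level: number): number {
  return LEVEL_0_EDGE_METERS / Math.pow(2, level);
}

// Pick the finest index whose cells are still at least as wide as the
// search diameter, so that a covering needs only a handful of cells.
// Falls back to the coarsest index when the search is wider than any cell.
function pickIndex(radiusMeters: number, indexes: GeoIndex[]): GeoIndex {
  if (indexes.length === 0) {
    throw new Error("at least one index is required");
  }
  // Coarsest (lowest level) first, without mutating the caller's array.
  const sorted = indexes.slice().sort((a, b) => a.s2Level - b.s2Level);
  const diameter = 2 * radiusMeters;
  let chosen = sorted[0];
  for (const index of sorted) {
    if (approxCellEdgeMeters(index.s2Level) >= diameter) {
      chosen = index;
    }
  }
  return chosen;
}
```

With indexes at levels 4, 10 and 16, a 1000km search would fall back to the level 4 index, while a 10m search would be routed to level 16. The right threshold in practice would depend on how coverings are built and on measured RCU cost.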

I prefer the option of using GSIs. I'm currently doing a post-result filter in my Node app that grabs the latest record of each item returned in the payload. The ability to perform the query with my limiters would be a nice win.

As for breaking compatibility, we don't currently use the Java lib, and without this lib I'd be dead in the water for this project. I believe that if this lib offers performance increases while maintaining backward compatibility where possible, it may influence some changes in the AWS lib.

I don't like the idea of having the opportunity to improve (vastly) but not being able to because AWS made a less optimized table build/query routine.

Just my 2 cents although I'm sure that anybody using both libs will be disrupted. Are there any of those people out there? I dunno.

BenHavilland, Jun 05 '17 20:06
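
For readers unfamiliar with the post-result filter mentioned in this comment, it might look roughly like the sketch below. The record shape (`itemId`, `recordedAt`) and the function name are assumptions for illustration; the comment does not describe the actual schema or code.

```typescript
// Hypothetical shape of a row returned by a geo query: several rows may
// share an itemId, each with its own timestamp (epoch milliseconds).
interface GeoRecord {
  itemId: string;
  recordedAt: number;
}

// Keep only the most recent row per itemId, in application code,
// after the geo query has already returned all matching rows.
function latestPerItem<T extends GeoRecord>(records: T[]): T[] {
  const latest: { [itemId: string]: T } = {};
  for (const record of records) {
    const current = latest[record.itemId];
    if (current === undefined || record.recordedAt > current.recordedAt) {
      latest[record.itemId] = record;
    }
  }
  return Object.keys(latest).map((id) => latest[id]);
}
```

The drawback is that every row is read and paid for before being discarded, which is why pushing such conditions into the query itself would be a win.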

Using multiple GSIs sounds like a good way to manage range queries that vary from one to hundreds of kilometres; it might be the north star going forward for optimising RCU efficiency.

Extending the library to polygons may also be very useful. Currently I use the S2 region coverer to approximate each polygon as a series of cells, all mapping to one attribute (it could be changed to be a range key, though I'm yet to find a reason to do that). That attribute is then the hashKey of a separate table storing data related to the particular polygon, which limits redundancy on the table. This is relatively quick to do for a single object, depending on the level of accuracy needed.

With regard to breaking compatibility, I think it'd be fine if there is a module available with the old Java-compatible methods.

malz, Jun 05 '17 21:06

Why not allow the user to manually create the HashKey?

scotttesler, Sep 18 '17 22:09

For me, compatibility with Java means extremely little. I'd love some of the features you mentioned.

davidlukewilcox, Apr 12 '18 13:04