Attempt to expose S2RegionTermIndexer

Open richgarner-nsw-gov opened this issue 1 year ago • 3 comments

We have a need to use S2RegionTermIndexer and this hasn't been exposed in this library.

This is an attempt to bring that in, although it's very incomplete and I currently can't get past a runtime error caused by this line:

std::vector<std::string> result = termIndexer.GetQueryTerms(point, str);

I'm not a C++ dev nor have I used Napi or any other methods of binding C++ binaries to node.

@jkao Any suggestions on why this is happening and how to progress this are welcomed! (Nice work on this btw)

richgarner-nsw-gov avatar Dec 15 '24 23:12 richgarner-nsw-gov

Thanks for the PR! Can you talk a little bit about your use case that you're trying to achieve with this? We might be able to get away with existing primitives in the meantime while we try and find bandwidth on the team to build this out

jkao avatar Dec 16 '24 16:12 jkao

Sure! Thanks for the reply.

We want to see if we can integrate S2 better with our database indexes, and were looking at S2RegionTermIndexer to help us do this, as described in the "Indexing for Search" section of the S2 doco: http://s2geometry.io/devguide/cpp/quickstart

The comment in the C++ file itself describes this pretty well too:

https://github.com/google/s2geometry/blob/master/src/s2/s2region_term_indexer.cc

We basically just want to expose S2RegionTermIndexer::GetIndexTerms and S2RegionTermIndexer::GetQueryTerms methods to TypeScript so this can be integrated in our existing app which uses this library.

I'm happy to try to do this work, but there's a bit of a learning curve I need to overcome so any pointers would be great.

The current error I'm trying to overcome is:

// dyld[15800]: missing symbol called
// Abort trap: 6

I thought about just implementing those functions in TypeScript as they aren't that complicated, but I figured it'd be better to use the actual official implementation to avoid bugs.

richgarner-nsw-gov avatar Dec 17 '24 00:12 richgarner-nsw-gov

That makes sense. I believe the index in question is for in-memory use (rather than in the DB), so you could use it to build a very fast geo service, but it might not be what you're looking for exactly.

If you're blocked on this, you can actually make use of the existing s2.RegionCoverer.getCoveringTokens method in the meantime.

The process would be something along the lines of:

// indexing - you can even keep s2LevelMin === s2LevelMax for simplicity during queries
const tokens = s2.RegionCoverer.getCoveringTokens(polygonYouWantToIndex, { min: s2LevelMin, max: s2LevelMax });

// insert the tokens into your database as s2Token -> ID of entity; ideally you have an index on this column
...

// querying - a point
const point = new s2.CellId(new s2.LatLng(Y, X));
const pointAtLevel = point.parent(s2LevelMin); // alternatively, loop from s2LevelMin to s2LevelMax for a series of tokens

// query your DB for WHERE s2Token IN (pointAtLevel (or points at level if s2LevelMin !== s2LevelMax))
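
To make the "loop from s2LevelMin to s2LevelMax" idea concrete, here is a rough sketch of what the series of tokens looks like for one point. It is not the library's API: `ancestorToken` and `ancestorTokens` are hypothetical helpers that work directly on a 16-hex-digit leaf token, assuming the standard 64-bit S2 cell ID layout (3 face bits, 2 bits per level, then a single trailing 1 bit) and the usual "hex with trailing zeros stripped" token format. In a real app you would most likely just call `point.parent(level)` in a loop instead.

```typescript
// Sketch only: derive ancestor tokens from a leaf (level 30) token by string
// manipulation. Helper names are hypothetical, not part of any library.
function ancestorToken(leafToken: string, level: number): string {
  // A leaf cell ID is 16 hex digits and always odd (trailing 1 bit).
  if (!/^[0-9a-f]{15}[13579bdf]$/.test(leafToken)) {
    throw new Error("expected a 16-hex-digit leaf (level 30) token");
  }
  if (Math.floor(level) !== level || level < 0 || level > 30) {
    throw new Error("level must be an integer in [0, 30]");
  }
  // A level-k cell ID ends in a single marker bit at position 2 * (30 - k).
  const bitPos = 2 * (30 - level);
  const digitIndex = 15 - Math.floor(bitPos / 4); // hex digit holding that bit
  const bitInDigit = bitPos % 4; // always 0 or 2
  const digit = parseInt(leafToken.charAt(digitIndex), 16);
  // Keep the bits above the marker bit, set the marker bit, drop everything below.
  const keepMask = (0xf << (bitInDigit + 1)) & 0xf;
  const newDigit = (digit & keepMask) | (1 << bitInDigit);
  // The new last digit is never zero, so no trailing zeros need stripping.
  return leafToken.slice(0, digitIndex) + newDigit.toString(16);
}

// One token per level, from minLevel to maxLevel inclusive.
function ancestorTokens(leafToken: string, minLevel: number, maxLevel: number): string[] {
  const tokens: string[] = [];
  for (let level = minLevel; level <= maxLevel; level++) {
    tokens.push(ancestorToken(leafToken, level));
  }
  return tokens;
}

// Example with an arbitrary leaf token (not a real location of interest):
const exampleTokens = ancestorTokens("89c25a31a1b2c3d5", 12, 14);
console.log(exampleTokens); // ["89c25a3", "89c25a34", "89c25a31"]
```

These are the values that would go into the `WHERE s2Token IN (...)` clause when s2LevelMin !== s2LevelMax.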

You can do something similar with a radius query: use the region coverer on the area within the radius for s2LevelMin -> s2LevelMax and query for all of those tokens.
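
As a rough illustration of the lookup side, the sketch below uses a plain in-memory object as a stand-in for the database table (token -> entity IDs) and assumes the simple case where s2LevelMin === s2LevelMax, so indexed tokens and query tokens can be matched exactly. `indexEntity` and `lookup` are hypothetical names, and the tokens are made-up placeholders for what the region coverer would return for each polygon and for the query radius.

```typescript
// Sketch only: in-memory stand-in for a table with an indexed s2Token column.
interface TokenIndex {
  [token: string]: string[];
}

// Indexing: store entityId under every token of its covering.
function indexEntity(index: TokenIndex, entityId: string, coveringTokens: string[]): void {
  for (const token of coveringTokens) {
    const ids = index[token] || (index[token] = []);
    if (ids.indexOf(entityId) === -1) {
      ids.push(entityId);
    }
  }
}

// Querying: the equivalent of WHERE s2Token IN (queryTokens), de-duplicated.
function lookup(index: TokenIndex, queryTokens: string[]): string[] {
  const found: string[] = [];
  for (const token of queryTokens) {
    for (const id of index[token] || []) {
      if (found.indexOf(id) === -1) {
        found.push(id);
      }
    }
  }
  return found;
}

// Placeholder tokens, all at one fixed level (assumed, not real coverings).
const tokenIndex: TokenIndex = {};
indexEntity(tokenIndex, "parkA", ["89c25a31", "89c25a33"]);
indexEntity(tokenIndex, "parkB", ["89c25a33", "89c25a35"]);

// Pretend this is the covering of the query radius.
const candidates = lookup(tokenIndex, ["89c25a33", "89c25a37"]);
console.log(candidates);
```

Because a covering only approximates the region, the matches are best treated as candidates and checked with an exact distance or containment test afterwards.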

jkao avatar Dec 17 '24 16:12 jkao