api icon indicating copy to clipboard operation
api copied to clipboard

do not deduplicate different units at the same address

Open missinglink opened this issue 5 years ago • 3 comments

Multiple units at the same address are currently being deduplicated.

This can lead to situations where the response is showing much fewer results than was requested. In particular this is not a great experience for /v1/reverse but also for the forward-geocoding APIs

One thing to consider before merging this:

  • If the labels we generate don't include the unit number then the results will contain 10+ entries with the exact same label.

So the pro here is that we can show all the units in the building and the con is that these results will dominate results for queries near these multi-unit buildings/campuses.

I really wish there was something analogous to a GROUP BY in elasticsearch, but this is unfortunately not possible.

An alternative solution would be to increase the querySize variable from what it is now (usually double ?size) so that more results are returned and then continue to consider these as duplicates 🤷‍♂️

Thoughts/Feels?

missinglink avatar Jul 15 '20 13:07 missinglink

IMO, this is a complex choice and depends on the use-case.

As you said, having all units of a building will flood the result, especially when your use case is the address only.

In the other hands... When you need the unit... :man_shrugging:

I suppose, for the forward geocoding, when you are looking for the unit, all units should be displayed.

And for the reverse... :man_shrugging: maybe a meta category address:unit or dedicated query parameter ? But it may be a pain for the integration...

Joxit avatar Aug 03 '20 20:08 Joxit

I've demoted this PR from "ready-to-merge" to "draft".

This was originally written to resolve an issue one of our clients was having, they have since solved the issue themselves by performing deduplication of the data before importing it.

As noted by @Joxit, each deployment may wish to configure this differently, so we'd probably need to put it behind a config flag before merging it (if we decide to do so).

missinglink avatar Aug 04 '20 12:08 missinglink

@rowanwins, I would be fine with reopening this PR and pursuing getting it merged if you find it useful.

Since it's a dramatic change for existing users we would have to put it behind a feature flag which defaulted to being off. That means it wouldn't be available on geocode.earth for instance, but would allow developers such as yourself to continue working on unit number functionality in their local environment/own deployments.

If at a later date we could show that this change doesn't make results super noisy then we could make a subsequent PR which changes that feature flag to be on by default, pending a plus one from the core contributors, which would make it available by default on all installations.

missinglink avatar Jan 19 '21 03:01 missinglink