
Autosplit large queries

Open Mashin6 opened this issue 2 years ago • 3 comments

Currently, if a query times out or runs out of memory, the request is terminated. Given that the amount of data in OSM is growing, it might be worth thinking about how to help users in these situations. One possibility: if a query fails, the bbox is split in the middle into four equal rectangles, and each is submitted one after another. If any of them fails, it is recursively split until it is small enough for the query to retrieve data successfully.
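The recursive splitting described above can be sketched roughly as follows. This is a language-agnostic illustration in Python, not osmdata code: `QueryFailed`, `split_bbox`, `fetch_with_autosplit`, and the `fetch` callable are all hypothetical names, and the `(min_lon, min_lat, max_lon, max_lat)` ordering and the `max_depth` cap are assumptions made for the example.

```python
from typing import Callable, List, Tuple, TypeVar

# Assumed ordering for this sketch: (min_lon, min_lat, max_lon, max_lat).
BBox = Tuple[float, float, float, float]
T = TypeVar("T")


class QueryFailed(Exception):
    """Hypothetical error standing in for a timeout or out-of-memory response."""


def split_bbox(bbox: BBox) -> List[BBox]:
    """Split a bbox at its midpoint into four equal rectangles."""
    min_lon, min_lat, max_lon, max_lat = bbox
    mid_lon = (min_lon + max_lon) / 2
    mid_lat = (min_lat + max_lat) / 2
    return [
        (min_lon, min_lat, mid_lon, mid_lat),  # south-west
        (mid_lon, min_lat, max_lon, mid_lat),  # south-east
        (min_lon, mid_lat, mid_lon, max_lat),  # north-west
        (mid_lon, mid_lat, max_lon, max_lat),  # north-east
    ]


def fetch_with_autosplit(
    bbox: BBox, fetch: Callable[[BBox], T], max_depth: int = 4
) -> List[T]:
    """Try the whole bbox first; on failure, recurse into its four quadrants.

    `fetch` is a caller-supplied function (hypothetical here) that runs one
    query and raises QueryFailed when the server gives up. `max_depth` caps
    the recursion so a persistent failure cannot split forever.
    """
    try:
        return [fetch(bbox)]
    except QueryFailed:
        if max_depth == 0:
            raise
        results: List[T] = []
        for part in split_bbox(bbox):
            # Quadrants are submitted one after another, not in parallel.
            results.extend(fetch_with_autosplit(part, fetch, max_depth - 1))
        return results
```

Submitting the quadrants sequentially and capping the depth keeps the number of requests bounded; a real implementation would probably also want a pause between requests.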

Similar to the approach here: https://github.com/ZeLonewolf/osm-overpass-scripts/blob/7bf5ff10948f50ee5f1e33b08a9d0852274094fb/get_tag_density_map.sh#L139

Notes:

  • It might be good to ask the user before proceeding with bbox splitting.
  • Needs a way to handle possible duplicates retrieved at the bbox borders.

Mashin6, Nov 30 '21 02:11

Thanks @Mashin6, I've actually got code elsewhere that does exactly that. The handling of duplicates is already built in via the c operator; you just have to expand each bbox part so that there is some non-zero overlap, and then combining everything with c will automatically de-duplicate the results. The real issue here (in my opinion at least) is that enabling this would extend an open invitation for people to really start abusing the overpass server. So my interim suggestion would be, at least initially, to add an extra vignette on the general procedure. Would you be interested in doing that?
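To illustrate the idea of overlapping parts plus de-duplication, here is a minimal, language-agnostic sketch in Python. It is not how the c operator works internally; `expand_bbox`, `merge_unique`, and the `margin` parameter are hypothetical names for this example, and the elements are assumed to be dictionaries carrying the `type` and `id` fields that OSM elements have.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed ordering for this sketch: (min_lon, min_lat, max_lon, max_lat).
BBox = Tuple[float, float, float, float]


def expand_bbox(bbox: BBox, margin: float) -> BBox:
    """Grow a bbox by `margin` degrees on every side so neighbours overlap."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return (min_lon - margin, min_lat - margin, max_lon + margin, max_lat + margin)


def merge_unique(batches: Iterable[Iterable[dict]]) -> List[dict]:
    """Combine per-bbox results, keeping the first copy of each element.

    OSM ids are unique only within an element type, so the key is the
    (type, id) pair rather than the id alone.
    """
    seen: Dict[Tuple[str, int], dict] = {}
    for batch in batches:
        for element in batch:
            seen.setdefault((element["type"], element["id"]), element)
    return list(seen.values())
```

For example, merging one part containing node 1 and way 1 with a neighbouring part containing node 1 and node 2 leaves three elements, with node 1 kept once.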

mpadge, Nov 30 '21 08:11

I've never written a vignette, but I could give it a try.

By general procedure, do you mean how to split a bbox, run several separate queries, and then manually merge the results into one?

A solution to the extra server load could be restricting this type of querying to the kumi OP instance only. They don't really have any restrictions on usage. As a warning, kumi has broken attic data, so it is not suitable for queries that contain datetime = or datetime2 =

Mashin6, Dec 04 '21 06:12

@Mashin6 yes, a vignette on splitting a bbox, running separate queries, and merging results is exactly what I mean. I've started a generic vignette template in #262; feel free to extend it however you like. If you do contribute, please make sure you add your name both to the author field of that vignette and to the DESCRIPTION file as ctb. Thanks!

mpadge, Jan 03 '22 13:01