pip-service
Configuration to limit the number of layers loaded on start
Use-cases
Starting up the pip-service can take an extremely long time, even when limiting what gets loaded via importPlace. Sometimes certain layers aren't needed, and it would be nice to disable loading them to improve startup time. From what I've seen, locality and localadmin in particular take much longer than the other layers.
Proposal
Pass the layers you want loaded in here: https://github.com/pelias/pip-service/blob/master/app.js#L95
As far as I can tell, this feature is already supported in wof-admin-lookup:
- https://github.com/pelias/wof-admin-lookup/blob/452932cc692eb5d0cf723d0659c1a99d9da2ed47/index.js#L26-L29
- https://github.com/pelias/wof-admin-lookup/blob/d25718e004ce949cd3a3018fad7f13f27e3da086/src/localPipResolver.js#L14-L17
- https://github.com/pelias/wof-admin-lookup/blob/d25718e004ce949cd3a3018fad7f13f27e3da086/src/pip/index.js#L41-L44
I'm happy to ~~implement this myself~~ file a PR. I just need to know the procedure for updating the config, since it's shared across all pelias projects and this project currently doesn't load it.
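To make the proposal concrete, here is a minimal sketch of what filtering the layer list could look like. Everything in it is an assumption for illustration: `selectLayers`, `parseLayerList`, and the `PIP_LAYERS` environment variable are hypothetical names, not existing pip-service or wof-admin-lookup options, and `DEFAULT_LAYERS` is only an illustrative subset of Who's On First placetypes.

```javascript
// Hypothetical sketch: none of these names come from pip-service itself.
// The list is an illustrative subset of Who's On First placetypes.
const DEFAULT_LAYERS = [
  'neighbourhood', 'borough', 'locality', 'localadmin', 'county',
  'macrocounty', 'region', 'macroregion', 'dependency', 'country'
];

// Return the layers to load: every default layer when nothing is requested,
// otherwise only the requested layers that are actually known.
function selectLayers(requested, available = DEFAULT_LAYERS) {
  if (!Array.isArray(requested) || requested.length === 0) {
    return available.slice();
  }
  const wanted = new Set(requested);
  return available.filter((layer) => wanted.has(layer));
}

// Parse a comma-separated list such as "county,region,country".
function parseLayerList(value) {
  return (value || '')
    .split(',')
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// PIP_LAYERS is an assumed environment variable name, not an existing option.
const layers = selectLayers(parseLayerList(process.env.PIP_LAYERS));
console.log(layers.join(','));
```

With nothing configured, this falls back to loading every layer, so existing deployments would behave as before. Unknown layer names are dropped silently here; a real implementation might prefer to warn.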
Hi @CharlesG-Branch, we have been working on a new system which starts up instantly. Would you be interested in BETA testing it?
If you were to remove locality from the list of layers, this would have a negative effect on quality, since address data would no longer be associated with a locality. Could you please explain more about your specific use-case and why this wouldn't matter?
@missinglink I'd be happy to beta test it & I'm curious how that was accomplished (is there a runtime perf hit?)
In my case I'm deploying this service on its own, without the rest of the pelias stack, as I only need the reverse geocoding component. Only the layers for counties and larger matter for my use-case, so I figured locality would be safe to skip since it's lower in the hierarchy (https://github.com/whosonfirst/whosonfirst-placetypes). Will not loading it impact the accuracy of things higher in the hierarchy?
Well then you're going to love this...
curl -s https://data.geocode.earth/wof/dist/spatial/whosonfirst-data-admin-us-latest.spatial.db.bz2 | lbunzip2 > whosonfirst-data-admin-us-latest.spatial.db
docker run --rm -it -v "${PWD}:/data" -p 3000:3000 pelias/spatial server --db=/data/whosonfirst-data-admin-us-latest.spatial.db
There is a demo on port 3000
Try out these paths locally:
GET /explore/pip#14/37.785240/-122.424624
GET /query/pip?lon=-122.42457937449218&lat=37.78471707419765&role=boundary
GET /query/pip/verbose?lon=-122.42457937449218&lat=37.78471707419765&role=boundary
GET /query/pip/_view/pelias/-122.42457937449218/37.78471707419765
The last of these is a 'reverse compatible' endpoint for this repo, although that's where the BETA comes in. I would appreciate your feedback.
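For anyone scripting against these paths, a small sketch of building the request URLs in Node.js follows. The helper names `pipQueryUrl` and `pipPeliasViewUrl` are hypothetical, and the base URL assumes the server is published on localhost port 3000 as in the docker command above. Only the paths and query parameters listed above are used.

```javascript
// Build a URL for /query/pip or /query/pip/verbose.
// The `role` default mirrors the example requests listed above.
function pipQueryUrl(baseUrl, lon, lat, { verbose = false, role = 'boundary' } = {}) {
  const path = verbose ? '/query/pip/verbose' : '/query/pip';
  const url = new URL(path, baseUrl);
  url.searchParams.set('lon', String(lon));
  url.searchParams.set('lat', String(lat));
  url.searchParams.set('role', role);
  return url.toString();
}

// Build a URL for the pelias-compatible view, which takes lon/lat as path segments.
function pipPeliasViewUrl(baseUrl, lon, lat) {
  return new URL(`/query/pip/_view/pelias/${lon}/${lat}`, baseUrl).toString();
}

// Assumed base URL: the docker command publishes port 3000 on the host.
const base = 'http://localhost:3000';
console.log(pipQueryUrl(base, -122.424579, 37.784717));
console.log(pipQueryUrl(base, -122.424579, 37.784717, { verbose: true }));
console.log(pipPeliasViewUrl(base, -122.424579, 37.784717));
```

Note the argument order: longitude first, then latitude, matching the path layout of the pelias view.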
The magic here is that the data is loaded in mmap mode, so the Linux filesystem cache provides an in-memory LRU cache for the 'hot pages'. You don't need to configure anything, but the more memory you have the moar faster it is. I can explain more if you find it useful.
Wow, the startup time & demo page are incredible. Exposing the localization information is also extremely helpful.
I did get some exceptions for the last two links:
2020-04-09T19:44:59.758Z - info: [geometry] ::ffff:172.17.0.1 - GET /query/pip/_view/pelias/-122.42457937449218/37.78471707419765 HTTP/1.1 500 1018 - 17.145 ms
TypeError: Cannot read property 'split' of null
at rows.forEach.row (/code/server/routes/pip_verbose.js:29:33)
at Array.forEach (<anonymous>)
at Object.module.exports (/code/server/routes/pip_verbose.js:28:8)
at module.exports (/code/server/routes/pip_pelias.js:14:33)
at Layer.handle [as handle_request] (/code/node_modules/express/lib/router/layer.js:95:5)
at next (/code/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/code/node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (/code/node_modules/express/lib/router/layer.js:95:5)
at /code/node_modules/express/lib/router/index.js:281:22
at param (/code/node_modules/express/lib/router/index.js:354:14)
2020-04-09T19:45:11.456Z - info: [geometry] ::ffff:172.17.0.1 - GET /query/pip/verbose?lon=-122.42457937449218&lat=37.78471707419765&role=boundary HTTP/1.1 500 1033 - 23.600 ms
TypeError: Cannot read property 'split' of null
at rows.forEach.row (/code/server/routes/pip_verbose.js:29:33)
at Array.forEach (<anonymous>)
at module.exports (/code/server/routes/pip_verbose.js:28:8)
at Layer.handle [as handle_request] (/code/node_modules/express/lib/router/layer.js:95:5)
at next (/code/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/code/node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (/code/node_modules/express/lib/router/layer.js:95:5)
at /code/node_modules/express/lib/router/index.js:281:22
at Function.process_params (/code/node_modules/express/lib/router/index.js:335:12)
at next (/code/node_modules/express/lib/router/index.js:275:10)
I'll play around with loading and using the full wof dataset later today. Limiting the place IDs loaded (currently done with imports.whosonfirst.importPlace) might still be useful, since it would keep unneeded places from filling up the cache, though the cost of failed lookups may just be higher. I'll have to check.
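Both stack traces earlier in this comment show the same failure: `.split` was read on a null value inside a `rows.forEach` callback in pip_verbose.js. A minimal sketch of that failure mode with a defensive guard is below. The row shape and the `hierarchy` field name are hypothetical, and this is not necessarily how the route was actually fixed.

```javascript
// Hypothetical helper: split a delimited string, tolerating missing values.
function splitOrEmpty(value, separator = ',') {
  // Reading `.split` on null or undefined throws a TypeError,
  // so fall back to an empty list for anything that is not a non-empty string.
  return typeof value === 'string' && value.length > 0
    ? value.split(separator)
    : [];
}

// Hypothetical rows: `hierarchy` stands in for whichever nullable,
// delimited column the route was splitting.
const rows = [
  { id: 1, hierarchy: '85633793,85688637' },
  { id: 2, hierarchy: null } // unguarded, a row like this triggers the TypeError
];

rows.forEach((row) => {
  console.log(row.id, splitOrEmpty(row.hierarchy));
});
```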
Looks like a bug, thanks. It's easily fixed. I'm opening up https://github.com/pelias/spatial/issues/47 for further feedback. Please add any more beta testing notes over there so I can track them in one place.
More download options from our site https://geocode.earth/data
If y'all would like commercial support we'd be happy to supply other data such as OSM and US CENSUS data for your business as seen in our demo https://spatial.demo.geocode.earth/explore/pip
bug resolved in https://github.com/pelias/spatial/pull/48
FWIW I came to this issue after having serious performance issues starting pip-service in development (using the shipped Docker image). Spatial does the trick and starts immediately. I generally find the documentation really confusing on which datasets apply to which products and why, and on the proper way to import them. BUT, with the examples in issues and by reading the code, I was able to make it work. Thanks for making these projects open-source. I would recommend anyone needing PIP to go straight to Spatial.