GeoPlanet seems to be dead, you might not know from using this Gem

Open dramalho opened this issue 9 years ago • 17 comments

So, while I can't find proper significant information on this, it seems like the GeoPlanet API endpoints are gone, as in, the hosts are no longer present in the DNS records.

Depending on how you're using this gem, you may or may not have picked up on this error, e.g.

> GeoPlanet::Place.new(2423945)
Yahoo GeoPlanet: GET http://where.yahooapis.com/v1/place/2423945?format=json&appid=xxxxxxxxxxxxxx-
SocketError: getaddrinfo: nodename nor servname provided, or not known

here you would have known, but if you're doing searches, not so much

> GeoPlanet::Place.search("bla")
Yahoo GeoPlanet: GET http://where.yahooapis.com/v1/places.q('bla')?format=json&appid=xxxxxxxxxxxxxx
=> nil
irb(main):008:0>

because all errors get rescued inside the gem!
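To illustrate why the search above comes back as a plain `nil`, here is a minimal sketch of the pattern. This is not the gem's actual code: `FakeClient`, `fetch`, `search_swallowing` and `search_raising` are made-up names, and the DNS failure is simulated rather than triggered over the network.

```ruby
require "socket" # defines SocketError

# Hypothetical stand-in for the gem's HTTP layer; not its real code.
module FakeClient
  # Simulates a lookup against a host that no longer resolves.
  def self.fetch(_url)
    raise SocketError, "getaddrinfo: nodename nor servname provided, or not known"
  end

  # Blanket rescue: any failure, including a dead host, turns into nil.
  def self.search_swallowing(url)
    fetch(url)
  rescue StandardError
    nil
  end

  # No rescue: the outage stays visible to the caller.
  def self.search_raising(url)
    fetch(url)
  end
end

url = "http://where.yahooapis.com/v1/places.q('bla')"
p FakeClient.search_swallowing(url)

begin
  FakeClient.search_raising(url)
rescue SocketError => e
  puts "visible failure: #{e.class}"
end
```

With the blanket rescue, "no results" and "the host is gone" look exactly the same from the outside.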

ANYHOO... seems like we can say our goodbyes to Yahoo! GeoPlanet. I'm posting this as a placeholder for other people who might be as confused as I was, and maybe as a place to discuss sensible alternatives?

¯_(ツ)_/¯

dramalho avatar Aug 31 '16 11:08 dramalho

@dramalho I found this topic through the GitHub global search. Looks like Twitter has an alternative endpoint that you can query to get the WOEID result: https://dev.twitter.com/rest/reference/get/trends/closest

dmitry avatar Sep 02 '16 14:09 dmitry

Of sorts. They don't seem to return bounding boxes, and from the documentation they're pushing you to where.yahooapis.com URLs, which are dead now :) . It would be interesting to know how this is affecting Twitter clients.

dramalho avatar Sep 02 '16 14:09 dramalho

As far as I know, the latest dump of the WOE data is here: https://archive.org/details/geoplanet_data_7.10.0.zip

It can be downloaded, imported and queried internally.

But the data is outdated, and as an alternative I can see OSM + GeoNames data.

dmitry avatar Sep 02 '16 15:09 dmitry

There seems to be a PHP solution for querying the data as an SQL dataset: https://github.com/twbell/GPLplanet

dmitry avatar Sep 02 '16 15:09 dmitry

I've been trying OSM via the Overpass QL stuff; there's a bit of a learning curve there.

Geonames (gem here) is a more straight-to-the-point GeoPlanet replacement, though the children endpoint is somewhat limited and the data seems poorer.

dramalho avatar Sep 02 '16 15:09 dramalho

Digging into Overpass + MapIt: http://global.mapit.mysociety.org/ At the moment MapIt is the best replacement for GeoPlanet I've found out there.

Also this one seems interesting: https://github.com/whosonfirst-data/whosonfirst-data

Especially the sources listed in their README file:

  • gn - GeoNames (CC-BY)
  • gp - GeoPlanet Where on Earth, aka woe (CC-BY)
  • ne - Natural Earth (Public Domain)
  • oa - OurAirports (Public Domain)
  • qs - Quattroshapes (CC-BY)
  • zs - Zetashapes (CC-BY)

PS. https://github.com/whosonfirst/whosonfirst-placetypes list of placetypes

dmitry avatar Sep 02 '16 15:09 dmitry

Looks like WOE is available via http://query.yahooapis.com/v1

dmitry avatar Sep 02 '16 20:09 dmitry

Uh oh...

I can't see that http://query.yahooapis.com/v1 is available either. Looks like Yahoo has definitively decided to turn off the API.

If integration with this service is absolutely necessary, the best alternative I can think of is to use the archived data (under a Creative Commons license), as @dmitry suggested. Unfortunately, this would mean quite a big refactor of the gem. Reading the TSV files into memory when they occupy almost a gigabyte is obviously not a good solution. Ideally, this data should be imported into at least a sqlite3 database that the gem could ship with, so the info can be queried efficiently at runtime. The GPLplanet PHP library sounds like a good reference for the import process and querying.

I'm afraid I don't have enough time at the moment for such a big refactor, but I will consider it in the future. Maybe one of you can take the opportunity to rewrite this as a new gem?
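As a rough sketch of the "don't read it all into memory" part only: the helper below streams a tab-separated file and hands rows over in small batches, which an importer could then write to a database (the sqlite3 side is deliberately left out). `each_tsv_batch` is a hypothetical name, the header-row-plus-quoted-values layout is an assumption about the dump, and the sample rows are made up.

```ruby
require "tempfile"

# Streams a tab-separated file and yields rows in fixed-size batches, so the
# whole file never has to sit in memory. Each row is a hash keyed by the
# header line. Surrounding double quotes on values are stripped.
def each_tsv_batch(path, batch_size: 1000)
  return enum_for(:each_tsv_batch, path, batch_size: batch_size) unless block_given?

  header = nil
  batch = []
  File.foreach(path, chomp: true) do |line|
    fields = line.split("\t", -1).map { |f| f.delete_prefix('"').delete_suffix('"') }
    if header.nil?
      header = fields # first line names the columns
      next
    end
    batch << header.zip(fields).to_h
    if batch.size >= batch_size
      yield batch
      batch = []
    end
  end
  yield batch unless batch.empty? # flush the final partial batch
end

# Made-up sample data, only to show the call shape.
Tempfile.create("places") do |f|
  f.write("WOE_ID\tName\n1\t\"Earth\"\n2\t\"Lisbon\"\n3\t\"Dublin\"\n")
  f.flush
  each_tsv_batch(f.path, batch_size: 2) { |rows| p rows.map { |r| r["Name"] } }
end
```

Each yielded batch would map naturally onto one transaction of inserts, which keeps memory flat regardless of file size.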

carlosparamio avatar Sep 05 '16 06:09 carlosparamio

@carlosparamio it's up, but you need to pass a correct query:

https://developer.yahoo.com/yql/console/?q=select%20%2A%20from%20weather.forecast%20where%20woeid%20in%20(select%20woeid%20from%20geo.placefinder%20where%20text%3D%2217.4186947%2C78.5425238%22%20and%20gflags%3D%22R%22)#h=select+*+from+geo.places+where+text%3D%22sfo%22

http://query.yahooapis.com/v1/public/yql?q=select * from geo.places where text ='hamburg*'
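Since the raw query above contains spaces, quotes and a `*`, it needs to be encoded before being sent. A minimal sketch of building such a URL with Ruby's standard library; `yql_places_url` is a hypothetical helper, and passing `format=json` is an assumption carried over from the old request URLs quoted earlier in this thread.

```ruby
require "uri"

# Endpoint as quoted in this thread.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"

# Builds an encoded YQL request URL for a geo.places text search.
# The text is interpolated as-is, so it must not contain single quotes.
def yql_places_url(text)
  query = "select * from geo.places where text='#{text}'"
  "#{YQL_ENDPOINT}?#{URI.encode_www_form(q: query, format: 'json')}"
end

puts yql_places_url("hamburg*")
```

This only builds the URL; it makes no request, so it says nothing about whether the endpoint answers.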

dmitry avatar Sep 05 '16 08:09 dmitry

Ok, that's a different API then, which uses YQL instead of the "places" endpoint. It should certainly be easier to adapt the gem to this other querying schema.

carlosparamio avatar Sep 05 '16 08:09 carlosparamio

so, re: OSM

Great service, and the query language is mega powerful, but the data / metadata contents and structure vary way too much between places. Let me give you an example:

This is one (very crude, simple) way to get child elements from London, i.e., with this you could easily get suburbs and similar administrative areas based on the main "London" relation. You can do the same for, say, Lisbon (Portugal). Mind you, this query can be further optimized, but bear with me.

Now, other places are harder. Take Dublin, for instance, and straight away you'll notice it's missing the child nodes we have for the other places; turns out the data is structured differently in Dublin (possibly all of Ireland). We can do something else instead and run a query based on the bounding box of Dublin, which I'll link to as soon as I'm no longer being throttled by the Overpass server. The TL;DR is that we do get a bunch of stuff, but specifically for Dublin the suburb names suck, as they were added from electoral information and all the names have ED in them, which makes automating this sort of thing for our purposes pretty much sucky sucky. Also, the hierarchy is a bit ambiguous: we don't know for a fact that a given location in Dublin's bounding box is actually in Dublin http://global.mapit.mysociety.org/point/4326/-6.42676900000004,53.397187.html.

:P

I'll keep digging, just want to add my observations into the bucket here :)

dramalho avatar Sep 05 '16 14:09 dramalho

also checked the static geoplanet database, no bounding boxes, no lat/lon for places, no actual geographical information!

dramalho avatar Sep 05 '16 15:09 dramalho

re: Using YQL, not everything is available, as the YQL tables seem to make calls to Yahoo's own endpoints; for instance, querying a given place's descendants fails:

https://developer.yahoo.com/yql/console/?q=select%20*%20from%20geo.places.descendants%20where%20ancestor_woeid=%27560743%27&debug=true

"url": {
    "error": "Connect Failure",
    "execution-start-time": "3",
    "execution-stop-time": "11",
    "execution-time": "8",
    "content": "http://wws.geotech.yahooapis.com/v1/place/560743/descendants;start=0;count=1000"
   },

dramalho avatar Sep 05 '16 15:09 dramalho

@dramalho thanks for sharing!

PS. I'm currently working on a service that responds with the hierarchy of boundaries from the OSM source (through the Overpass API). Stay tuned, I will post a link to the application source here (powered by Sinatra and RGeo).

dmitry avatar Sep 05 '16 19:09 dmitry

@dmitry yep, like I said, from the anecdotal evidence I gathered you'll need at least two strategies: recursively fetching the descendants (and ascendants) OR finding relations that fit a given bounding box. Unless there's a better way of course, I have very early-stage knowledge of OSM and Overpass QL :)
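For the bounding-box strategy, a query can be put together along these lines. This is only a sketch that builds the Overpass QL text and sends nothing: `admin_relations_in_bbox` is a hypothetical helper, the coordinates are a rough illustrative box around Dublin rather than an official extent, and filtering on `boundary=administrative` is an assumption about how the areas of interest are tagged.

```ruby
# Builds an Overpass QL query for administrative boundary relations inside a
# bounding box. Overpass expects the order south, west, north, east.
# The optional admin_level narrows the result to one level of the hierarchy.
def admin_relations_in_bbox(south:, west:, north:, east:, admin_level: nil)
  level = admin_level ? %(["admin_level"="#{admin_level}"]) : ""
  bbox = [south, west, north, east].join(",")
  <<~QL
    [out:json][timeout:25];
    relation["boundary"="administrative"]#{level}(#{bbox});
    out tags;
  QL
end

# Illustrative coordinates only.
puts admin_relations_in_bbox(south: 53.2, west: -6.5, north: 53.5, east: -6.0)
```

As noted above, a relation intersecting the box is not necessarily inside the city, so results from this kind of query still need a hierarchy check.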

dramalho avatar Sep 05 '16 20:09 dramalho

@dramalho I've finished working on the API service that loads and processes information from the Overpass API: https://github.com/dmitry/geo_pointer

Comments are welcome and would be helpful.

dmitry avatar Sep 06 '16 19:09 dmitry

@dmitry I'll take a look for sure

dramalho avatar Sep 07 '16 10:09 dramalho