wttr.in
JSON `nearest_area` is one city over
Todo
- [ ] Add `queried_location` to JSON response
Details
A while ago I submitted a PR to add `includeLocation` to the weather query in order to get `nearest_area` in the JSON result. This had been working flawlessly and fantastically up until about a month ago or so.
Now the `areaName` field is consistently one or two cities away instead of the city being queried. For example, `los angeles` gives results for `hillgrove, california`; `new york city` gives `oakland gardens, new york`; `london, uk` gives `cubitt town, tower hamlets, greater london, united kingdom`. A month ago all of these used to return the expected city.
This is disconcerting because I cannot tell whether WWO (wttr's weather backend) simply no longer has access to weather stations in those cities, whether its database look-up is broken in some way, or whether `nearest_area` has been changed to do something else now.
Do you know why this is happening?
By the way, I was looking over the PRs and I saw that somebody had recently submitted a PR to use an IP-location service because the WWO location results have been very inaccurate lately. Instead, I hope we can resolve the WWO location result issue to be accurate once again.
I can confirm this behaviour. Searching for some larger German cities shows the same thing: `München` returns `Strasslach-Dingharting`, some village in the vague surrounding area, and `Karlsruhe` returns `Neuburg Am Rhein`.
Yes, I can confirm this too. I think the problem is in how WWO handles it; I see that it is very inaccurate now. I should report this problem to them, and if it is not fixed or mitigated, we should probably look for a new data provider.
Internal wttr.in location resolution works perfectly well:
$ curl http://localhost:8004/Nuremberg
{"latitude": 49.453872, "timezone": "Europe/Berlin", "longitude": 11.077298, "address": "Nürnberg, Mittelfranken, Bayern, Deutschland"}
$ curl http://localhost:8004/49.45,11.08
{"address": "11, Theatergasse, Altstadt, St. Lorenz, Nürnberg, Bayern, 90402, Deutschland", "latitude": 49.4498919, "longitude": 11.0801459, "timezone": "Europe/Berlin"}
But at the same time:
$ curl wttr.in/Nuremberg?format=j1 | jq .nearest_area[0]
{
"areaName": [
{
"value": "Aurau"
}
],
"country": [
{
"value": "Germany"
}
],
"latitude": "49.250",
"longitude": "11.017",
"population": "0",
"region": [
{
"value": "Bayern"
}
],
"weatherUrl": [
{
"value": ""
}
]
}
We can override the `nearest_area` field of WWO with the wttr.in data, but the real question is whether WWO returns the weather data for the `nearest_area` instead of the area in the query (which would be really bad).
It does look like the weather data is indeed "accurate" for the `nearest_area`. The problem is that it's not the location we searched for.
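One rough way to quantify the mismatch is to compare the coordinates wttr.in resolved for the query with the coordinates reported in `nearest_area`. The sketch below does this for the Nuremberg example, using only the values pasted above. The helper name `haversine_km` and the spherical-Earth radius are assumptions made for illustration; they are not part of wttr.in.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Coordinates returned by the internal location resolution for "Nuremberg".
queried = (49.453872, 11.077298)

# Coordinates reported in nearest_area[0] of the format=j1 response.
nearest_area = {"latitude": "49.250", "longitude": "11.017"}
reported = (float(nearest_area["latitude"]), float(nearest_area["longitude"]))

offset_km = haversine_km(*queried, *reported)
print(f"nearest_area is about {offset_km:.0f} km from the queried location")
```

For this example the offset comes out at a little over 20 km, which matches the "one city over" impression.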
@pragma- I think the only real solution for this problem is to add support for other upstream data sources. We have initial support for a new data source in #532, and I believe more will follow; then we will have a robust solution. Until then, we will always be dependent on a single data source.
Could this also be due to a service rounding coordinates?
When I use `http://wttr.in/51.4976,%20-0.1181` (central London), I get the following search result: `Ort: Lambeth Palace Garden, Lambeth Palace Road, Lambeth, London Borough of Lambeth, London, Greater London, England, SE1 7JU, United Kingdom [51.49704725,-0.11875235545073382]`
If I do the same search with JSON format, like this: `http://wttr.in/51.4976,%20-0.1181?format=j1`, I get a different output. Note that the request coordinates in it have only two decimals.
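As a rough check of how much two-decimal rounding could matter, the sketch below measures how far the central London coordinates from this comment move when rounded to two decimals. The function names are hypothetical, and the distance uses a spherical-Earth approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rounding_shift_km(lat, lon, decimals=2):
    """Distance between a point and the same point with rounded coordinates."""
    return haversine_km(lat, lon, round(lat, decimals), round(lon, decimals))


# Central London coordinates used in the query above.
shift_km = rounding_shift_km(51.4976, -0.1181)
print(f"rounding to two decimals moves the point by {shift_km:.2f} km")
```

Rounding to two decimals changes each coordinate by at most 0.005 degrees, which is well under a kilometre on the ground. That suggests rounding alone would not explain results that land a whole town away, although it could contribute to small shifts.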
I use the forecast module with Bodhi Linux which uses this as a backend, and have the same issue. If I set it to San Jose, California it comes up with Coyote, someplace in the remote surrounding area. I tried entering other various city names around me and Cupertino came up with Austin which is a little bit closer, but no way to get it to actual San Jose that I have found.
Ideally I would enter a postal/zip code, but even if I had to enter latitude/longitude that would be fine. The city name, however, is not working quite right.
Both of those locations (Coyote or Austin) are small obscure places I had not heard of, and had to use google maps to even find them. I reported to Bodhi developers but they pointed me here as an upstream problem, and seems it is affecting others in similar manner, when I read "village in remote surrounding area" for the user near Munich I thought to myself "yep, exactly!".
I've localized the bug pretty well now. As I already wrote before, it is in the data source. I hope they will fix it, because it is a real bug, affecting all their (commercial) customers. If they do not fix it, I have an idea for a workaround, and if that does not help either, the only solution will be to change the data source.
Just for clarity: it is not a bug in wttr.in!
@pragma- @ScientiaEtVeritas @Danfro @enigma9o7
I believe it is fixed now. Could you please check if it works for you?
@chubin It doesn't seem fixed for me. I'm using this endpoint: `http://wttr.in/Karlsruhe?format=j1`. Thank you for looking into this issue!
@ScientiaEtVeritas Yes, it is; at least it seems to work for me (with Karlsruhe too):
$ curl -ks wttr.in/Karlsruhe?format=j1\&nonce=$RANDOM | jq -r .nearest_area[0].areaName[0].value
Carlsruhe
I added `nonce=` here to bypass the caching layer. This shouldn't usually be done, because it generates additional useless load, but it is OK in this case; as soon as the cache entries have expired, it will not be needed here either.
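The same cache-bypass trick can be expressed when building the request URL in a script. This is only a sketch: the function name `wttr_json_url` is hypothetical, and the only parameters it relies on are `format=j1` and `nonce`, both taken from the command above. It builds the URL without sending any request.

```python
import random
from urllib.parse import quote, urlencode


def wttr_json_url(location, bypass_cache=False):
    """Build a wttr.in JSON (format=j1) URL for a location name."""
    params = {"format": "j1"}
    if bypass_cache:
        # A throwaway value makes the URL unique, like $RANDOM in the
        # shell example (bash $RANDOM yields 0..32767).
        params["nonce"] = str(random.randint(0, 32767))
    return f"http://wttr.in/{quote(location)}?{urlencode(params)}"


print(wttr_json_url("Karlsruhe"))
print(wttr_json_url("Karlsruhe", bypass_cache=True))
```

As noted above, bypassing the cache adds load on the service, so it is best kept off by default and used only for one-off checks.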
It does! This is excellent! Thank you so much!
I think the bug is fixed; let's wait for at least one additional acknowledgment (@ScientiaEtVeritas from Fabian maybe?) and close it
Please ignore me if I just don't remember a detail of how the different result formats work. But searching for, say, Leipzig using the general search returns Leipzig as the result. Fine. But using the JSON format returns Stunz, a part of Leipzig. Is that intended? Shouldn't both return the same result, Leipzig?
Please compare those two queries:
http://wttr.in/leipzig?format=j1
Doing the same for München returns München and Gern (a part of München).
Yes, that's true, but the discrepancy shouldn't be too big (if at all). There are some locations indeed (Leipzig is one of them) where reverse GPS resolution (GPS -> Name) returns a little bit different result than the direct resolution (Name -> GPS). As far as I can understand, this comes from the caching mechanisms that are used on the data source side; we can't influence it directly.
As long as it is only slightly off, I think the error can be ignored. If it influences the forecast results, we will need to look for a solution.
@chubin The `nearest_area` field does now appear to be populated with more accurate values, for the most part.
Previously, I was always getting city names that were one city away or so (e.g. "los angeles, california" would display "hillgrove, california"). Every time, consistently. Now I get the expected city information most of the time.
There are still some queries that do not return the expected city name; e.g. "Manhattan, New York" gives "Clason Point, New York" -- which seems to be just slightly outside of Manhattan, according to Google Maps. "Bronx, New York" gives "West Farms, New York".
It is my understanding that the data source gets information about the nearest weather station to a query. It may not always be possible to have a weather station in the exact location. That could explain why it says "Clason Point" and "West Farms" instead of the queried city name.
As long as the `nearest_area` field is accurately representing the correct weather station, I am fine with a discrepancy between the queried location name and the weather results location name. As far as I can tell, the `nearest_area` field is much less broken now. The New York results make me hesitate to say that it is 100% fixed.
Noticed something weird.
If I query for "Bronx" I get "Baychester, New York" with a Lat/Long of 40.86 and -73.84.
If I query for "Bronx, New York" I get "West Farms, New York" with a Lat/Long of 40.85 and -73.88.
Do you know why this happens? I would expect "Bronx" and "Bronx, New York" to both use the same weather station.
Yes, it happens because that is how the location resolution procedure works:
- `Bronx` => Bronx County, NYC, New York, United States of America [40.85703325,-73.8366961598775]
- `Bronx,New York` => The Bronx, Bronx County, New York, United States [40.8466508,-73.8785937]
You can query any other location, and check how it will be resolved, like this:
$ curl wttr.in/~Bronx,New+York | grep ^Location:
This problem (if it is a problem) is not related to the original one, and it is not related to weather data; it happens one step earlier. That's just how the geolocation system works, and I don't see a big problem here. The same could happen if you searched for a location in Google Maps, Apple Maps, or wherever.
The original problem was a real problem though. It is not really because of weather station locations, because the station data is postprocessed, interpolated, etc., but it is still a bug (or caching issue) on the data source level. We can't influence it directly, but as I said, if the problem (at its older scale) reoccurs, we will search for some solution.
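For scripted comparisons, the resolved location line can be split into a name and a coordinate pair. The sketch below assumes the line is plain text of the form `Label: name [lat,lon]`, as in the examples in this thread; the function name `parse_location_line` is hypothetical, and the real output may differ (for example, the label is localized, as in `Ort:`).

```python
import re

# Label, then free-text name, then "[lat,lon]" at the end of the line.
LOCATION_RE = re.compile(
    r"^[^:\n]+:\s*(?P<name>.*?)\s*"
    r"\[(?P<lat>-?\d+(?:\.\d+)?),\s*(?P<lon>-?\d+(?:\.\d+)?)\]\s*$"
)


def parse_location_line(line):
    """Return (name, latitude, longitude), or None if the line does not match."""
    match = LOCATION_RE.match(line.strip())
    if match is None:
        return None
    return match["name"], float(match["lat"]), float(match["lon"])


# Sample line assembled from the resolution result quoted in this thread.
sample = ("Location: The Bronx, Bronx County, New York, United States "
          "[40.8466508,-73.8785937]")
print(parse_location_line(sample))
```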
This indeed does say "The Bronx, Bronx County, New York" as expected! This is what I was expecting the `nearest_area` field to accomplish.
Instead, today, using `curl wttr.in/~Bronx,New+York?format=j1`, we have yet another new location name for "Bronx, New York"! It is now saying "Morrisania, New York". I cannot use the `nearest_area` field to display the names of the locations because they seem to be confusing and inconsistent locations: Baychester, West Farms, Morrisania.
The `nearest_area` field does seem to be much more accurate now, but it does not give a consistent location name for some locations. Would it be possible to add a `location` field to the JSON (`format=j1`) results that will use the `Location:` data from the "normal" results (`curl wttr.in/~Bronx,New+York | grep ^Location:`)?
Yes, it is a good idea; we should probably just add something like `queried_location` to the JSON response. Keep in mind, though, that the data is provided for the lat/long pair in the response, not the lat/long pair in the query! I understand that it sounds strange, but that's how the caching of our data provider works, and it does not look like they are going to fix it. And as I said, the shift is not so big now, much better than before.
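To make the proposal concrete, here is a minimal sketch of what attaching such a field could look like. Everything about the layout is an assumption: only the name `queried_location` comes from this discussion, while the inner keys (`name`, `latitude`, `longitude`) and the helper `add_queried_location` are hypothetical and do not describe how wttr.in implements it.

```python
import copy


def add_queried_location(response, name, latitude, longitude):
    """Return a copy of a j1-style response with a queried_location field.

    The field layout here is hypothetical; it only illustrates keeping the
    queried location next to the provider's nearest_area.
    """
    enriched = copy.deepcopy(response)  # leave the provider data untouched
    enriched["queried_location"] = {
        "name": name,
        "latitude": str(latitude),    # strings, like nearest_area's coordinates
        "longitude": str(longitude),
    }
    return enriched


# Trimmed-down response using the Nuremberg values quoted earlier in the thread.
provider_response = {
    "nearest_area": [
        {"areaName": [{"value": "Aurau"}], "latitude": "49.250", "longitude": "11.017"}
    ]
}

enriched = add_queried_location(
    provider_response,
    "Nürnberg, Mittelfranken, Bayern, Deutschland",
    49.453872,
    11.077298,
)
print(enriched["queried_location"])
```

Keeping both fields side by side lets a client display the name it asked for while still seeing which coordinates the weather data actually refers to.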
`queried_location` sounds great. Should I go ahead and close this issue and open a new issue for `queried_location`, or do you want to keep this one open?
No, you shouldn't; I am going to work on it as part of this issue. I have already extended the original description with this step.