gabbar
Prototyping Gabbar for highway features
One of the popular problems in machine learning is dogs vs cats: given a picture, predict whether it shows a dog or a cat. Coming from this initial experience with machine learning, I kept thinking that classifying changesets as good or problematic was a similar problem. But today I did an exercise where I tried to identify one attribute of a changeset that makes it good or problematic. I started with:
- https://osmcha.mapbox.com/49563062/
- highway=residential is modified to highway=unclassified
[Image: screen shot 2017-06-16 at 9 15 25 am]
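The attribute in question, a changed highway value, can be pulled out by comparing the tags of two versions of a feature. Here is a minimal sketch in Python, assuming the tags of each version are available as plain dictionaries. The function name `tag_changes` and the sample tag sets are hypothetical and are not taken from Gabbar or from the actual way:

```python
def tag_changes(old_tags, new_tags):
    """Return {key: (old_value, new_value)} for every tag that differs.

    A value of None means the tag is absent in that version.
    """
    changes = {}
    for key in sorted(set(old_tags) | set(new_tags)):
        old, new = old_tags.get(key), new_tags.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical tag sets for two consecutive versions of a way.
previous = {"highway": "residential", "name": "Example Street"}
current = {"highway": "unclassified", "name": "Example Street", "source": "google maps"}

changes = tag_changes(previous, current)
print(changes)
```

Unchanged tags such as the name are left out, so only the modified or newly added keys remain as candidate attributes.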
The following questions came to mind:
- What could be the source of knowledge to modify?
- Isn't residential better than unclassified; I mean something is better than nothing right?
- At version 15, this is quite a mature feature. So, is that alright?
- What is the length of the highway; smaller should be residential and longer unclassified?
- Why is source=google maps? Really?
From https://wiki.openstreetmap.org/wiki/Key:highway:
- highway=unclassified: The least most important through roads in a country's system – i.e. minor roads of a lower classification than tertiary, but which serve a purpose other than access to properties. Often link villages and hamlets.
- highway=residential: Roads which serve as an access to housing, without function of connecting settlements.
From https://osmlab.github.io/osm-deep-history/#/way/103217436:
- The feature has mostly been highway=unclassified since creation in 2011.
[Image: screen shot 2017-06-16 at 9 19 59 am]
Looking deeper into other changesets where a highway=residential gets modified into highway=unclassified, I find this user, Порфирий, who has lots of changesets with the same behavior. Interestingly, the user who added highway=residential is Порфирий too.
- https://www.openstreetmap.org/user/Порфирий/history
[Image: screen shot 2017-06-16 at 9 30 27 am]
Eureka!
When a highway modification has so many questions to answer and attributes to look at, what will the scale be when we look at all 26 primary tags together? What about features that don't have any primary tags? Too many questions! Too many attributes! Right?
- This does not look like a traditional cats vs dogs problem. It is a little something else.
- How about we try something different? How about we build one machine learning model for each object type?
- How would it look if there were a model trained on highways to classify whether a new/modified highway is a :thumbsup: or a :thumbsdown:?
- Another trained on buildings, another on water bodies, etc., each knowing what a good feature of its type looks like and what a problematic one looks like?
- Is this it?
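To make the one-model-per-object-type idea concrete, here is a rough sketch in Python of how a feature could be routed to a model based on its primary tag. Everything in it is an assumption for illustration: the names `PRIMARY_KEYS`, `MODELS` and `classify` are hypothetical, the key list is a small subset rather than all 26 primary tags, and the "models" are placeholder rules standing in for trained classifiers:

```python
# Illustrative subset of primary tag keys (assumption, not the full list).
PRIMARY_KEYS = ("highway", "building", "natural")


def primary_key(tags):
    """Return the first primary key present in the tags, or None."""
    for key in PRIMARY_KEYS:
        if key in tags:
            return key
    return None


def highway_model(feature):
    # Placeholder rule standing in for a trained highway classifier:
    # flag highways whose source tag mentions Google.
    source = feature["tags"].get("source", "").lower()
    return "problematic" if "google" in source else "good"


def building_model(feature):
    # Placeholder standing in for a trained building classifier.
    return "good"


# One model per object type, keyed by primary tag.
MODELS = {"highway": highway_model, "building": building_model}


def classify(feature):
    """Route the feature to the model for its object type."""
    model = MODELS.get(primary_key(feature["tags"]))
    if model is None:
        # No primary tag, or no model trained for this type yet.
        return "unknown"
    return model(feature)


way = {"tags": {"highway": "unclassified", "source": "google maps"}}
print(classify(way))
```

Features without a primary tag, or with one that has no model yet, fall through to "unknown", which leaves the open question above about such features visible rather than hidden.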
cc: @anandthakker @geohacker @batpad