
Prototyping Gabbar for highway features

Open bkowshik opened this issue 7 years ago • 9 comments

A popular problem in machine learning is dogs vs cats: given a picture, predict whether it shows a dog or a cat. With that as my first experience of machine learning, I kept thinking that classifying changesets as good or problematic was a similar problem. Today, though, I did an exercise where I tried to identify a single attribute of a changeset that makes it good or problematic. I started with:

  • https://osmcha.mapbox.com/49563062/
  • highway=residential is modified to highway=unclassified

The following questions came to mind:

  • What could be the source of knowledge for this modification?
  • Isn't residential better than unclassified? I mean, something is better than nothing, right?
  • At version 15, this is quite a mature feature. So, is that alright?
  • What is the length of the highway? Should shorter ones be residential and longer ones unclassified?
  • Why is source=google maps? Really?
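Several of these questions map onto attributes that could be read straight off the feature. A rough sketch of that, assuming the feature arrives as a plain dict with `version`, `tags` and a list of `(lon, lat)` `coordinates` (this input shape and the helper names are hypothetical, not Gabbar's actual format):

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def way_length_m(coords):
    """Sum the segment lengths of a way given as [(lon, lat), ...]."""
    return sum(haversine_m(*a, *b) for a, b in zip(coords, coords[1:]))


def highway_attributes(feature):
    """Collect the attributes the questions above ask about (hypothetical input shape)."""
    tags = feature.get("tags", {})
    return {
        "highway": tags.get("highway"),
        "version": feature.get("version"),   # how mature is the feature?
        "source": tags.get("source"),        # where did the knowledge come from?
        "length_m": way_length_m(feature.get("coordinates", [])),
    }
```

Whether length actually separates residential from unclassified roads is still an open question; the sketch only makes the attribute available for inspection.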

From https://wiki.openstreetmap.org/wiki/Key:highway

  • highway=unclassified

The least important through roads in a country's system – i.e. minor roads of a lower classification than tertiary, but which serve a purpose other than access to properties. Often link villages and hamlets.

  • highway=residential

Roads which serve as an access to housing, without function of connecting settlements.

From https://osmlab.github.io/osm-deep-history/#/way/103217436

  • The feature has mostly been highway=unclassified since creation in 2011.

Looking deeper into other changesets where highway=residential is modified to highway=unclassified, I found a user, Порфирий, who has lots of changesets with the same behavior. Interestingly, the user who added highway=residential is Порфирий too.

  • https://www.openstreetmap.org/user/Порфирий/history
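This kind of lookup could be automated rather than done by eye. A minimal sketch, assuming each feature's history is available as a list of version dicts with `version`, `user` and `tags` keys (a hypothetical shape, and both function names are made up for illustration):

```python
from collections import Counter


def tag_transitions(history, key="highway"):
    """Yield (user, old_value, new_value) each time `key` changes between versions."""
    ordered = sorted(history, key=lambda v: v["version"])
    for prev, curr in zip(ordered, ordered[1:]):
        old, new = prev["tags"].get(key), curr["tags"].get(key)
        if old != new:
            # The user of the later version is the one who made the change.
            yield curr["user"], old, new


def count_transitions_by_user(histories, old, new, key="highway"):
    """Count, per user, how often they changed `key` from `old` to `new`."""
    counts = Counter()
    for history in histories:
        for user, before, after in tag_transitions(history, key):
            if (before, after) == (old, new):
                counts[user] += 1
    return counts
```

Calling `count_transitions_by_user(histories, "residential", "unclassified").most_common(5)` would then surface the users who make this particular change most often.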

Eureka!

When a highway modification has so many questions to answer and attributes to look at, what will the scale be when we look at all 26 primary tags together? What about features that don't have any primary tags? Too many questions! Too many attributes! Right?

  • This does not look like a traditional cats vs dogs problem. It is something a little different.
  • How about we try something different? How about we build one machine learning model for each object type?
  • What would it look like to have a model trained on highways that classifies whether a new or modified highway is a :thumbsup: or a :thumbsdown:?
  • Another trained on buildings, another on water bodies, etc., each knowing what a good and a problematic feature of its type looks like?
  • Is this it?
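To make the one-model-per-object-type idea concrete, here is a minimal sketch of the routing only. Everything in it is an assumption for illustration: `PRIMARY_KEYS` is a small subset of the primary tags, the sample shape is hypothetical, and `MajorityModel` is a trivial placeholder standing in for whatever real classifier would be trained per type.

```python
from collections import Counter, defaultdict

# Illustrative subset only, not the full list of primary tags.
PRIMARY_KEYS = ("highway", "building", "natural")


def primary_key(tags):
    """Return the first primary key present in the tags, or None."""
    for key in PRIMARY_KEYS:
        if key in tags:
            return key
    return None


class MajorityModel:
    """Placeholder model: always predicts the most common training label."""

    def fit(self, samples, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, sample):
        return self.label_


class PerTypeClassifier:
    """Train one model per primary key and route each sample to its model."""

    def __init__(self, model_factory=MajorityModel):
        self.model_factory = model_factory
        self.models = {}

    def fit(self, samples, labels):
        grouped = defaultdict(lambda: ([], []))
        for sample, label in zip(samples, labels):
            xs, ys = grouped[primary_key(sample["tags"])]
            xs.append(sample)
            ys.append(label)
        self.models = {k: self.model_factory().fit(xs, ys) for k, (xs, ys) in grouped.items()}
        return self

    def predict(self, sample):
        model = self.models.get(primary_key(sample["tags"]))
        # None means no model was trained for this object type.
        return model.predict(sample) if model else None
```

Features without any primary tag fall under the `None` key here, so they get their own model only if such samples appear in the training data; how to treat them properly is still an open question.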

cc: @anandthakker @geohacker @batpad

bkowshik • Jun 16 '17 05:06