osm2pgrouting Question about how to load alternate names

I am not sure if this is possible, but I want to load alternate names from the way tags in the OSM file.

For example, if I load the California data into postgres using osmosis, I see:

select id,tags from ways where id = 10676898;
    id    | tags
----------+------
 10676898 | "name"=>"South Spruce Road", "name_1"=>"14th Avenue East", "highway"=>"secondary", "tiger:cfcc"=>"A41", "tiger:tlid"=>"202821174", "tiger:county"=>"Tulare, CA", "tiger:source"=>"tiger_import_dch_v0.6_20070809", "tiger:reviewed"=>"no", "tiger:zip_left"=>"93221", "tiger:name_base"=>"Spruce", "tiger:name_type"=>"Rd", "tiger:separated"=>"no", "tiger:name_base_1"=>"14th", "tiger:name_type_1"=>"Ave", "tiger:name_direction_prefix"=>"S", "tiger:name_direction_suffix_1"=>"E"
(1 row)

Way 10676898 is called "South Spruce Road", but it also has an alternate name of "14th Avenue East". There are many examples like this (especially in Tulare County, CA for some reason).

Anyway, with osmosis and with ogr2ogr, all of the tags can be examined, whether or not they are loaded into a distinct column like "name". They use the hstore column type. I find this very useful when looking for streets. Using the above example, my source data might have either 14th Ave or Spruce Rd. If I can't find the one, I can also check through alternate names in the tags.
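
For illustration, the kind of lookup described above might be sketched as follows. It assumes the osmosis-loaded `ways` table shown earlier (an `id` column and an hstore `tags` column) and that the hstore extension is installed; the search strings are only the example names from this post.

```sql
-- Sketch: search the primary name and the first alternate name.
-- Assumes ways(id, tags hstore) as loaded by osmosis.
SELECT id,
       tags -> 'name'   AS name,
       tags -> 'name_1' AS alt_name
FROM ways
WHERE (tags -> 'name')   ILIKE '%Spruce%'
   OR (tags -> 'name_1') ILIKE '%14th%';
```

A missing key simply yields NULL from `->`, so ways without a `name_1` tag are matched on `name` alone.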

Is there a way to get osm2pgrouting to store all of the tags?

Sep 16 '15 19:09 jmarca

To get the rest of the tags, you can always join against the original data table using the osm_id. Importing all tags into the routing table would make for a bad database design.
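
A sketch of such a join, under stated assumptions: the osmosis import is assumed to live in a separate schema, hypothetically named `osmosis` here, so that its `ways` table does not clash with the `ways` table created by osm2pgrouting. `osm_id` is the osm2pgrouting column that holds the original way id.

```sql
-- Sketch: fetch the alternate name for routing edges of one OSM way.
-- "osmosis" is a hypothetical schema name for the raw osmosis import.
SELECT r.gid,
       r.name,
       o.tags -> 'name_1' AS alt_name
FROM public.ways  AS r
JOIN osmosis.ways AS o ON o.id = r.osm_id
WHERE r.osm_id = 10676898;
```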

Sep 16 '15 21:09 cvvergara

So you mean run the import step twice? Once using (say) osmosis, and then again with osm2pgrouting?

The reason I asked the question is that I was actually trying to avoid doing that.

I'm not trying to be argumentative, but I'm not sure how adding one more column "tags" of type "hstore" will create a bad design. As of the latest osm2pgrouting, I see this in the "ways" table:

crs_small_routing=# \d ways
                                        Table "public.ways"
      Column       |           Type            |                     Modifiers                      
-------------------+---------------------------+----------------------------------------------------
 gid               | bigint                    | not null default nextval('ways_gid_seq'::regclass)
 class_id          | integer                   | not null
 length            | double precision          | 
 length_m          | double precision          | 
 name              | text                      | 
 source            | bigint                    | 
 target            | bigint                    | 
 x1                | double precision          | 
 y1                | double precision          | 
 x2                | double precision          | 
 y2                | double precision          | 
 cost              | double precision          | 
 reverse_cost      | double precision          | 
 cost_s            | double precision          | 
 reverse_cost_s    | double precision          | 
 rule              | text                      | 
 one_way           | integer                   | 
 maxspeed_forward  | integer                   | 
 maxspeed_backward | integer                   | 
 osm_id            | bigint                    | 
 source_osm        | bigint                    | 
 target_osm        | bigint                    | 
 priority          | double precision          | default 1
 the_geom          | geometry(LineString,4326) | 
Indexes:
    "ways_pkey" PRIMARY KEY, btree (gid)
    "ways_gdx" gist (the_geom)
    "ways_source_idx" btree (source)
    "ways_source_osm_idx" btree (source_osm)
    "ways_target_idx" btree (target)
    "ways_target_osm_idx" btree (target_osm)

Sep 16 '15 22:09 jmarca

Suppose you have a street that has 1000 intersections. Then 999 segments would be created for routing, and the information that could be stored in one row of a table would be repeated in 999 rows of the routing table. We have been discussing whether an hstore column for the rest of the information would add any value to the routing functionality. For example,

pgr_dijkstra('select id, source, target, cost, reverse_cost from edge_table', 20, 30)

does not even require the name of a segment, nor the geometry of the topology. The name does not add value to the routing functionality, and the geometry is usually only used to generate the topology. The focus of the routing table is to provide the information necessary for routing, not for rendering.
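
Applied to the `ways` table that osm2pgrouting creates, the same call might look like the sketch below. It assumes a pgRouting version that accepts the `pgr_dijkstra(edges_sql, start_vid, end_vid)` form; the vertex ids 20 and 30 are placeholders carried over from the example above.

```sql
-- Sketch: only five columns of the routing table are read by the query.
SELECT *
FROM pgr_dijkstra(
    'SELECT gid AS id, source, target, cost, reverse_cost FROM ways',
    20, 30
);
```

Neither `name` nor `the_geom` appears in the inner query, which is the point being made about what the routing table needs.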

Sep 16 '15 22:09 cvvergara

I would agree with Vicky on this point. When I load data from other sources, I keep a table of the raw data with all the original tags and then create a topology table that I use for routing. This is the same basic design that osm2pgrouting uses for the topology. Then, when I need the other data associated with a result, or want to filter the topology edges based on some data that is not in the topology table, a simple join based on the edge id is fast and easy.

Sep 17 '15 00:09 woodbri

I'm torn a bit between the two camps. I thought osm2pgrouting loaded the planet ways, but it seems it doesn't. It would be nice if it brought in the tags as a separate table -- which is what I originally thought osm_way_tags was for, but it appears that table doesn't hold them. I think that would largely solve the issue, because then, as you said, you'd just need to do a join between the osm_ways and the pgrouting ways to get the tags. It's not that elegant to have to run two separate routines that scan the same dataset.

Sep 17 '15 02:09 robe2

I would be OK with that approach, i.e., adding an option to create an additional separate table to avoid having to run two separate imports. It makes sense from the point of view that most people who do an import for routing are probably building a larger application and will need the other information after the routing is done. And it makes sense to put it in a separate table so the topology table stays small and fast for routing and caching pages.

Sep 17 '15 03:09 woodbri

Yep, that was what I was thinking, and for ease of use, so people only need one command line to get the full dataset for their app :)

Sep 17 '15 03:09 robe2

Actually, I'm using the extra naming variants in order to find the "source" and "target" in the first place. I'm not doing any rendering at all.

Second, a normalized database and join tables is fine by me and "the way things ought to be done" etc etc. I just don't like having to run two different OSM ingesting programs that might do different things and/or build duplicate tables. I've only just started using pgrouting, and I don't (yet) know how to follow @woodbri's approach of creating a topology table for routing.

I guess what I was looking for was a config flag to say "while you're scanning the OSM data, hey by the way, load all this other junk too" ...probably best in a separate tags table with an intervening join table.
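
One possible shape for such a layout, as a sketch only: `osm_way_tags_all` is a hypothetical table name, not something osm2pgrouting creates, and the `osm_id` column already present in the routing `ways` table plays the role of the join.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- Hypothetical table: one row per original OSM way, with all of its tags.
CREATE TABLE osm_way_tags_all (
    osm_id bigint PRIMARY KEY,
    tags   hstore NOT NULL
);

-- Find routing edges, and their source/target vertices, by any name variant.
SELECT w.gid, w.source, w.target, w.name,
       t.tags -> 'name_1' AS alt_name
FROM ways AS w
JOIN osm_way_tags_all AS t ON t.osm_id = w.osm_id
WHERE (t.tags -> 'name')   ILIKE '%Spruce%'
   OR (t.tags -> 'name_1') ILIKE '%14th%';
```

Keeping the tags keyed by the original way id means they are stored once per way rather than once per split segment.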

Sep 17 '15 03:09 jmarca

@jmarca my thoughts exactly. Why ingest twice, especially since osm2pgrouting is scanning the tables already? It will be much faster if it does both, similar to what it does with the osm_nodes table.

Sep 17 '15 07:09 robe2

@jmarca I have thought about the HSTORE option to load all the remaining OSM tags, and I like the idea. Of course a pull request would be very welcome!

On the other hand, OSM ways sometimes consist of multiple road segments and therefore need to be split by osm2pgrouting to end up with a routable network. In that case the question arises of what to do with the attributes: most attributes will be the same for every road segment (e.g. name, surface, etc.) and can be copied, but there may be some values that need to be split in proportion to the geometry (e.g. real travel time). I think this is a minority of cases, but it may happen, and I don't have a good idea of how to detect or resolve it.
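
To make the proportional case concrete, here is a sketch. It assumes the `ways` table shown earlier in the thread and a made-up whole-way travel time of 120 seconds; each segment receives a share proportional to its `length_m`.

```sql
-- Sketch: distribute a per-way value across its split segments by length.
-- 120.0 is an invented travel time (seconds) for the whole original way.
SELECT gid,
       osm_id,
       length_m,
       120.0 * length_m
             / NULLIF(sum(length_m) OVER (PARTITION BY osm_id), 0) AS travel_time_s
FROM ways
WHERE osm_id = 10676898;
```

This only covers the arithmetic; it does not address the harder problem raised above of detecting which tags need this treatment.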

Sep 24 '15 14:09 dkastl

I presume that when we split ways that each "segment" in the database has a unique id AND a reference to the original OSM way id. If we are loading the data as a convenience/service then I'm not sure we need to split the original ways; all we need to provide is a back reference to the original way id, and then the user can fetch the original data and decide how to proceed with that. I do not think we want/should assume responsibility for splitting the original ways because we do not know how the user intends to use that data. If we split it they might need the data un-split for their application.

Sep 24 '15 14:09 woodbri

We already give back the osm_id of the original unsplit segment from the OSM data, so users can actually retrieve the unsplit segment. And we split the segments wherever they intersect other segments; otherwise we wouldn't have routable information.

Sep 24 '15 14:09 cvvergara
