
Possible extensions in BRouter with new tags

Open EssBee59 opened this issue 2 years ago • 144 comments

Hello, I would like to discuss extensions to BRouter! Many enhancement requests have already been posted; here are some examples:

  • Noise: https://github.com/abrensch/brouter/issues/476
  • Green area: https://github.com/abrensch/brouter/issues/258
  • Round trip: https://github.com/abrensch/brouter/issues/460

In the following I will consider these 3 enhancements as examples:

  • consider the noise / pollution along the route
  • consider a river or lake beside the route
  • consider the "green" aspect of the route (within a forest or park…)

The basic idea is to introduce new calculated (or estimated) tags in the RD5 files, in a similar way to the existing tag "estimated_traffic_class".

Of course, the routing engine also has to be extended to support these tags, and the profiles have to use them according to the preferred route.

1- The first challenge is to calculate / estimate the tag values for the affected highways. I started some tests and this seems possible (see the documentation in the PDF file):

documentation was updated

2- Extension of the RD5 files with the new tags
3- Extension of the routing engine to support the new tags (+ lookups.dat)
4- Extension of profiles

Next steps: A first look at the results of the spatial SQL queries (GIS) is very promising: the calculated values for the new tags seem good and usable.

  • A decision on exactly which tags should be implemented can be made later.
  • A further challenge is calculating the tag values for the planet, as it will take a lot of time!

So I suggest testing next the impact of the tags "noise" and "river" on the routing within a regional OSM map (for example 30 km * 30 km).

The tag values are available, but implementing extensions (2) and (3) is a prerequisite for starting "real" tests. Is anybody ready to work on this extension / to create a prototype for testing (extensions 2 and 3)?

EssBee59 avatar Nov 19 '22 08:11 EssBee59

I'm sorry for the delay.

It's not that I haven't heard the wishes. But all my resources are in the 'final stages' for the 1.6.4 library. And usually I want to have sampled some code snippets before making a statement/commitment.

So all I can say is: yes, I'm interested, I find it useful, it might start for me after 1.6.4

afischerdev avatar Nov 23 '22 10:11 afischerdev

I'm sorry for the delay.

It's not that I haven't heard the wishes. But all my resources are in the 'final stages' for the 1.6.4 library. And usually I want to have sampled some code snippets before making a statement/commitment.

So all I can say is: yes, I'm interested, I find it useful, it might start for me after 1.6.4

Thanks for your reply and interest! As the extension is not urgent, it is no problem to wait... As I am not able to help with the "1.6.4" library, I will try to start some tests with new tags... The main challenge for me is to generate an rd5 file with new tags... (if someone can help!) Regards

EssBee59 avatar Nov 23 '22 10:11 EssBee59

Hello,

I think my prototype using the new tag "estimated_noise" is now running! My first tests are interesting... Two pieces of good news:

  • no extension was necessary in BRouter itself (only in the map creation process, lookups.dat and scripts)
  • The size of the rd5 files only grows by 2%

EssBee59 avatar Nov 28 '22 19:11 EssBee59

What did you do for that?

I had some time in longer train runs. So I started investigation on this issue as well.

afischerdev avatar Nov 30 '22 17:11 afischerdev

Hello,

Enclosed is a short description of my first tasks (adding "estimated_noise_class"): how_to_create_noise_tags.txt

I only had to create this new script to add the new tag in OSM (perhaps you can reuse the program used by Arndt for "estimated_traffic_class", OsmTrafficMap.java? But it was simpler for me to use a new script): add_tags_germany.sh.txt

Further tags (river, forest...) are also prepared. This week I will try to create the RD5 files for all of Germany (I could create the "noise" data in 20 minutes, but the "forest" data needs much longer). So far only a part of Hesse is available, but I could send these RD5 files if you are interested in testing. Regards

EssBee59 avatar Dec 01 '22 07:12 EssBee59

perhaps you can reuse the program used by Arndt for "estimated_traffic_class" OsmTrafficMap.java?

estimated_traffic_class is injected in the "WayLinker" step, the very last step of preprocessing, but the first step where the ways and their geometry are together. But at this step, the way-id is no longer present.

There's another step where I inject pseudo-tags (the relation-pseudo-tags): RelationMerger.java

At this step there is no geometry, but there is the way-id. So if you have a preprocessor that generates pseudo-tags connected to the osm-way-id, then that's probably the best location to inject them.

estimated_traffic_class is not a success story. I had a hard time integrating it into the preprocessor without adding too much latency, and I had problems with "noisy" behaviour blowing up the rd5-delta files. And I never achieved world coverage, only Europe.

abrensch avatar Dec 01 '22 08:12 abrensch

So if you have a preprocessor that generates pseudo-tags connected to the osm-way-id, then that's probably the best location to inject them.

Sorry, if you have your data ready, the best location is of course the very first step of preprocessing (see OsmCutter.generatePseudoTags).
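The idea of injecting precomputed pseudo-tags keyed by the OSM way id can be sketched in a few lines of Python. This is only an illustration under assumptions: the input format (`way_id;key=value` per line), the function names, the way id 1001 and the tag value are all hypothetical, and the real injection point discussed here is Java code in OsmCutter.

```python
from collections import defaultdict


def load_pseudo_tags(lines):
    """Parse lines of the hypothetical form 'way_id;key=value' into a lookup table."""
    table = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        way_id, assignment = line.split(";", 1)
        key, value = assignment.split("=", 1)
        table[int(way_id)][key] = value
    return table


def inject_pseudo_tags(way_id, tags, table):
    """Return a copy of the way's tags with the precomputed pseudo-tags added.

    Tags already present on the way win, so a pseudo-tag never overwrites mapped data.
    """
    merged = dict(table.get(way_id, {}))
    merged.update(tags)
    return merged


# Made-up example data: way 1001 gets a noise class, way 1002 has no entry.
pseudo = load_pseudo_tags(["# way_id;key=value", "1001;estimated_noise_class=3"])
tagged = inject_pseudo_tags(1001, {"highway": "track"}, pseudo)
untouched = inject_pseudo_tags(1002, {"highway": "path"}, pseudo)
```

Because the lookup is done purely by way id, a step like this needs no geometry, which matches the reason given above for doing it early in preprocessing.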

Relation merging must be done in a later step because when reading the PBF-File, the relations appear after the ways.

abrensch avatar Dec 01 '22 10:12 abrensch

@EssBee59 Thanks for the description.

I started another way - the BRouter classic way. The base class is an ExtraDataCollector. It contains the definitions we would like to collect, e.g. natural=wood. The OsmCutter greps all the extra data and writes it to an extra data file. An ExtraDataCutter5 splits it as usual. The WayLinker checks whether a way has an ExtraData entry and adds a new variable like 'estimated_noise' and its value. I do this on the test OSM file - it is readable in an editor and gives fast results - no speed tests at the moment, and the output grows by only a few bytes.

At the moment I am thinking about the relation problem @abrensch announced. E.g. Dreieich: relation id 6831654 has a natural=scrub definition. The member way 263958248 has no definition tags at all, so it will never get into our way data tiles. I would like to do a pre-process for relations, and then the member ways shall inherit the tags - if needed. The question for me is: should the children get other tags as well? E.g. a relation could have maxspeed info. When a member way has no maxspeed, should it get the relation's value?

The repo follows.

afischerdev avatar Dec 02 '22 09:12 afischerdev

Hello afischerdev, I am not a real expert in OSM, so I cannot help with relations. Of course the way I chose with spatial SQL is complex, but the "geometry" (distance between the lines / polygons) was a good solution: image

In the example above, does a relation already exist in OSM between the cycleway and the motorway? (With spatial SQL, the intersection areas are used to estimate a tag value.)
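To make the geometric idea concrete, here is a rough Python sketch that estimates how much of a way lies close to a noise source. It is a simplified stand-in and not the spatial SQL used here: it samples points along the way instead of intersecting buffer areas, it assumes planar coordinates in metres, and all names, the sampling step and the distances are assumptions for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in planar coordinates (metres)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def sample_line(line, step):
    """Yield points roughly every `step` metres along a polyline, plus its end point."""
    for a, b in zip(line, line[1:]):
        length = math.hypot(b[0] - a[0], b[1] - a[1])
        n = max(1, int(length // step))
        for i in range(n):
            f = i / n
            yield (a[0] + f * (b[0] - a[0]), a[1] + f * (b[1] - a[1]))
    yield line[-1]


def exposed_fraction(way, source, max_dist, step=10.0):
    """Share of the sampled way points that lie within max_dist of the source line."""
    points = list(sample_line(way, step))
    segments = list(zip(source, source[1:]))
    near = sum(
        1 for p in points
        if min(point_segment_distance(p, a, b) for a, b in segments) <= max_dist
    )
    return near / len(points)


# Made-up example: a motorway along the x axis and two parallel cycleways.
motorway = [(0.0, 0.0), (1000.0, 0.0)]
close_cycleway = [(0.0, 50.0), (500.0, 50.0)]
far_cycleway = [(0.0, 500.0), (500.0, 500.0)]
close_share = exposed_fraction(close_cycleway, motorway, max_dist=100.0)
far_share = exposed_fraction(far_cycleway, motorway, max_dist=100.0)
```

A share like this could then be mapped to a class value; the thresholds for that mapping are a design decision that this sketch leaves open.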

I could now create the RD5 files for all of Germany. (A challenge was the processing time for the "forest" data in PostgreSQL.) So I had to redesign the database (tables and indexes) and the SQL...

The processing times (rough results) for all of Germany (pbf of 4 GB), in minutes:

 step                                            | first version | 2nd version
-------------------------------------------------+---------------+-------------
 1- import "pbf" into postgresql using osm2pgsql | 30            | 30
 2- redesign database / create new tables        | na            | 30
 4- create noise data                            | 30            | 30
 5- create river data                            | >600          | 10
 6- create forest data                           | 900           | 30
 7- add the new tags in osm file                 | 16            | 16
 8- generate RD5 (as input 70 GB osm file)       | 10            | 10

notes:

  • a change in the "osm2pgsql" program could eliminate step 2 (time saving)
  • step 7 could probably be sped up by using the standard Java programs from Arndt
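Adding up the step times listed above gives a feel for the overall gain. The short Python snippet below does only that arithmetic; it counts ">600" as 600 and "na" as 0, so the first-version total is a lower bound.

```python
# Minutes per step as (first version, second version), copied from the list above.
# ">600" is counted as 600 and "na" as 0, so total_v1 is only a lower bound.
step_minutes = {
    "import pbf": (30, 30),
    "redesign database": (0, 30),
    "noise data": (30, 30),
    "river data": (600, 10),
    "forest data": (900, 30),
    "add tags to osm file": (16, 16),
    "generate RD5": (10, 10),
}

total_v1 = sum(v1 for v1, _ in step_minutes.values())
total_v2 = sum(v2 for _, v2 in step_minutes.values())
print(f"first version: at least {total_v1} min, second version: {total_v2} min")
```

Under these assumptions the second version needs about 2.6 hours instead of more than 26 hours.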

EssBee59 avatar Dec 02 '22 14:12 EssBee59

hello,

I could generate RD5 files for Germany containing the new tags (noise, river and forest). If anybody intends to test and evaluate the real use of the new tags, I could deliver these files + lookups + new profiles... or I could install a new BRouter test instance on the server! (My user "essbee" already has a test instance for brouter-web but not for BRouter itself.)

EssBee59 avatar Dec 03 '22 15:12 EssBee59

sorry if you have your data ready the best location is of course the very first step of preprocessing ( see OsmCutter.generatePseudoTags )

@ abrensch:

Thanks for the help! So instead of adding the new tags in a preprocessing step with a Perl script, I could extend the standard OsmCutter for this task

(with a new "OsmCutter.generateSpecialTags" and using a JDBC connection to PostgreSQL to get the new tags).

Example for the Germany map: 10.405.000 new tags are added to 17.831.000 "highway" objects (6.250.000 for forest, 1.656.000 for river and 2.456.000 for noise). The processing time (including generation of the rd5 files in the last step) was about 50 minutes in the first version with 1 SQL query per way; in a second version (load all tags into 3 hash tables with 3 SQL queries) the time was reduced to 12 minutes (but it costs some memory!).
(Of course, the new tags were previously created within the database.)

EssBee59 avatar Dec 11 '22 15:12 EssBee59

new documentation for the tag-calculation:

Tag_calculation_V3.pdf

EssBee59 avatar Dec 14 '22 09:12 EssBee59

Thanks for update and for the report, sounds good.

I'm still focused on the Dreieich test data. But I'm not happy with it. In the end every needed point is checked against all possible points. That feels like an endless loop when used on bigger pbf files. So I think I'll give up this part for now. Precomputing seems to be the better way.

afischerdev avatar Dec 14 '22 17:12 afischerdev

I'm now prepared to use database export/import. The import for BRouter is file based, so no extra library is needed, and the database generation of the import files could be placed on a different server than the rd5 generation.

Used this 'green' filter for import - should be discussed:

highway not in ('motorway', 'motorway_link', 'trunk', 'trunk_link', 'primary', 'primary_link', 'secondary')
landuse in ('forest', 'allotments', 'flowerbed', 'orchard', 'vineyard', 'recreation_ground', 'village_green')
leisure in ( 'park', 'nature_reserve')
natural in ( 'grassland', 'meadow', 'scrub', 'wood'  )
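The filter above can be read as two checks, sketched here in Python. How the four lines combine is not spelled out in the thread, so this sketch assumes that the highway line excludes major roads from tagging and that the other three lines select the "green" areas; the function names are hypothetical.

```python
# Values copied from the filter above.
EXCLUDED_HIGHWAYS = {
    "motorway", "motorway_link", "trunk", "trunk_link",
    "primary", "primary_link", "secondary",
}
GREEN_VALUES = {
    "landuse": {"forest", "allotments", "flowerbed", "orchard", "vineyard",
                "recreation_ground", "village_green"},
    "leisure": {"park", "nature_reserve"},
    "natural": {"grassland", "meadow", "scrub", "wood"},
}


def is_candidate_way(tags):
    """True for highways that are not in the excluded list of major roads."""
    highway = tags.get("highway")
    return highway is not None and highway not in EXCLUDED_HIGHWAYS


def is_green_area(tags):
    """True if any of the landuse/leisure/natural values marks the object as green."""
    return any(tags.get(key) in values for key, values in GREEN_VALUES.items())
```

For example, `is_green_area({"natural": "wood"})` is true, while `is_candidate_way({"highway": "motorway"})` is false.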

I ended up with the same timing problems. I used some examples for optimization from here - a database gives us the chance to use indexes, but I wasn't very happy with this.

No profile with extra data for now. I only check the output: brouter_trekking_0.geojson.txt

Please see my repository

@EssBee59 What did you do for speedup?

afischerdev avatar Dec 30 '22 17:12 afischerdev

Hello!

Yes, the green filter for import is a good idea, as it will reduce the size of the planet_osm_table. (It reduces the load time of the database, but I think it will not reduce the processing time of the tags.)

My tasks / tests in the last weeks:

A- I could install a “test” instance with 3 new tags on the BRouter server http://brouter.de/essbee/#map=13/49.9950/8.6407/osm-mapnik-german_style

Restriction: currently only the rd5 files for Germany are installed.
Feature: 2 scripts (fastbike-verylowtraffic and trekking-essbee) were extended to support the new tags (consider_noise, consider_river & consider_forest).

My preferred test cases:

Test_New_Tags.pdf

B- I also tried to generate the new tags for "Europe".

Size of the pbf file: 28 GB
Platform: notebook with Windows 11 (AMD Ryzen 9 5900HX and 16 GB RAM)

 step                          | time
-------------------------------+-------------
 Load time (osm2pgsql)         | 9:30 hours
 Generation of the noise tags  | 4 hours
 Generation of the river tags  | not ending…
 Generation of the forest tags | not ending…

I think 2 problems occur there:

  • 16 GB of RAM is definitely not enough for the job!
  • It seems the PostgreSQL database does not scale linearly.

Test database parameters: I tested with 6, 8 and 10 GB of memory (example: shared_buffers = 8240MB), but the SQL queries were very, very slow / not ending + paging.

To minimize the memory need I tried to work without "parallel seq scan" (by adding max_parallel_workers_per_gather = 0), but got the same result as with parallel scan.

I will try another way: split the big pbf file (Europe or planet), for example into squares of max 10 degrees, calculate the new tags for each square and concatenate the results in a single table. It would then be possible to run OsmCutter for any part of the planet (or the full planet if needed) by using the final table above.
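The splitting step can be sketched as a small Python helper that cuts a bounding box into squares of at most 10 degrees. The function name and the example extent are assumptions for illustration; the tag calculation for each square and the final concatenation are not shown.

```python
def split_bbox(min_lon, min_lat, max_lon, max_lat, step=10.0):
    """Split a bounding box into tiles of at most `step` degrees per side.

    Returns a list of (min_lon, min_lat, max_lon, max_lat) tuples, row by row
    from south to north; tiles at the northern and eastern edge may be smaller.
    """
    boxes = []
    lat = min_lat
    while lat < max_lat:
        top = min(lat + step, max_lat)
        lon = min_lon
        while lon < max_lon:
            right = min(lon + step, max_lon)
            boxes.append((lon, lat, right, top))
            lon = right
        lat = top
    return boxes


# Made-up extent roughly covering Europe: 4 columns x 4 rows of tiles.
tiles = split_bbox(-10, 35, 30, 70)
```

Each tuple could then be used as the clipping box for one extract. Ways crossing a tile border would need some overlap or de-duplication, which this sketch does not handle.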

Could this be a way to proceed?

(I made some changes in the SQL and in OsmCutter.java: do you need the latest versions to test?)

EssBee59 avatar Dec 31 '22 10:12 EssBee59

Could this be a way to proceed?

Yes, I think so. A bounding box for the database generation is needed. I already have a database function for the BRouter tile index logic (see OsmCutter.getNameForTile) to split BRouter incoming extra data files. But we should use only Europe as the target. That means a planet file with European extra data.

(I made some changes in the SQL and in OsmCutter.java: do you need the latest versions to test?)

Would be nice, please let me know.

afischerdev avatar Dec 31 '22 10:12 afischerdev

Here are the SQL and Java files! There are 2 Java versions:

  • one is faster, as it loads all the new tags into hash tables with only 3 SQL queries (but this costs a lot of memory!)
  • the other version (which I prefer) issues a "select" on the tags table (using JDBC) for each "way"... this costs time but no additional memory (processing "europe" with this version takes 2 hours and some minutes)
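The trade-off between the two versions can be illustrated with a small Python sketch. It is not the attached Java code: it uses the standard-library `sqlite3` module in place of PostgreSQL and JDBC, and the table name `noise_tags`, its columns and the sample rows are assumptions made for the example.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for the tags table with two made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE noise_tags (losmid INTEGER PRIMARY KEY, noise_class TEXT)")
    conn.executemany("INSERT INTO noise_tags VALUES (?, ?)", [(1, "2"), (3, "5")])
    return conn


def tags_per_way(conn, way_ids):
    """Variant 1: one SELECT per way. Slower, but needs no extra memory."""
    result = {}
    for way_id in way_ids:
        row = conn.execute(
            "SELECT noise_class FROM noise_tags WHERE losmid = ?", (way_id,)
        ).fetchone()
        if row is not None:
            result[way_id] = row[0]
    return result


def tags_bulk(conn, way_ids):
    """Variant 2: load the whole table into a dict once, then look up in memory."""
    table = dict(conn.execute("SELECT losmid, noise_class FROM noise_tags"))
    return {way_id: table[way_id] for way_id in way_ids if way_id in table}


conn = make_demo_db()
per_way = tags_per_way(conn, [1, 2, 3])
bulk = tags_bulk(conn, [1, 2, 3])
```

Both variants return the same mapping; they differ only in where the cost lands, which is the time versus memory trade-off described in the two bullets above.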

DB_germany.bat.txt

extend.sql.txt

noise.sql.txt river.sql.txt forest.sql.txt all_tags.sql.txt OsmCutter.java.HashTable.txt OsmCutter.java.txt

EssBee59 avatar Dec 31 '22 11:12 EssBee59

About splitting... I thought it could be done with osmconvert (-b parameter), but if you have a running solution, I would test with it!

EssBee59 avatar Dec 31 '22 11:12 EssBee59

Thanks for the files. You didn't use an extra index, is that correct?

My focus at the moment is to reduce the database size (brouter.style imports only the columns used) and to create some indexes to reduce the import time. But it's still a lame duck in my eyes.

Please see scripts in misc folder

afischerdev avatar Jan 03 '23 12:01 afischerdev

Hello afischerdev,

As I was not satisfied with the behaviour of the database with big tables (selects often hang), I redesigned my database and the SQL. There is no real problem now generating "europe" in one run (of course, with 16 GB RAM it takes 24 h, as paging slows down the system). I will send the new SQL later...

A question: Some years ago I had a discussion with Arndt about "town" (how to avoid routing through a big town?). Now with the database a solution is possible... by introducing a 4th tag, "estimated_town_class".

town2.sql.txt

Do you see a need for this tag? (you see, the import-filter should be adapted)

EssBee59 avatar Jan 04 '23 09:01 EssBee59

Nice idea. I had a short check on my Hessen data. I found only two towns with more than 80000. And Darmstadt has a lot of green areas around.

I enabled population in my scripts (brouter.style and brouter_start.sql).

afischerdev avatar Jan 04 '23 11:01 afischerdev

Nice idea. I had a short check on my Hessen data. I found only two towns with more than 80000.

Hello,

Strange, in the table planet_osm_polygon only a few cities are stored using the "default" import. As an example for Germany, I did not find this city... which is in the OSM data.

<node id="27418664" lat="50.1106444" lon="8.6820917">
	<tag k="alt_name" v="Mainhattan;FFM;Frankfurt a. M."/>
	<tag k="alt_name:en" v="Frankfort"/>
	<tag k="ele" v="112"/>
	<tag k="name" v="Frankfurt am Main"/>
	<tag k="name:an" v="Frankfurt d&#39;o Main"/>
	<tag k="name:ar" v="فرانكفورت"/>
	<tag k="name:ca" v="Frankfurt del Main"/>
	<tag k="name:cs" v="Frankfurt nad Mohanem"/>
	<tag k="name:de" v="Frankfurt am Main"/>
	<tag k="name:el" v="Φρανκφούρτη"/>
	<tag k="name:en" v="Frankfurt"/>
	<tag k="name:eo" v="Frankfurto"/>
	<tag k="name:es" v="Fráncfort del Meno"/>
	<tag k="name:fr" v="Francfort-sur-le-Main"/>
	<tag k="name:he" v="פרנקפורט"/>
	<tag k="name:hr" v="Frankfurt na Majni"/>
	<tag k="name:hsb" v="Frankobrod nad Mohanom"/>
	<tag k="name:hu" v="Frankfurt am Main"/>
	<tag k="name:it" v="Francoforte sul Meno"/>
	<tag k="name:ja" v="フランクフルト・アム・マイン"/>
	<tag k="name:ko" v="프랑크푸르트"/>
	<tag k="name:la" v="Francofurtum ad Moenum"/>
	<tag k="name:lt" v="Frankfurtas prie Maino"/>
	<tag k="name:lv" v="Frankfurte pie Mainas"/>
	<tag k="name:mk" v="Франкфурт на Мајна"/>
	<tag k="name:ms" v="Frankfurt"/>
	<tag k="name:nds" v="Frankfort an’n Main"/>
	<tag k="name:pl" v="Frankfurt nad Menem"/>
	<tag k="name:ro" v="Frankfurt pe Main"/>
	<tag k="name:ru" v="Франкфурт-на-Майне"/>
	<tag k="name:sk" v="Frankfurt nad Mohanom"/>
	<tag k="name:sl" v="Frankfurt na Majni"/>
	<tag k="name:sr" v="Франкфурт на Мајни"/>
	<tag k="name:uk" v="Франкфурт-на-Майні"/>
	<tag k="name:ur" v="فرینکفرٹ آم مین"/>
	<tag k="name:zh-Hans" v="美因河畔法兰克福"/>
	<tag k="name:zh-Hant" v="美茵河畔法蘭克福"/>
	<tag k="official_name" v="Frankfurt am Main"/>
	<tag k="place" v="city"/>
	<tag k="population" v="701350"/>
	<tag k="ref:LOCODE" v="DEFRA"/>
	<tag k="reg_name" v="Frankfurt"/>
	<tag k="short_name" v="Frankfurt"/>
	<tag k="wikidata" v="Q1794"/>
	<tag k="wikipedia" v="de:Frankfurt am Main"/>
</node>

Do we possibly have to make changes in the osm2pgsql import?
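The element shown above is a node, so its place and population values sit on a point rather than on an area. As a minimal Python sketch, the relevant values can be read from such a node with the standard-library XML parser; the function name is hypothetical and the sample is a shortened copy of the node above.

```python
import xml.etree.ElementTree as ET


def place_info(node_xml):
    """Return id, name, place and population of a city/town node, else None."""
    node = ET.fromstring(node_xml)
    tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
    if tags.get("place") not in ("city", "town"):
        return None
    population = tags.get("population")
    return {
        "osm_id": int(node.get("id")),
        "name": tags.get("name"),
        "place": tags["place"],
        # Population is free text in OSM, so only plain digits are converted.
        "population": int(population) if population and population.isdigit() else None,
    }


# Shortened copy of the node quoted above.
sample = """<node id="27418664" lat="50.1106444" lon="8.6820917">
  <tag k="name" v="Frankfurt am Main"/>
  <tag k="place" v="city"/>
  <tag k="population" v="701350"/>
</node>"""
frankfurt = place_info(sample)
```

A lookup like this only finds the point; linking it to the administrative area of the town is a separate step.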

EssBee59 avatar Jan 04 '23 18:01 EssBee59

I promised my new SQL version! (Interesting if you want to generate Europe in one step with little RAM.)

newSql.zip

Notes / prerequisites to generate Europe:

  • 500 GB free storage (230 GB database + 160 GB tmp tables!)
  • my database config:
    shared_buffers = 12240MB
    max_parallel_workers_per_gather = 0 (no parallel actions, to limit the memory usage)

EssBee59 avatar Jan 04 '23 19:01 EssBee59

Frankfurt:

I think we could make an update on the polygon data. Something like this

SELECT pt.osm_id, x.id, pt.name, pt.population pop, x.admin_level
 FROM planet_osm_point pt
 JOIN (
    SELECT member AS osm_id, id, p.admin_level
    FROM (
        SELECT unnest(parts) AS member, id
        FROM planet_osm_rels
        WHERE NOT ARRAY['population'] <@ tags
        ) u, planet_osm_polygon p
        WHERE p.osm_id*-1 = u.id
    ) x
 USING(osm_id)
 WHERE pt.population is not null
 ORDER by pt.population::decimal desc
 ;

But this also raises questions: we would include Hessen as an area, so I added admin_level to the select. Note that admin_level is not used the same way in every country. This will not be the end of it.

afischerdev avatar Jan 06 '23 10:01 afischerdev

A note on relations:

From the 'Dreieich' data I focused on relation 6831654 and way 148463112. The relation contains four ways and two of them have no tags. In the past I tried to generate these. Now that I use the database, I see that the database import already provides a path to the relation. I like that feature. It doesn't require a second pass and is ready to use.

But another 'little problem' arises. If a relation has a tagged way, the same constellation appears twice. (tags changed by hand)

   w_id    |   p_id    |  road  |    green_factor
-----------+-----------+--------+--------------------
 148463112 |  -6831654 | tracks | 0.9965243870745802
 148463112 | 263958233 | tracks | 0.4062599369714373
 148463112 | 263958248 | tracks | 0.9965243870745802

Another remark on 'green factor':

A way inside a forest, like 1082304710, is the best we can reach. A way on the forest border, like 1082304706, is semi-optimal but can reach a higher level than the way inside the forest, because it gets values from the forest and from other areas.

    w_id    |   p_id   | highway |    green_factor
------------+----------+---------+--------------------
 1082304710 | -6833156 | path    | 0.9955203890731116
 1082304710 | 39306205 | path    | 0.0771948143192361

    w_id    |   p_id    | highway |    green_factor
------------+-----------+---------+---------------------
 1082304706 |  -6833156 | track   |  0.9694724948047176
 1082304706 | 251920267 | track   |  0.3863178127417932
 1082304706 | 251920283 | track   |  0.3163575831981074
 1082304706 | 251920285 | track   |  0.1501170101172784
 1082304706 | 251920305 | track   |  0.3283513763800288
 1082304706 | 251920307 | track   |  0.9527799094096389
 1082304706 | 251920331 | track   | 0.36070424194288164
 1082304706 | 251920334 | track   |  0.2382312526047835
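The effect described here depends on how the rows per way are combined. The Python sketch below uses the values from the two tables above and compares three possible strategies; the strategies themselves (maximum, plain sum, sum capped at 1.0) are assumptions for illustration, not a method chosen in this thread.

```python
from collections import defaultdict

# (w_id, p_id, green_factor) rows copied from the two tables above.
rows = [
    (1082304710, -6833156, 0.9955203890731116),
    (1082304710, 39306205, 0.0771948143192361),
    (1082304706, -6833156, 0.9694724948047176),
    (1082304706, 251920267, 0.3863178127417932),
    (1082304706, 251920283, 0.3163575831981074),
    (1082304706, 251920285, 0.1501170101172784),
    (1082304706, 251920305, 0.3283513763800288),
    (1082304706, 251920307, 0.9527799094096389),
    (1082304706, 251920331, 0.36070424194288164),
    (1082304706, 251920334, 0.2382312526047835),
]


def aggregate(rows, combine):
    """Group the green factors by way id and reduce each group with `combine`."""
    per_way = defaultdict(list)
    for w_id, _p_id, factor in rows:
        per_way[w_id].append(factor)
    return {w_id: combine(factors) for w_id, factors in per_way.items()}


by_max = aggregate(rows, max)
by_sum = aggregate(rows, sum)
by_capped_sum = aggregate(rows, lambda factors: min(1.0, sum(factors)))

inside, border = 1082304710, 1082304706
```

With a plain sum the border way scores higher than the inside way, with the maximum the inside way scores slightly higher, and with a capped sum both end up at 1.0.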

afischerdev avatar Jan 06 '23 11:01 afischerdev

One more remark on Darmstadt.

I collected all greens inside the Darmstadt polygon and wrote it to an extra table:

SELECT p.osm_id p_id, l.osm_id l_id, p.name, l.landuse, l.way
 into table pop_sub
 FROM planet_osm_polygon AS l
 LEFT JOIN planet_osm_polygon AS p
   ON ST_DWithin(l.way, p.way, 70)
   where p.osm_id=-62581
   and l.landuse in ('forest','allotments','flowerbed','meadow','orchard','vineyard','recreation_ground','village_green')
   ;

And then the difference from the original Darmstadt polygon gives this geojson file (view it at https://geojson.tools/ via file import): darmstadt.geojson.txt

afischerdev avatar Jan 06 '23 15:01 afischerdev

Hello,

Thanks for the update! A lot of material; I need time to analyze it, as I am now following another way, using a "lua" config file for the import (it seems helpful for flexibility and better performance, as only the needed data is generated).

Currently I get the following cities for Germany, ordered by population: town_li.log

EssBee59 avatar Jan 07 '23 09:01 EssBee59

My first result for town tags (Frankfurt region with Frankfurt + Offenbach + Darmstadt): town_tags

Looks fine... only the official / administrative area of Darmstadt is partly forest ("all greens" could be eliminated as you commented above), so not perfect... (but the calculation should only apply to big towns to avoid that)

My first problem was to also find Frankfurt and Offenbach, and this could be solved.

EssBee59 avatar Jan 07 '23 14:01 EssBee59

Hello, this SQL removes all "town" tags where a green tag exists:

delete from town_tags where losmid in (SELECT losmid FROM forest_tags);

(nearly 1 of 3 tags is deleted for Darmstadt!) Not perfect... but much better than before...
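The delete statement can be tried on toy data with the following Python sketch. It runs the same statement against an in-memory SQLite database from the standard library instead of PostgreSQL; the second column of each table and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE town_tags (losmid INTEGER PRIMARY KEY, town_class TEXT);
    CREATE TABLE forest_tags (losmid INTEGER PRIMARY KEY, forest_class TEXT);
    -- Made-up rows: ways 1-3 lie in a town, way 2 also has a forest tag.
    INSERT INTO town_tags VALUES (1, '3'), (2, '3'), (3, '3');
    INSERT INTO forest_tags VALUES (2, '5');
""")

# Same statement as above: drop the town tag wherever a forest tag exists.
conn.execute("DELETE FROM town_tags WHERE losmid IN (SELECT losmid FROM forest_tags)")

remaining = [row[0] for row in conn.execute("SELECT losmid FROM town_tags ORDER BY losmid")]
```

In this toy case 1 of 3 town tags is removed, which happens to mirror the ratio reported for Darmstadt.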

darmstadt_unfiltered

darmstadt_filtered

EssBee59 avatar Jan 07 '23 19:01 EssBee59

@EssBee59 Great idea, I like it. No more area difference selection needed, very elegant.

afischerdev avatar Jan 08 '23 14:01 afischerdev