osrm-backend
osrm-backend copied to clipboard
Do not generate intermediate .osrm file in osrm-extract.
Issue
closes https://github.com/Project-OSRM/osrm-backend/issues/6329
Tasklist
- [ ] CHANGELOG.md entry (How to write a changelog entry)
- [ ] update relevant Wiki pages
- [ ] add tests (see testing documentation)
- [ ] review
- [ ] adjust for comments
- [ ] cherry pick to release branch
Requirements / Relations
Link any requirements here. Other pull requests this PR is based on?
Some things to discuss:
- Now in order to generate
.osrmfile user has to pass--dump-nbg-graphflag toosrm-extract. Is it a good naming? Should we may be make it default and allow to disable by CLI option? I can imagine that someone somehow relies on this file and usesosrm-componentswith it(it is really low chance IMO, but still). - The only place where this
.osrmcan be used isosrm-componentstool which is as I understand purely for debug, so IMO.osrmfeels a bit "loud" name for such kind of files. May be rename it to something? - We suggest to pass this
.osrmfile everywhere in the docs as if it is really used as input file(e.g../osrm-customize ./berlin-latest.osrm). Actually it is just a base name. So I suggest to just remove this extension from everywhere: like./osrm-partition ./berlin-latest. But for backward compatibility leave the ability to work if it was passed in the old way(we may emit warning if it used this way). I can imagine that a lot of people have working pipelines with this.osrmin them.
@mjjbell WDYT?
@mjjbell can you take a look again?
And also let's discuss these questions:
Some things to discuss:
- Now in order to generate
.osrmfile user has to pass--dump-nbg-graphflag toosrm-extract. Is it a good naming? Should we may be make it default and allow to disable by CLI option? I can imagine that someone somehow relies on this file and usesosrm-componentswith it(it is really low chance IMO, but still).- The only place where this
.osrmcan be used isosrm-componentstool which is as I understand purely for debug, so IMO.osrmfeels a bit "loud" name for such kind of files. May be rename it to something?- We suggest to pass this
.osrmfile everywhere in the docs as if it is really used as input file(e.g../osrm-customize ./berlin-latest.osrm). Actually it is just a base name. So I suggest to just remove this extension from everywhere: like./osrm-partition ./berlin-latest. But for backward compatibility leave the ability to work if it was passed in the old way(we may emit warning if it used this way). I can imagine that a lot of people have working pipelines with this.osrmin them.@mjjbell WDYT?
For 1 I doubt someone really relies on it and it is super straightforward to just pass this additional flag if it is really needed.
For 3 I now more inclined to leave this .osrm extension everywhere with a note that it is not a specific file anymore, but base path for all other files we need(all of them have .osrm.* pattern in name)
Now in order to generate
.osrmfile user has to pass--dump-nbg-graphflag toosrm-extract. Is it a good naming? Should we may be make it default and allow to disable by CLI option? I can imagine that someone somehow relies on this file and usesosrm-componentswith it(it is really low chance IMO, but still).
The only place where this
.osrmcan be used isosrm-componentstool which is as I understand purely for debug, so IMO.osrmfeels a bit "loud" name for such kind of files. May be rename it to something?
It's the internal node-based-graph representation prior to compression, creation of dummy edges and other stuff that doesn't exist in the OSM data. So I would say it's the correct arg name, but I agree that it's not obvious that this should generate an .osrm file.
Renaming it to .nbg seems like a good fit with the other toolchain output, given we have already have the OSM IDs and coordinates in .nbg_nodes, and the compressed graph in .cnbg.
We suggest to pass this
.osrmfile everywhere in the docs as if it is really used as input file(e.g../osrm-customize ./berlin-latest.osrm). Actually it is just a base name. So I suggest to just remove this extension from everywhere: like./osrm-partition ./berlin-latest. But for backward compatibility leave the ability to work if it was passed in the old way(we may emit warning if it used this way). I can imagine that a lot of people have working pipelines with this.osrmin them.
Thinking longer-term, I can see an approach where we pass around a metadata/config file, that includes information about the settings used during processing, and detailing which datasets have been generated (e.g. if we wanted to support optional loading), which maybe could be a good use for an .osrm file? In which case it would be beneficial to keep the instructions for passing that as the argument. What do you think?
For 3 I now more inclined to leave this .osrm extension everywhere with a note that it is not a specific file anymore, but base path for all other files we need(all of them have .osrm.* pattern in name)
Sorry I'm reading everything in the wrong order. If that's the case, then I agree, and it also aligns with the potential of a config/metadata file in the future.
For 1 I doubt someone really relies on it and it is super straightforward to just pass this additional flag if it is really needed.
Agreed. Also, given the .osrm file is marked as temporary in the toolchain, I don't think users can have expectations of stability.