ConflictsNetworkMatcher::_seedEdgeScores() is a major bottleneck for the Network Roads alg against large datasets
See network-tests.child/gaalkacyo.child/gaalkacyo_large.child. I'm sure other large road datasets reveal the bottleneck as well. The previous Network alg bottleneck was in the removal of duplicate edges. That has since been handled, and now this seems to be the culprit.
See if anything can be done to speed things up. If the alg can't be improved, then maybe it can be parallelized.
see #3534 and #3530
Short of reworking the algorithm itself, which is unlikely at this time, I can only think of looking into caching and maybe some basic efficiency checks for this. Maybe there's an opportunity for parallelization?...
Not really seeing much that can be optimized here. EdgeMatchSetFinder::_addEdgeMatches specifically is where most of the time is spent. Since it's a recursive algorithm, I'm not seeing much that can be optimized. No one part of it is very slow, but its iterations balloon for large datasets.
Found this setting: network.edge.match.set.finder.max.iterations, which controls the recursion in EdgeMatchSetFinder::_addEdgeMatches. It doesn't look like an optimal value for it has ever been found with optimize-network-conf, so I'll do that and see if we can lower the default from 20. Also, will make it configurable from the UI, so it can be adjusted for larger datasets if desired.
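For illustration, here's a minimal sketch (not the actual EdgeMatchSetFinder code) of how an option like this typically bounds a recursive match-extension search. The settings map below is just a stand-in for Hootenanny's configuration, and everything except the option key itself is a hypothetical name.

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

// Stand-in for the real configuration mechanism; only the option key
// itself comes from this issue.
std::map<std::string, int> settings =
{
  { "network.edge.match.set.finder.max.iterations", 20 }
};

// Hypothetical sketch of a recursive match-extension search: each partial
// edge match fans out to a couple of neighboring edges per iteration, and
// recursion stops once the configured iteration limit is reached. The
// returned count shows how the amount of work grows with the limit.
std::uint64_t expandMatches(int iteration, int maxIterations)
{
  std::uint64_t visited = 1; // the current partial match
  if (iteration >= maxIterations)
    return visited;
  const int neighborBranching = 2; // toy branching factor per iteration
  for (int i = 0; i < neighborBranching; ++i)
    visited += expandMatches(iteration + 1, maxIterations);
  return visited;
}

int main()
{
  // Compare the configured default (20) against a couple of lower limits.
  for (int limit : { 1, 11,
       settings.at("network.edge.match.set.finder.max.iterations") })
  {
    std::cout << "max iterations " << limit << ": "
              << expandMatches(0, limit) << " candidate match sets\n";
  }
  return 0;
}
```

Even with this toy branching factor of 2, the candidate count grows geometrically with the limit, which is consistent with the ballooning recursion described above.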
Dropping network.edge.match.set.finder.max.iterations drastically from 20 to 1 speeds up the gaal large conflation by roughly 300%. This, of course, says nothing about what effect it has on conflation output quality... so I'll determine the best value.
Unfortunately, reducing the default for network.edge.match.set.finder.max.iterations isn't possible. One test, pap-008, requires it to be at its current value of 20 to conflate the input data properly. All other tests pass with it as low as 11. Running at 11 lowers the gaal large conflate runtime by ~17%... not a ton, but it would be helpful. Regardless, lowering the default isn't an option.
Going to look at parallelization opportunities...
Parallelized it but am getting beat up by OsmSchema singleton access. ~~It's buried so deep in the code, not sure what I can do about it.~~ Have an idea how to fix this...
I tried using std::call_once to limit creation of OsmSchema. I think that's the right thing to do, but it doesn't matter because there's enough other thread-unsafe code in OsmSchema and elsewhere that I'm throwing in the towel on parallelization here. Will leave the issue open in case we think of anything else in the future.
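For reference, a minimal sketch of the std::call_once approach that was tried, with a generic Schema class standing in for OsmSchema (all names here are hypothetical):

```cpp
#include <memory>
#include <mutex>

// Generic stand-in for OsmSchema: construction is guarded with
// std::call_once, so concurrent callers of getInstance() cannot create the
// singleton twice. Note this only protects creation; if the instance's
// methods later mutate shared internal state, they still need their own
// synchronization.
class Schema
{
public:
  static Schema& getInstance()
  {
    std::call_once(_onceFlag, []() { _instance.reset(new Schema()); });
    return *_instance;
  }

private:
  Schema() = default;

  static std::once_flag _onceFlag;
  static std::unique_ptr<Schema> _instance;
};

std::once_flag Schema::_onceFlag;
std::unique_ptr<Schema> Schema::_instance;

int main()
{
  // Safe to call from multiple threads; construction happens exactly once.
  Schema& schema = Schema::getInstance();
  (void)schema;
  return 0;
}
```

A function-local static (Meyers singleton) would give the same thread-safe initialization in C++11 and later; either way, only construction is protected, which matches the conclusion above that the other thread-unsafe code is the real blocker.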
The data from #3923 is also very slow to conflate with Network.