sources_to_targets returns different results when splitting request
Valhalla sources_to_targets returns different results if the request is split up. In the example below, a 4 x 4 matrix returns different values than the same matrix requested as four 1 x 4 matrices. The difference in this case is in request 1, 3rd value: 975 seconds and 8.08 km instead of 1024 seconds and 9.188 km.
For each source->target connection Valhalla should return the same values every time, regardless of which other sources or targets are in the request.
test script
#!/bin/sh
url=https://valhalla1.openstreetmap.de/sources_to_targets
# One request 4x4 matrix
curl "$url" -o 4x4.json -X POST -H 'Content-Type: application/json' -d '{"costing": "auto","costing_options": {"auto": {"top_speed": 80}},"sources": [{"lon": 11.2126,"lat": 52.71203},{"lon": 11.07077,"lat": 52.87108},{"lon": 11.158565,"lat": 52.850734},{"lon": 11.158199,"lat": 52.851318}],"targets": [{"lon": 11.2126,"lat": 52.71203},{"lon": 11.07077,"lat": 52.87108},{"lon": 11.158565,"lat": 52.850734},{"lon": 11.158199,"lat": 52.851318}],"verbose": false}'
# Four requests 1x4 matrix
curl "$url" -o 1x4-0.json -X POST -H 'Content-Type: application/json' -d '{"costing": "auto","costing_options": {"auto": {"top_speed": 80}},"sources": [{"lon": 11.212600,"lat": 52.712030}],"targets": [{"lon": 11.2126,"lat": 52.71203},{"lon": 11.07077,"lat": 52.87108},{"lon": 11.158565,"lat": 52.850734},{"lon": 11.158199,"lat": 52.851318}],"verbose": false}'
curl "$url" -o 1x4-1.json -X POST -H 'Content-Type: application/json' -d '{"costing": "auto","costing_options": {"auto": {"top_speed": 80}},"sources": [{"lon": 11.070770,"lat": 52.871080}],"targets": [{"lon": 11.2126,"lat": 52.71203},{"lon": 11.07077,"lat": 52.87108},{"lon": 11.158565,"lat": 52.850734},{"lon": 11.158199,"lat": 52.851318}],"verbose": false}'
curl "$url" -o 1x4-2.json -X POST -H 'Content-Type: application/json' -d '{"costing": "auto","costing_options": {"auto": {"top_speed": 80}},"sources": [{"lon": 11.158565,"lat": 52.850734}],"targets": [{"lon": 11.2126,"lat": 52.71203},{"lon": 11.07077,"lat": 52.87108},{"lon": 11.158565,"lat": 52.850734},{"lon": 11.158199,"lat": 52.851318}],"verbose": false}'
curl "$url" -o 1x4-3.json -X POST -H 'Content-Type: application/json' -d '{"costing": "auto","costing_options": {"auto": {"top_speed": 80}},"sources": [{"lon": 11.158199,"lat": 52.851318}],"targets": [{"lon": 11.2126,"lat": 52.71203},{"lon": 11.07077,"lat": 52.87108},{"lon": 11.158565,"lat": 52.850734},{"lon": 11.158199,"lat": 52.851318}],"verbose": false}'
response 4x4
{
"sources_to_targets":{
"durations":[
[ 0, 2522, 1648, 1666],
[2510, 0, 1024, 1042],
[1625, 966, 0, 17],
[1645, 986, 17, 0]
],
"distances":[
[ 0.0, 27.881, 19.503, 19.574],
[27.832, 0.0, 9.188, 9.259],
[19.248, 8.044, 0.0, 0.071],
[19.319, 8.115, 0.071, 0.0 ]
]
},
"units":"kilometers",
"algorithm":"costmatrix"
}
response 0 1x4
{
"sources_to_targets":{
"durations":[
[ 0, 2522, 1648, 1666]
],
"distances":[
[0.0, 27.881, 19.503, 19.574]
]
},
"units":"kilometers",
"algorithm":"costmatrix"
}
response 1 1x4
{
"sources_to_targets":{
"durations":[
[2510, 0, 975, 1042]
],
"distances":[
[27.832, 0.0, 8.08, 9.259]
]
},
"units":"kilometers",
"algorithm":"costmatrix"
}
response 2 1x4
{
"sources_to_targets":{
"durations":[
[1625, 966, 0, 17]
],
"distances":[
[19.248, 8.044, 0.0, 0.071]
]
},
"units":"kilometers",
"algorithm":"costmatrix"
}
response 3 1x4
{
"sources_to_targets":{
"durations":[
[1645, 986, 17, 0]
],
"distances":[
[19.319, 8.115, 0.071, 0.0 ]
]
},
"units":"kilometers",
"algorithm":"costmatrix"
}
Different locations in the request make the differences appear at different places in the matrix. For example, removing the costing_options here changes the distance value in the 4x4 response from 19.319 to 19.317, while response 3 stays at 19.319. Conversely, the difference for request 1 shown above disappears.
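The row-by-row comparison above can be automated. The following is a minimal sketch, not part of the original test script: `matrix_row` and `compare_split` are hypothetical helper names, and the text-based extraction assumes the non-verbose response layout shown above, where each of `durations` and `distances` appears exactly once as an array of arrays. Rows are compared as strings, so a purely cosmetic formatting difference would also be reported.

```shell
#!/bin/sh
# matrix_row FILE KEY ROW
# Print row ROW (1-based) of the KEY matrix ("durations" or "distances")
# as a comma-separated list. Assumes KEY occurs once in the response.
matrix_row() {
    tr -d ' \n\t\r' < "$1" |
        sed -e "s/.*\"$2\":\[\[//" -e 's/]].*//' -e 's/],\[/;/g' |
        tr ';' '\n' |
        sed -n "${3}p"
}

# compare_split FULL PREFIX
# Compare row i of FULL with the single row of PREFIX<i>.json for
# i = 0, 1, ... until no such file exists. Returns 1 if any row differs.
compare_split() {
    status=0
    for key in durations distances; do
        i=0
        while [ -f "$2$i.json" ]; do
            full=$(matrix_row "$1" "$key" $((i + 1)))
            part=$(matrix_row "$2$i.json" "$key" 1)
            if [ "$full" != "$part" ]; then
                echo "$key row $i differs: full=[$full] split=[$part]"
                status=1
            fi
            i=$((i + 1))
        done
    done
    return $status
}
```

After running the test script, `compare_split 4x4.json 1x4-` prints one line per differing row and returns a non-zero status. For the responses as shown above, that would be row 1 of both matrices.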
I have fixed quite a number of bugs relating to trivial and semi-trivial matrix connections; this looks like there might be more. Have you tried narrowing it down to fewer locations? These are notoriously hard to debug, especially if the connection with the faulty results is itself non-trivial.
I tried to reduce the number of locations; here is a difference between a 1x1 and a 1x2 matrix at the first matrix value.
#!/bin/sh
url=https://valhalla1.openstreetmap.de
curl "$url/sources_to_targets" -X POST -H 'Content-Type: application/json' -d '{"costing": "auto","verbose": false, "sources": [{"lon": 11.158199, "lat": 52.851318}],"targets": [{"lon": 11.153061, "lat": 52.85342},{"lon": 11.153542, "lat": 52.854502}]}'
curl "$url/sources_to_targets" -X POST -H 'Content-Type: application/json' -d '{"costing": "auto","verbose": false, "sources": [{"lon": 11.158199, "lat": 52.851318}],"targets": [{"lon": 11.153061, "lat": 52.85342}]}'
{"sources_to_targets":{"durations":[[294,191]],"distances":[[0.892,0.785]]},"units":"kilometers","algorithm":"costmatrix"}
{"sources_to_targets":{"durations":[[284]],"distances":[[0.91]]},"units":"kilometers","algorithm":"costmatrix"}
Using Valhalla to calculate the route for that value gives time: 294.657 and length: 0.892, which matches the request with 2 targets:
curl "https://valhalla1.openstreetmap.de/route" -X POST -H 'Content-Type: application/json' -d '{"costing": "auto","locations": [{"lon": 11.158199, "lat": 52.851318},{"lon": 11.153061, "lat": 52.85342}]}'
IIRC there are some contextual details which influence the exact path, but I can't remember exactly. I think it's aborting the search based on something that's influenced by the max distance among all the pairs, or something like that.
I just tried to reproduce it with a local setup (using the same tiles as the public server and valhalla_service built from latest master) but got the same results across the two matrix requests and the route request (all give me a total time of 284).
But I am hitting something way weirder: retrying the requests on the public instance gives different results! Sometimes I get the route/matrix result that takes 294 seconds, sometimes the one that takes 284... I can't reproduce this on my machine using the exact same tiles and the same build configuration.
Update: also tried a Docker build to get a little closer to the public server setup, but it's still not reproducible. I'll keep this issue open because it's clearly occurring on the public server, but it's impossible to debug something I can't reproduce on a local build with a debug configuration.
It occurred for me on a private instance, and I also had trouble reproducing it with the same data on the public server; I had to try some other data, since the tiles and config are obviously not the same. Have you tried requesting a large matrix and then comparing it with the result of a request for a sub-matrix to see if it occurs at all?
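To put a number on how often the public instance flips between the two results, the same request can be repeated and the distinct responses tallied. This is a minimal sketch: `tally_outputs` is a hypothetical helper, and it assumes each response is a single line of compact JSON, as in the matrix responses quoted earlier in the thread.

```shell
#!/bin/sh
# tally_outputs N CMD [ARGS...]
# Run CMD N times and print each distinct output line with its count.
tally_outputs() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@"
        echo    # terminate the response in case it has no trailing newline
        i=$((i + 1))
    done | grep -v '^$' | sort | uniq -c
}
```

For example, `tally_outputs 20 curl -s "$url/sources_to_targets" -X POST -H 'Content-Type: application/json' -d '{"costing": "auto","verbose": false, "sources": [{"lon": 11.158199, "lat": 52.851318}],"targets": [{"lon": 11.153061, "lat": 52.85342}]}'` would show a single line with a count of 20 if the server answers deterministically, and more than one line otherwise.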
Just tried that and this time I can reproduce it consistently, both on the server as well as locally. I was investigating a similar issue recently, so I went with my gut feeling and increased kMinIterations in costmatrix.cc from 100 to 2000. I didn't check all the pairs, but this definitely fixes pair 1 to 2 (in the 4x4, so 0 to 2 in the 1x4 respectively). I've been advocating to increase the thresholds to keep looking for a better connection once a path has been found, since these thresholds are much more conservative than in the routing algorithms. Or at least make it configurable so that you can decide yourself how much performance you are willing to trade off against optimality.