vroom
vroom copied to clipboard
More unassigned when using 1 vehicles instead of 2, while creating always 1 route only
Hi,
I'm facing a pretty unexpected behaviour regarding unassigned jobs. I'm running two scenarios:
Scenario 1: I'm passing to Vroom 10 jobs and 1 vehicle only -> 9 jobs are assigned, 1 route is created and I get 1 unassigned. So far, so good. This scenario is included in the self-contained instance "my_problem_1_vehicle.json" attached.
Scenario 2: I'm passing to Vroom the exact same 10 jobs BUT 2 identical vehicles -> all 10 jobs are assigned and 1 route is created. This is pretty surprising, not because all 10 jobs are assigned, but because all 10 jobs are assigned to 1 route only (see self-contained instance "my_problem_2_vehicles.json" attached).
Moreover, if I run Vroom on the steps identified by scenario 2 in plan mode, I get no violation.
Data for the problem: instances.zip. I'm running v1.13.
What can be the issue? Is there a reason why passing 2 vehicles instead of 1 leads to a better solution?
TL;DR we "simply" provide a suboptimal solution on the first instance with a single vehicle.
Longer version:
Is there a reason why passing 2 vehicles instead of 1 leads to a better solution?
Yes, the reason is that we do not have a guarantee of optimality. Because a solution for scenario 2 that uses a single vehicle is also a solution for scenario 1, this should indeed not happen provided you always get optimal solutions from the solving engine.
I agree this is counterintuitive to get a better solution by just adding a vehicle that seems unused. What actually happens is:
- For scenario 1, the solution we get is much cheaper and "natural" if you look at the geometry, and the leftover job is not included to that plan due to TW constraints. So no matter what, our search process never manages to reach the "full solution" that is much more expensive and much more convoluted, in fact the solution 1 is actually already a heuristic solution, meaning it has not been improved by the local search. This is because most of our approach targets reducing the cost, while in this situation we'd have to reach to a much more expensive solution to be able to include the missing job.
- For scenario 2, the added vehicle is not actually unused: the heuristic solutions all use two vehicles in the first place. In that case, the setup is different and the local search is able to reach the 1-vehicle solution by reducing overall cost.
Hi,
Thanks for the answer, very clear. From an algorithmic perspective, I see why you say that solution 1 is "much cheaper and natural" than solution 2. However, for the use case I'm modelling the solution 2 is "much better" in reality: in other words, each unassigned job has a "hidden" cost that is part of the "real-word objective function".
Do you think there is a way to drive the heuristic/search algorithms to evaluate that solution 2 is better than solution 1?
Thanks!
Do you think there is a way to drive the heuristic/search algorithms to evaluate that solution 2 is better than solution 1?
It's not about evaluation, if we came across solution 2 while searching we would pick it as better because it has more jobs assigned. It is more about how to navigate the solution space to even reach that solution.
One option used in many approaches would be to allow for cost degradation in the search so you could "deteriorate" solution 1 until it becomes possible timewise to add the last unassigned job. The problem of this approach is that:
- it is the opposite of our current approach that is designed so that the local search will converge very fast to the closest local minima;
- it requires a lot of tuning for how much degradation you allow, for how long you allow to wander in poor areas of the solution space before stopping etc.
- there is still no guarantee solution 2 would be found!
The simplest idea I can think of without bending too much the current approach would be to play with route seeding rules: we have some rules for trying to insert jobs that are hard to fit in last, or once the routes are already packed, quite like in this example. This can include: seed an empty route with the furthest job first, or the job that has the tightest TW, or the highest amount etc. In your example none of those criteria work, but the unassigned job is kind of an outlier in term of geographic position. Maybe a seeding rule like "start with the job that is the more isolated" would work in that case.
Maybe a seeding rule like "start with the job that is the more isolated" would work in that case.
Hum, actually not sure about that really, I just tried bypassing the heuristic and forcing the unassigned job (id 6) in a route. Then the local search quickly swaps that (expensive) job for a cheaper one and we're back to the initial problem with an even worse solution: 2 unassigned!
My last comment actually gave me another idea: if you add a high priority to that unassigned job in the first place, then we manage to insert it and we reach the solution with all jobs assigned. This is because we have local search operators aiming at improving priority score regardless of the cost increase. Not sure how this can help in a general workflow.
Hi, thanks a lot for your help.
Yes, the problem with assigning a high priority to the unassigned job is that before solving the problem I don't that that specific job will be unassigned.
What if I assign a high priority to all jobs? In other words, is assigning priority=100 to all jobs generating an "algorithmic difference" compared to not assigning priorities? Would heuristics and local search follow a different path in those two scenario?
is assigning priority=100 to all jobs generating an "algorithmic difference" compared to not assigning priorities?
No, the priority difference across jobs would have an impact but you'd loose that if all jobs have the same priority. What you could try is to re-optimize after adding a high priority to unassigned jobs only to see if by any chance more tasks are assigned. But this really sounds a bit random.
ok, thanks for the suggestion, will try that.
Do you think you'll furtherly investigate this suboptimality to make it part of future developments?
Do you think you'll furtherly investigate this suboptimality to make it part of future developments?
Except for that single ticket, we have no other report of this kind of problem among our own API users, so frankly that would be very low priority at the moment.
@cv-or could you run your problem instance one time with v1.8 (if possible feature-wise)?
In my personal experience, v1.8 delivers more consistent (good, but not perfectly consistent) results (e.g. when changing only 1 job) considering unassigned jobs than v1.13 & v1.14, particularly when it comes to using complicated skills combinations.
v1.13 & v1.14 are generally faster and seem to have better (less km) results (especially for easy problems with few restrictions) but at the cost of oftentimes not assigning all jobs where v1.8 would assign more jobs.
The "balance" between the optimization objectives seems a little obfuscated/unclear across the versions here, like you cannot maneuver them easily and/or due to the heuristics used it's decided automatically/unintentionally.
Just for reference, VRP adds a really nice logic about optimization objectives. E.g. in the real world - and that goes I think for most problems out there - you'd always want to assign all jobs first. In the end, that's probably what you're paid for, regardless of the number of kilometers. Only second goes saving km and only third saving entire vehicles. For the latter it's not even so clear as using a new vehicle might have a fixed cost.
Long story short: I think this issue is part of a bigger problem with unassigned jobs.
Linking:
- https://github.com/VROOM-Project/vroom/issues/827
- https://github.com/VROOM-Project/vroom/issues/883
- https://github.com/VROOM-Project/vroom/issues/671
The initial post here was about getting different solutions with different input setups, not across versions which is a bit different.
The "balance" between the optimization objectives seems a little obfuscated/unclear across the versions here
This is definitely not obvious from the docs, but we try by order of importance to:
- maximize overall priority score (sum of priorities across assigned tasks).
- maximize number of assigned tasks.
- minimize total solution cost.
This objective hierarchy has always been consistent across versions.
v1.13 & v1.14 are generally faster [...] at the cost of oftentimes not assigning all jobs where v1.8 would assign more jobs
Hum, we have to be careful not to fall for survivorship bias here. Let's say for the sake of simplicity that regarding assigned jobs, solutions are either good (think optimal number of task assigned) or bad (suboptimal number of tasks assigned), then on any given instance:
- if v1.8 was good and v1.14 is still good, no-one will complain;
- if v1.8 was bad and v1.14 is still bad, no-one will complain either (except in the rare situation where there is a way to spot suboptimality as for this ticket);
- if v1.8 was bad and v1.14 is good, that's great but in practice no-one ever opens a ticket or calls you up in this situation.
If we only look at tickets and "regression" reports, we only see situations where v1.8 was good and v1.14 is now bad (can definitely happen because as we change the internals, we change the solving paths and potential solutions), but we miss the big picture.
My point here is we should analyze things carefully based on a meaningful set of instances. The closest I have at hand in the usual benchmarks are the CVRP instances where not all jobs are always assigned. Here is a quick summary of the comparison across versions and exploration levels for those instances.
| V1.8 x=0 | V1.14 x=0 | V1.8 x=1 | V1.14 x=1 | V1.8 x=2 | V1.14 x=2 | V1.8 x=3 | V1.14 x=3 | V1.8 x=4 | V1.14 x=4 | V1.8 x=5 | V1.14 x=5 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Assigned jobs | 44820 | 44813 | 44941 | 44941 | 44947 | 44948 | 44948 | 44951 | 44950 | 44952 | 44952 | 44954 |
| All jobs solutions | 158 | 157 | 168 | 168 | 169 | 170 | 169 | 170 | 169 | 171 | 170 | 173 |
| Optimal solutions | 13 | 12 | 22 | 23 | 27 | 27 | 34 | 38 | 38 | 43 | 42 | 45 |
Except for x=0, v1.14 consistently provides better results in term of assigned tasks (and in term of instances solved with all tasks). I'm not reporting average gap to optimal solutions (wrt "usual" cost) that has significantly decreased and average computing times that have been drastically reduced.
@do-me do you have a representative set of instances we could investigate to showcase the difference between v1.8 and v1.14?
On a more dev-related note, I've just started working on #874, which I hope will be a way to reach better solutions by diversifying the search. This may definitely have an impact on the "assigned tasks" objective.
Results: v1.8 vs v1.14 unassigned jobs
@jcoupey I finally managed to test quite some real-world instances with roughly 40 - 60 jobs and identify a dozen comparison cases.
Result: in 237 cases, v1.8 is 12x assigning more jobs than v1.14. v1.14 manages only one time to assign more than v1.8. In the other cases the versions behave equally.
This seems to confirm my initial thesis, so it's not survivorship bias. Unless we did something terribly wrong between the versions, I'd guess that v1.8 is simply better at assigning more jobs.
Please see the zip file attached with 13 problems, each with two jsons for v1.8 and v1.14 as the format changed. The instances in 0 dir are the ones where v1.14 wins, in all other v1.8 wins.
Unfortunately I wasn't able to measure the computation time.
Considering km I only used the cases where both versions allocated equally. Result: it's a draw. Most times equal, 2 wins for v1.8 and 2 wins for v1.14.
each with two jsons for v1.8 and v1.14 as the format changed
The changes were made in a backward-compatible way. That is if you do not badly need the distances matrix for solving, you can run v1.14.0 on the same instance (with the matrix top-level key). My guess is that is the case in the instances you shared since all vehicle.max_distance have a value of $2^{31} - 1$ so are never actual binding constraints and thus can be safely removed.
@do-me can you confirm this before I dig any further? I'd like to reproduce on the exact same files to rule out possible "external" errors.
Also I see you stripped out vehicles and jobs locations, probably for obvious privacy reasons. This makes it harder to look at solutions with e.g. https://github.com/VROOM-Project/vroom-scripts/blob/master/src/plot.py. Another option would be to obfuscate actual coordinates by adding the same random offset to all locations in an instance. Or share instances privately if that is an option?
Yes, the distances can be safely removed and have no importance in these instances. I can share some coordinates but prefer to send these via email.
While both problems raised in this ticket share the same symptoms (too many unassigned jobs), they are very different in nature:
- initial problem (first comments) is about reaching a better solution because of a different fleet setup that "helps" the whole search process;
- the concerns in the follow-up discussion with @do-me are more related to a regression in the heuristic behavior on instances with a complex skills setup.
In order to split the concerns, I've ticketed point 2 in #1193. Otherwise I'm inclined to close for point 1: except if we find evidence of this problem happening wildly at scale, I think we have to leave with it as an inevitable consequence of providing sub-optimal solutions.