rattler
feat: total ordering for the dependency provider
This should fix the total ordering issues by rewriting how ordering works for the case where the name, version, and build number are the same; let's call this set of candidates X.
The old method violated total ordering: even though a > b and b > c, it could happen that a < c, which breaks transitivity.
The new method compares the set of dependencies that are shared across X. From this shared set it devises a matrix that yields a total order for comparisons; more details are in the code. Ties are still broken by comparing the timestamp.
fixes: #888
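For illustration, here is a minimal sketch (hypothetical names, not the actual rattler code) of the key idea: when the comparison is derived from a fixed per-candidate key, such as ranks over the shared dependency set plus a timestamp tiebreak, transitivity holds by construction and the sort can never observe a non-total order.

```rust
use std::cmp::Ordering;

#[derive(Debug)]
struct Candidate {
    // Ranks of this candidate's dependencies, restricted to the dependency
    // names shared by every candidate in X (hypothetical representation).
    shared_dep_ranks: Vec<u32>,
    timestamp: u64,
}

// Derive the ordering from a fixed key: lexicographic over the shared
// dependency ranks, then newest timestamp first as the tiebreak.
fn total_cmp(a: &Candidate, b: &Candidate) -> Ordering {
    a.shared_dep_ranks
        .cmp(&b.shared_dep_ranks)
        .then(b.timestamp.cmp(&a.timestamp))
}

fn main() {
    let mut candidates = vec![
        Candidate { shared_dep_ranks: vec![1, 2], timestamp: 10 },
        Candidate { shared_dep_ranks: vec![1, 1], timestamp: 20 },
        Candidate { shared_dep_ranks: vec![1, 2], timestamp: 30 },
    ];
    // `sort_by` may panic (since Rust 1.81) if the comparator violates
    // total ordering; a key-derived comparison like this cannot.
    candidates.sort_by(total_cmp);
    println!("{candidates:?}");
}
```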
Are there any two packages that would have resulted in a non-total order before? If so, that might make a good test case.
I could try making traits to get the required fields and make a minimal test like the one wolf suggested. But that would delay the merge a bit. Let me know what you think!
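For context, the traits idea could look something like this sketch (entirely hypothetical, just to show the shape): abstracting the fields the sorter reads, so a test can feed it lightweight fake records instead of full repodata records.

```rust
// Hypothetical sketch of the traits idea: the sorter functions would be
// written against this trait so tests can supply minimal fake records.
trait SortableRecord {
    fn name(&self) -> &str;
    fn version(&self) -> &str;
    fn build_number(&self) -> u64;
    fn timestamp(&self) -> Option<u64>;
    fn depends(&self) -> &[String];
}
```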
It's not possible to make a vector and call sort on it in a test?
I think you may be assuming the problem was in one of the trait implementations, but it wasn't; a test needs to touch the code in the sorter functions.
Note that the code that was panicking before is still being run (and is not panicking now). This is essentially what you are proposing, albeit not in minimal form.
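For reference, the minimal test being discussed would have roughly this shape (all names hypothetical); the catch, as noted above, is that the comparator must be the real sorter function rather than a stand-in like the one sketched here.

```rust
#[derive(Debug)]
struct Candidate {
    build_string: String,
    timestamp: u64,
}

// Stand-in comparator; the real test would call into the sorter functions.
fn compare_candidates(a: &Candidate, b: &Candidate) -> std::cmp::Ordering {
    a.build_string
        .cmp(&b.build_string)
        .then(b.timestamp.cmp(&a.timestamp))
}

#[test]
fn sorting_candidates_does_not_panic() {
    // Candidates with the same name/version/build number, the case that
    // previously triggered the panic. Since Rust 1.81, `sort_by` may panic
    // when the comparator violates total ordering, so simply sorting
    // exercises the invariant.
    let mut candidates = vec![
        Candidate { build_string: "py39_0".into(), timestamp: 1 },
        Candidate { build_string: "py310_0".into(), timestamp: 2 },
        Candidate { build_string: "py39_0".into(), timestamp: 3 },
    ];
    candidates.sort_by(compare_candidates);
}
```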
I've made most of the changes. I'm not sure the added complexity of the traits would be worth it; I could work on that in a separate PR if needed, so it doesn't block this change and the homebrew error fix. If we can think of an easy integration test to add, I could add that for sure.
cc @wolfv @baszalmstra
Perhaps we can make a simpler integration test by loading our sample repodata.json, sorting specific packages like libuuid and pytorch, and creating some snapshots out of that?
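Such a snapshot test might look roughly like this (helper names are hypothetical; assumes a snapshot crate such as insta):

```rust
#[test]
fn snapshot_pytorch_candidate_order() {
    // Load the sample repodata shipped with the test suite
    // (`load_sample_repodata` and `candidates_for` are hypothetical helpers).
    let repodata = load_sample_repodata("tests/data/repodata.json");
    let mut candidates = repodata.candidates_for("pytorch");

    // Sort with the comparator under test.
    candidates.sort_by(|a, b| compare_candidates(a, b));

    // Snapshot the resulting order so any future change to the sorting
    // logic shows up as a reviewable diff.
    let order: Vec<String> = candidates.iter().map(|c| c.to_string()).collect();
    insta::assert_debug_snapshot!(order);
}
```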
How is that better than the comparison with libsolv that we are currently doing?
That tests whether we solve similarly, which indeed requires sorting but doesn't necessarily hit these edge cases. Besides that, I think it would be nice to have a test that highlights how we sort similar packages. Having a snapshot that shows how all pytorch variants are sorted, for instance, would be quite useful and would help guide any future changes.
One more final thing: Could you also run the rattler_solve benchmarks to see if we didn't accidentally introduce a performance regression?
Thanks, makes sense, will do both. Let me add those tests on main first in a separate PR, so we can make sure the snapshots stay the same.
Ok, unfortunately, it seems to decrease performance by quite a margin. One example (but it's across the board):
old:
solve quetz/resolvo time: [187.60 ms 188.11 ms 188.71 ms]
new:
solve quetz/resolvo time: [217.02 ms 219.35 ms 221.94 ms]
I improved the performance a tiny bit, but I think we need to try to drop most of the HashMaps and work on index offsets and pre-allocated Vecs with integer slots directly. @baszalmstra wdyt? Can we look at this on Monday?
Caching across solves might improve things as well, at least that's what I'd reckon.
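A minimal sketch of the dense-index pattern being proposed (hypothetical types, not the actual code): candidate ids are assumed to be dense integers handed out by the dependency provider, so a pre-allocated Vec can replace a HashMap in the hot comparison path, trading hashing for direct slot access.

```rust
// One slot per candidate, pre-allocated once per solve so the hot
// comparison path never hashes or reallocates.
struct DepRanks {
    ranks: Vec<Option<u32>>,
}

impl DepRanks {
    fn with_capacity(num_candidates: usize) -> Self {
        Self { ranks: vec![None; num_candidates] }
    }

    fn set(&mut self, candidate_id: usize, rank: u32) {
        self.ranks[candidate_id] = Some(rank);
    }

    fn get(&self, candidate_id: usize) -> Option<u32> {
        self.ranks[candidate_id]
    }
}
```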