libloot icon indicating copy to clipboard operation
libloot copied to clipboard

Rework how group edges are added

Open Ortham opened this issue 1 year ago • 5 comments

I've been making a lot of changes to how sorting works in order to speed it up, and so far sorting using the latest abi-changes branch build is about 290x faster than sorting using the v0.18.2 release (0.18.3 is about 5x faster). Most of the changes can cause libloot to produce different load orders in ways that shouldn't cause any problems, but I like to avoid doing that often, just in case.

Group handling has changed to track whether a group edge involved user metadata or not, and its vertex lookup got optimised, but the general approach it takes hasn't changed, and that's something I want to revisit for two reasons:

  1. Complexity of the current approach
  2. Potential performance improvements

Now's a good time to do this, because changes will probably affect the resulting load order, and so it would be good to lump them in with all the others that change the load orders that libloot produces.

Complexity

The current implementation for adding group edges grew organically after originally being implemented as an extension of "load after" metadata handling, and I think the code is pretty hard to understand and overcomplicated.

Also, unlike the new tie-breaking algorithm, which tries to enforce the old load order and otherwise minimally deviate from it, there's no real goal it has that can help explain why it chooses to add edges in the order it does, or skip those that it does, so it's hard to understand why it does what it does.

Performance

Following on from loot/loot#1370, adding edges between overlapping plugins is now usually the slowest part of sorting, but checking for overlaps is relatively fast (I tested separating the checking from the adding of edges: with my test load order, the overlap section takes 11 seconds, < 1 of which is checking for overlaps), the slow bit is checking for paths going in the other direction to avoid cycles.

Checking for paths can be sped up by reducing the number of edges in the graph. That can be done by not adding edges where there's already a path in the same direction, but that involves checking for a path and my testing suggests it doesn't do enough to counter its own performance hit. The other way to reduce edges is to try adding fewer edges to begin with. Edges overwhelmingly come from two sources: overlaps, and groups.

I can't see how to try adding fewer overlap edges, but the way group edges are added at the moment seems obviously non-optimal: each plugin gets an edge added to every plugin in every group that transitively loads before the group the plugin is in. In many of those cases, that's unnecessary, because the plugins it transitively loads after will also have edges to the plugins they load before. If no edges were skipped and all groups had plugins, you'd only need to add edges between a plugin and the plugins in the groups that its group directly loads after.

Ortham avatar Jan 05 '23 17:01 Ortham