Wilfred Hughes
Wilfred Hughes
Yeah, this is sometimes called a "slider", and difftastic isn't completely immune. I have been thinking about trying to nudge the diff logic to avoid blank lines, but it might...
This would probably make `split_mostly_unchanged_toplevel` unnecessary, as it's treating two lists as a special case.
Autochrome is closer to this: it does individual toplevel nodes pairwise to find the closest.
Finding a 'nodes are sufficiently similar after this point' for batching would be super effective.
This could go pairwise, treating pairs of sufficiently-similar nodes as worth processing in isolation, especially if they have common unique nodes.
Floyd-Warshall finds the shortest path to all nodes, and I don't want to materialise all nodes.
Common Misconceptions Concerning Heuristic Search Robert C. Holte https://www.aaai.org/ocs/index.php/SOCS/SOCS10/paper/viewFile/2073/2500
Fringe search seems to be slower (see the branch named fringe_search), although as with A* it seems to be exploring more nodes than my Dijkstra implementation, which is fishy.
After a71d6118c it looks like the execution time is primarily from graph node construction, so changing the routing algorithm won't really help.
The current graph limit is 3,000,000 nodes, and the largest sample file handled is 1.3M. Choosing a limit of e.g. 500,000 or smaller would only re-run Dijkstra a few times.