Group (topological) branches in log output
Description
Parallel branches are interleaved in the (ASCII-)graphical log output, making it hard to read.
Actual Behavior
o 2e4dc019d943 e8c23f0ee5e2 [email protected] 2021-10-20 14:21:01.000 -07:00
: git: don't update public heads for now
: o 77b94b593d68 bb3b69138188 [email protected] 2021-10-13 12:29:47.000 -07:00
: | git: move logic for adding remote to git module
: | o cfd9255b94d3 c48b95a17bdf [email protected] 2021-10-13 16:17:40.000 -07:00
: |/ index: start indexing file-level conflicts
:/|
o | ca796b49f06f b4323a1a35b6 [email protected] 2021-10-13 20:38:39.000 -07:00
:/ docs: describe how to set up commmand-line completion
o c0a26f76426e 543967befc73 [email protected] 2021-10-13 11:08:01.000 -07:00
Expected Behavior
It's much easier to read like this:
o 2e4dc019d943 e8c23f0ee5e2 [email protected] 2021-10-20 14:21:01.000 -07:00
: git: don't update public heads for now
: o cfd9255b94d3 c48b95a17bdf [email protected] 2021-10-13 16:17:40.000 -07:00
:/ index: start indexing file-level conflicts
o ca796b49f06f b4323a1a35b6 [email protected] 2021-10-13 20:38:39.000 -07:00
: docs: describe how to set up commmand-line completion
: o 77b94b593d68 bb3b69138188 [email protected] 2021-10-13 12:29:47.000 -07:00
:/ git: move logic for adding remote to git module
o c0a26f76426e 543967befc73 [email protected] 2021-10-13 11:08:01.000 -07:00
I think the idea is to print all commits on one side of a fork point before the other side(s) of the fork point. The fork points in this graph are ca796b49f06f and c0a26f76426e, although only the latter matters in this case, because there is only one commit on each side of the fork at ca796b49f06f.
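To make the idea concrete, here is a minimal Python sketch, not jj's implementation. It assumes the whole graph is already in memory as a hypothetical parents mapping, treats the dotted (indirect) edges above as ordinary edges, and runs a post-order DFS over children so that everything on one side of a fork is emitted before the other side starts. The name grouped_order is made up for illustration.

```python
from collections import defaultdict


def grouped_order(parents):
    """Return commits children-first, finishing one side of a fork before the next."""
    # Invert the parent links so we can walk from roots towards heads.
    children = defaultdict(list)
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            children[parent].append(commit)

    seen = set()
    order = []

    def visit(commit):
        if commit in seen:
            return
        seen.add(commit)
        # Finish each child's whole subtree before moving on to the next child.
        for child in children[commit]:
            visit(child)
        # Post-order: a commit is emitted only after all of its descendants.
        order.append(commit)

    for root in [c for c, ps in parents.items() if not ps]:
        visit(root)
    return order


# Edges read off the example graph (assumption: dotted edges count as edges).
PARENTS = {
    "2e4dc019d943": ["ca796b49f06f"],
    "cfd9255b94d3": ["ca796b49f06f"],
    "ca796b49f06f": ["c0a26f76426e"],
    "77b94b593d68": ["c0a26f76426e"],
    "c0a26f76426e": [],
}

for commit in grouped_order(PARENTS):
    print(commit)
```

On the example graph this prints the commits in the same order as the expected output above. It is recursive and needs the full child map up front, so it is only meant to pin down the desired order, not to suggest how to compute it on a large repo.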
Specifications
- Platform: All
- Version: 0.4.0
Here's a description of how Git does it: https://github.blog/2022-08-30-gits-database-internals-ii-commit-history-queries/#topological-sorting
Sorry, I haven't read through that GitHub blog post in detail, so maybe I have this wrong, but it seems to me that a simple DFS-like approach would work well (enough) here. I personally find this much easier to read:
o c0a26f76426e 543967befc73 [email protected] 2021-10-13 11:08:01.000 -07:00
|
|...o ca796b49f06f b4323a1a35b6 [email protected] 2021-10-13 20:38:39.000 -07:00
| | docs: describe how to set up commmand-line completion
| |
| |---o cfd9255b94d3 c48b95a17bdf [email protected] 2021-10-13 16:17:40.000 -07:00
| | index: start indexing file-level conflicts
| |
| |...o 2e4dc019d943 e8c23f0ee5e2 [email protected] 2021-10-20 14:21:01.000 -07:00
| git: don't update public heads for now
|
>---@ 77b94b593d68 bb3b69138188 [email protected] 2021-10-13 12:29:47.000 -07:00
git: move logic for adding remote to git module
I originally did not differentiate between children and descendants, but later shoehorned that in with --- vs ... respectively. I personally don't have much use for that distinction, but I'm not sure whether other people do.
Also, I'm not really sure how jj handles those relations in code, but in the DFS we could simply prefer to render the most recently touched subtrees towards the end (or the beginning, if you don't reverse the output). The working copy would get special preference and always count as most recently touched, so it is rendered last for better visibility. Thoughts?
To make the above log more compact (like jj log is now):
o c0a26f76426e 543967befc73 [email protected] 2021-10-13 11:08:01.000 -07:00
|...o ca796b49f06f b4323a1a35b6 [email protected] 2021-10-13 20:38:39.000 -07:00
| | docs: describe how to set up commmand-line completion
| |---o cfd9255b94d3 c48b95a17bdf [email protected] 2021-10-13 16:17:40.000 -07:00
| | index: start indexing file-level conflicts
| |...o 2e4dc019d943 e8c23f0ee5e2 [email protected] 2021-10-20 14:21:01.000 -07:00
| git: don't update public heads for now
>---@ 77b94b593d68 bb3b69138188 [email protected] 2021-10-13 12:29:47.000 -07:00
git: move logic for adding remote to git module
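A rough Python sketch of the DFS with recency preference described above, under a few assumptions of mine: the timestamps from the log stand in for "most recently touched", the graph is fully in memory, and the names render_order, TOUCHED and PARENTS are hypothetical. It uses a reversed post-order so that parents still come before children even if there are merges.

```python
from collections import defaultdict
from datetime import datetime


def render_order(parents, touched, working_copy):
    """Order commits root-first, with recently touched subtrees rendered last."""
    children = defaultdict(list)
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            children[parent].append(commit)

    keys = {}

    def subtree_key(commit):
        # (subtree contains the working copy, newest timestamp in the subtree)
        if commit not in keys:
            has_wc = commit == working_copy
            newest = touched[commit]
            for child in children[commit]:
                child_wc, child_newest = subtree_key(child)
                has_wc = has_wc or child_wc
                newest = max(newest, child_newest)
            keys[commit] = (has_wc, newest)
        return keys[commit]

    seen = set()
    postorder = []

    def visit(commit):
        if commit in seen:
            return
        seen.add(commit)
        # Visit the highest-priority subtree first; it ends up last after reversal.
        for child in sorted(children[commit], key=subtree_key, reverse=True):
            visit(child)
        postorder.append(commit)

    roots = [c for c, ps in parents.items() if not ps]
    for root in sorted(roots, key=subtree_key, reverse=True):
        visit(root)
    return postorder[::-1]


PARENTS = {
    "2e4dc019d943": ["ca796b49f06f"],
    "cfd9255b94d3": ["ca796b49f06f"],
    "ca796b49f06f": ["c0a26f76426e"],
    "77b94b593d68": ["c0a26f76426e"],
    "c0a26f76426e": [],
}
# Timestamps copied from the log above (time zone dropped for brevity).
TOUCHED = {
    "2e4dc019d943": datetime(2021, 10, 20, 14, 21, 1),
    "cfd9255b94d3": datetime(2021, 10, 13, 16, 17, 40),
    "ca796b49f06f": datetime(2021, 10, 13, 20, 38, 39),
    "77b94b593d68": datetime(2021, 10, 13, 12, 29, 47),
    "c0a26f76426e": datetime(2021, 10, 13, 11, 8, 1),
}

for commit in render_order(PARENTS, TOUCHED, working_copy="77b94b593d68"):
    print(commit)
```

With 77b94b593d68 as the working copy, this prints the commits in the same top-to-bottom order as the compact log above. Without the working-copy preference, 77b94b593d68 would come right after the root, because its subtree is the least recently touched.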
The code is in revset_graph_iterator.rs. What makes it complicated is the laziness: we shouldn't walk the whole graph before we start displaying output. That would be fine in small repos, but it would make jj log unusably slow on very large repos. That's also why Git uses the algorithm described in that blog post.
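To illustrate the laziness constraint, here is a rough Python sketch (hypothetical, and not the code in revset_graph_iterator.rs). It is a generator that pops the commit with the highest index position from a heap and only looks up the parents of commits it has already emitted. The POSITION values are made up; the only assumption is that a child always has a higher position than its parents.

```python
import heapq
from itertools import islice


def lazy_walk(heads, parents_of, position):
    """Yield commits by descending position, loading parents only on demand."""
    queue = [(-position(head), head) for head in heads]
    heapq.heapify(queue)
    queued = set(heads)
    while queue:
        _, commit = heapq.heappop(queue)
        yield commit
        # Parents are looked up only when the caller asks for more output.
        for parent in parents_of(commit):
            if parent not in queued:
                queued.add(parent)
                heapq.heappush(queue, (-position(parent), parent))


PARENTS = {
    "2e4dc019d943": ["ca796b49f06f"],
    "cfd9255b94d3": ["ca796b49f06f"],
    "ca796b49f06f": ["c0a26f76426e"],
    "77b94b593d68": ["c0a26f76426e"],
    "c0a26f76426e": [],
}
# Hypothetical index positions: children always rank above their parents.
POSITION = {
    "c0a26f76426e": 0,
    "ca796b49f06f": 1,
    "cfd9255b94d3": 2,
    "77b94b593d68": 3,
    "2e4dc019d943": 4,
}
HEADS = ["2e4dc019d943", "77b94b593d68", "cfd9255b94d3"]

loaded = []  # records which commits had their parents looked up


def parents_of(commit):
    loaded.append(commit)
    return PARENTS[commit]


first_two = list(islice(lazy_walk(HEADS, parents_of, POSITION.__getitem__), 2))
print(first_two)
print(loaded)
```

Taking only the first two entries looks up the parents of a single commit, which is the property we want to keep. The downside is visible too: with these positions, a full walk produces the interleaved order from the issue description, so any grouping has to be layered on top without giving up this on-demand behavior.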
Segmented changelog (or similar) might help the scalability issue? https://raw.githubusercontent.com/facebook/sapling/main/eden/scm/slides/201904-segmented-changelog/segmented-changelog.pdf
Just for reference. I don't know if it applies here, but it contains some readable graph examples saying "use depth-first search" to assign numbers.
> Segmented changelog (or similar) might help the scalability issue?
Hmm, I think that's a good point whether or not we use segmented changelog. Unlike Git [^1], we could quite efficiently find common ancestors using the commit index. Maybe that would perform well enough and be simpler, but I haven't thought much about it.
[^1]: Before they added their commit graph anyway, which came long after their algorithm for ordering commits nicely in the graph, I think.
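As a rough illustration of what finding a common ancestor by index position could look like, here is a Python sketch. It does not use jj's actual index API: common_ancestor, PARENTS and POSITION are hypothetical, and the only assumption is that a child always has a higher position than its parents. It walks from both commits in descending position order and returns the first commit reached from both sides.

```python
import heapq


def common_ancestor(a, b, parents, position):
    """Return the highest-position commit reachable from both a and b, or None."""
    side_a, side_b = 1, 2
    flags = {a: side_a}
    flags[b] = flags.get(b, 0) | side_b
    queue = [(-position[commit], commit) for commit in flags]
    heapq.heapify(queue)
    while queue:
        _, commit = heapq.heappop(queue)
        # All descendants of this commit were popped earlier, so its flags are final.
        if flags[commit] == side_a | side_b:
            return commit
        for parent in parents[commit]:
            if parent not in flags:
                flags[parent] = 0
                heapq.heappush(queue, (-position[parent], parent))
            # Propagate "reachable from a" / "reachable from b" to the parent.
            flags[parent] |= flags[commit]
    return None


PARENTS = {
    "2e4dc019d943": ["ca796b49f06f"],
    "cfd9255b94d3": ["ca796b49f06f"],
    "ca796b49f06f": ["c0a26f76426e"],
    "77b94b593d68": ["c0a26f76426e"],
    "c0a26f76426e": [],
}
# Hypothetical index positions: children always rank above their parents.
POSITION = {
    "c0a26f76426e": 0,
    "ca796b49f06f": 1,
    "cfd9255b94d3": 2,
    "77b94b593d68": 3,
    "2e4dc019d943": 4,
}

print(common_ancestor("2e4dc019d943", "cfd9255b94d3", PARENTS, POSITION))
print(common_ancestor("2e4dc019d943", "77b94b593d68", PARENTS, POSITION))
```

For the example graph this finds the two fork points mentioned in the issue description: ca796b49f06f for the first pair and c0a26f76426e for the second. Whether something like this is fast enough to drive the log ordering is exactly the open question.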
It sounds like this function might be useful. It's used by smartlog rendering (via the sort(..., topo) revset).
It's a bit more aggressive than the old sort(topo) implementation: it sorts branches forking off the same commit by length, and it sorts branches that merge too. It might also be a bit more flexible: if you want to sort branches by date, that is probably doable via priorities.