FR: `jj squash --restore-descendants`
Is your feature request related to a problem? Please describe.
A common request during code review is to split a PR/CL into multiple ones. In that case the goal is two or more changes with the same final content.
jj split is the first step in that direction. Sometimes though we need to introduce a bit of conflict to make the intermediate change build or make sense.
The natural flow is to jj new on the intermediate change, make the changes, jj squash, and then resolve manually the conflicts in the descendant, maybe checking there is no diff from the previous commit.
This is a waste of manual effort though! We know exactly how to resolve the conflicts: by restoring the contents of the original commit.
Describe the solution you'd like
jj squash --restore-descendants which preserves the descendant contents, instead of their diff.
Describe alternatives you've considered
jj diffedit --restore-descendantson the intermediate change does exactly what we need, but it's a diff editor environment, not a working copy.jj restore --from @ --to @- --restore-descendantsshould do the same operation, but is much less intuitive.- Resolving the conflict with
jj restore --from OLDCOMMITworks but is an extra step.
Related FR: https://github.com/jj-vcs/jj/issues/4708.
I'm currently thinking of implementing these two flags for squash:
--restore-target-descendants: Preserve the content (not the diff) when rebasing descendants of the target commitIf the target commit is a descendant of the source commit, this will have no effect.
[aliases:
restore-descendants]
--restore-source-descendants: Preserve the content (not the diff) when rebasing descendants of the source commit (less commonly used)When using
-r, ~~or if the target commit is an ancestor of the source commit,~~ this will have no effect. Update: See below, https://github.com/jj-vcs/jj/issues/6000#issuecomment-2739083929
I think --restore-target-descendants is more useful, hence it is aliased to --restore-descendants.
I have a toy implementation that I'll share unless there are obvious bugs, but I still need to write a lot of tests.
A situation when both flags matter is when restoring into a sibling, e.g. jj squash --from C --into B in the tree:
X
|
B
| Y
| |
| C
| /
A
The first option controls whether the snapshot of X is restored, while the second option concerns the snapshot of Y.
One reason I wrote this up is that the --restore-source-descendants option is a bit weird. It is particularly weird for jj squash --from A --into C if C is a descendant of A, e.g.
X -> A -> B -> C
This should keep B's snapshot unchanged. So, C will get A's changes twice. Depending on how we apply the "same-change rule", this either means it should be left unchanged, or it means that it should become conflicted.
I'd probably just prohibit --restore-source-descendants in this case.
The main reason I thought of --restore-source-descendants is that it seems even weirder if --restore-descendants toggled both --restore-target-descendants --restore-source-descendants. For example, that would also include the weirdness from the case I described above in this comment.
Creating --restore-source-descendants is a way to document and emphasize that --restore-descendants does not imply restoring the source's descendants.
--restore-source-descendants also feels like a natural operation to make; I wonder if others have ideas for use-cases for it.
But if somebody has a better idea, let me know!
Another weird test case is jj squash --from C --into A --restore-target-descendants for
X -> A -> B -> C -> D
It would modify D and A's snapshots, but not B's.
I suppose that jj squash --from C --into A --restore-target-descendants --restore-source-descendants would modify only A's snapshot. So, this weird case contradicts the reasonable-sounding statement I made above (in the draft docs) that "if the target commit is an ancestor of the source commit, --restore-source-descendants will have no effect".
I can vaguely imagine squash --restore-target-descendants into an ancestor being useful, but perhaps I'd forbid this too at first, to simplify the implementation. Not sure.
Meanwhile, jj squash --from C --into A --restore-source-descendants is equivalent to jj squash --from C --into A (neither of these change D's snapshot and in both cases B's snapshot becomes what C's used to be).
I didn't follow most of this discussion of options, probably because of use of the word restore. I can't seem to get a mental model of what is happening, and the long explanations using different verbs seems to indicate that the name doesn't match what you are describing. I think of restore and preserve as conflicting, during a squash.
Perhaps --keep-target-content or --keep-diff or --no-rebase or something else that I can't tell you are trying to convey.
Re naming, I haven't thought about it much recently, but I think there was a discussion about this on Discord recently that I'm meaning to look at. Apart from that, and wishing for something better, I'm still where I was in https://github.com/jj-vcs/jj/discussions/5951#discussioncomment-12443322; the only new thought that comes to mind (and I'll add there) is --keep-descendant-snapshots. Then, the two other options I discussed would be --keep-source-descendant-snapshots and --keep-target-descendant-snapshots.
Re my thoughts above, I'm starting to think that maybe it's better to have --restore-descendants preserve the snapshots for both source and target descendants (and therefore only changes the snapshots of two commits), since that flag would always make sense. Then, we could have --restore-target-descendants that is only valid when the target is not an ancestor of the source. Similarly, we'd have --restore-source-descendants, and at most one of the three options could be used at a time. (Update: Or maybe we'd forget both of the latter to options if either the source is ancestor of the target or vice-versa. In the "good" direction for either option, it'd be a no-op anyway).
It still feels weird to not make --restore-target-descendants the default, but perhaps I'll get used to it.
Apart from that, I was playing with my toy implementation, and it seems very useful. Hopefully, I'll publish it soon.
In fact, if --restore-descendants preserves everything, one can simulate both --restore-target-descendants and --restore-source-descendants quite easily in the situation when these operations are sane:
W
| Z
Y |
| X
| /
A
jj squash --from X --into Y --restore-target-descendants can be simulated by jj rebase -r X -d X-; jj squash --from X --into Y --restore-descendants.
jj squash --from X --into Y --restore-source-descendants can be simulated with ~~jj duplicate X; jj squash --from Xcopy --into Y~~ Update: Actually, it's one operation longer: jj duplicate X; jj squash --from Xcopy --into Y; jj squash -r Z. Once it's implemented, jj rebase -r X -d X- --restore-descendants; jj squash --from X --into Y will be better.
Since naming is hard, does it make sense to add a command/option that restores tree contents from old operation or predecessor, and deprecate --restore-descendants? (something like jj restore -r @..<head> --from-op=@-)
I occasionally end up with conflicts which could be avoided by --restore-descendants, so it's nice if we can fix them up easily after the fact. It will also work if you're editing commit that has descendants.
TLDR: I'm unsure.
First thought: maybe, but it depends on how nice we can make the opset language. @- might not always be the operation you'd expect in concurrent settings. "The last operation that touched this commit" or something from the evolog would be better, though also not 100% reliable; it could always be some unpleasant "merge of concurrent operations". (It may be reliable enough in practice, I'm unsure)
I can also imagine a world where each operation prints it's opid, or there is some other UI that makes it trivial to refer back to the operation you care about, and you'd request the state from the beginning of operation X. I'm not sure a CLI can provide such a UI, really.
Some other version of a more modular and powerful way to specify these kinds of operations, as suggested by @arxanas (TODO: find link), might help too.
Really, "restore the descendants form their state right before the operation" is exactly the kind of thinking the name --restore-descendants came from, and I wonder whether the implied time-travel is part of the reason people find it confusing.
In the world as it is now, it seems to me that a flag like this, whatever it's called, is more convenient. But again, I'm not sure.
Of course, we might also come up with a better name. While I think preserve-descendant-snapshots" (or "-content") might be worth trying and an improvement, I am kind of on the fence about whether it is good enough for the long term, and the fact that you don't immediately feel it measures up is not a great sign.
Aside: here's a write-up I am wondering about putting into the commit as a GHC-style note. I'm posting it here to mainly refer to it from Discord.
Update: wow, it is not formatted well here. Oh well. You can "Quote reply" or "edit" to see the source.
Update 2: Or I can just force it to print the source
> NOTE: Not implementing `--preserve-{target,source}-descendant-content`
> -----------------------------------------------------------------------
>
> We have `jj squash --preserve-descendant-content --from X --into Y` preserve
> the snapshots of both the descendants of `X` and those of the descendants of
> `Y`. There are a few reasons we do *not* have a flag that preserves *only*
> the descendants of the source (or *only* of the target) such as
> `--preserve-target-descendant-content` or
> `--preserve-source-descendant-content`, even though they might seem easy to
> implement at a glance.
>
> (The same argument applies to `jj rebase --preserve-???-descendant-content`.)
>
> Firstly, these flags seem to only be useful in unusual cases. In those cases,
> they could be simulated. Instead of `squash --preserve-target-content`, you
> could do `jj rebase -r X -d all:X-; jj squash --preserve-descendant-content
> --from X --into Y`. Instead of `squash --preserve-source-content`, you could
> do `jj duplicate -r X; jj squash --preserve-descendant-content --from
> copy_of_X --into Y; jj abandon --preserve-descendants X`. (TODO: When `jj
> restore --preserve-descendants` is implemented, this will become 2 commands).
>
> Secondly, the behavior of these flags would get confusing in corner cases,
> when the target is an ancestor or descendant of the source. More
> specifically, for the commit graph `X -> A -> B -> C -> D`, one confusing
> case is `jj squash --from C --into A --preserve-target-content`. Unlike
> every other example of squashing into an ancestor, this would have to modify
> the snapshot of `D` if `C` is non-empty. Even more confusingly, this should
> be independent of the number of commits between `A` and `C`. So, for
> consistency, when squashing into a direct child, `jj squash --from C --into B
> --preserve-target-content` would *also* have to modify the snapshot of `D`
> even though, from a more naive perspective, `D` will seem (after the
> operation) to be exactly the descendant we asked to preserve.
>
> If the user really wants this behavior, performing a rebase-then-squash
> will make more sense to them.
>
> For reference, the other confusing example for that commit graph is `jj
> squash --from A --into C --preserve-source-content`.
>
> Overall, it seems easier, both in terms of a mental model and of testing, to
> simply not provide these special flags and let the user explain their
> intention exactly by using simpler commands if they need this kind of
> behavior.
Really, "restore the descendants form their state right before the operation" is exactly the kind of thinking the name
--restore-descendantscame from, and I wonder whether the implied time-travel is part of the reason people find it confusing.
Perhaps, it's hard to understand because this flag controls the implicit behavior which the user shouldn't normally care. Almost all jj commands do rebase descendants, so I don't pay much attention to that part. The user wouldn't have to know there was a rebase operation (because our internal data model is snapshot-based), I think.
Perhaps, it's hard to understand because this flag controls the implicit behavior which the user shouldn't normally care.
Yes, this! I agree that having a separate command would be useful in more situations and because the option is too implementation dependent.
Perhaps, it's hard to understand because this flag controls the implicit behavior which the user shouldn't normally care. Almost all
jjcommands do rebase descendants, so I don't pay much attention to that part.
From this perspective, we could rename the option to --restore-children or --preserve-children-snapshots to emphasize that the flag only affects the children of the modified commits. So, the operation is still local, but instead of affecting 2 commits, jj squash --restore-children affects 2 commits + their children. The fact that other commits have their snapshot preserved is the consequence of normal rebase rules.
I previously thought it would simplify things to put the consequence that all descendant's snapshots are preserved in the name, but maybe not. (I'm not sure one way or another at the moment)
Stepping back for a moment, while I'm not sure this kind of flag is the best solution for the very long term, I do want to say that I think it'd be very useful to have now.
Let me recall some reasons for this, partially because I'm quite excited about finishing implementing this for squash and I want to know if people have serious reservations.
--restore-descendants (under any name) seems to address many use-cases that are difficult or tedious to accomplish otherwise, especially once it's implemented for all the relevant commands (squash, rebase, and duplicate in addition to the places it's already implemented). For example:
- https://github.com/jj-vcs/jj/pull/4097
- https://jj-vcs.github.io/jj/latest/FAQ/#i-accidentally-changed-files-in-the-wrong-commit-how-do-i-move-the-recent-changes-into-another-commit and its variations
- "Fun facts" section of https://github.com/jj-vcs/jj/issues/3419#issuecomment-2746823520
- This issue. A common case for
jj squash --restore-descendantsis resolving conflicts after reordering commits. It's also weird to havejj diffedit --restore-descendantsbut not be able to dojj new X; <edit files>; jj squash --restore-descendants.
I've certainly missed having this flag many times, and now that jj squash --restore-descendants works in my personal jj, I use it.
Moreover, after my long conversation with myself above, it seems that the cost is minimal: one flag for a few commands, and probably two flags for rebase and duplicate. I also think this one flag's behavior is going to be very straightforward to understand to anyone using it (unlike some more flexible alternatives, that fortunately seem unnecessary).
So, I'm very interested in finding the best name possible for this behavior, and to figure out the best ways to document or explain it, but I'd settle for this functionality under any name rather than take it out because of naming issues (unless and until we come up with some clearly superior solution, which I wouldn't rush; whatever it will be, it'll need quite a bit of thought). After all, even if it is confusing to some people, they can always choose to not use it.
I am regularly annoyed by the lack of the flag in rebase and squash. I think that --restore-descendants is kind of an awful name, but I kinda like the proposed rename in your PR even less, and maybe the proliferation of potential variants is a sign that we do want some more general mechanism. But while we have it on restore we should have it on the other commands, I think. Renaming or restructuring can wait.
Does --no-rebase fit? or --no-adjust
Well, a rebase does happen. --no-adjust-child-snapshots might work; --no-adjust wouldn't work because some things do get adjusted (e.g. if you think of the commit as a patch, the patch does change, unlike a regular rebase).
The older possibilities that seem most relevant to this line of thinking are:
--verbatim-rebase-children, a rebase that copies the snapshot verbatim)--reparent-children. I don't think "reparent" inherently means the right thing, but it also has connotations of "like rebase, but not quite". (In another world, "rebasing" could be called "reparenting", then I guess we'd use "rebase" here). It's a word we could define, and at least some people seem to call this operation reparenting. Maybe Mercurial used this terminology?--no-adjust-child-snapshots(copied from above, for completeness)
Mentioning children (or descendants) seems important for two reasons:
- the commits being acted upon by, say,
jj squashdefinitely get changed in complicated ways jj rebaseandjj duplicate -dwould probably need a flag for preserving the snapshot of the commit(s) being moved, which could be called--no-adjust-snapshot(or something like--preserve-self-snapshotm--restore-self,--no-adjust-self-snapshot?). It would also need--no-adjust-child-snapshotwhen used with--beforeor--after.
I don't think renaming will help because, in my mental model, jj squash --restore-descendants does two separate things. I agree it's useful, though.
BTW, I wonder if --restore-descendants should be a global flag. As I said before, any commands that do snapshot can rebase descendants. It wouldn't make much sense that jj squash can do fake/verbatim rebase of descendants, whereas normal editing can't.
I often do jj new --before and then automatic rebasing screws up what I'm trying to accomplish, so I find the idea intriguing, but it seems like kind of a rabbit hole to start having multiple modes for snapshotting. How would it work with e.g. the Watchman snapshot trigger?
I often do
jj new --beforeand then automatic rebasing screws up what I'm trying to accomplish, so I find the idea intriguing, but it seems like kind of a rabbit hole to start having multiple modes for snapshotting. How would it work with e.g. the Watchman snapshot trigger?
If the user enabled watchman snapshot (or had background snapshot process), there would be nothing they can control. To me, both "snapshot" and "rebase after abandon/squash" are implicit operations, so it's equally unintuitive that we can control the precise behavior of these operations.
Re Yuya's time-travelling idea from https://github.com/jj-vcs/jj/issues/6000#issuecomment-2746754888, I think one approach to make this manageable might be to allow for atomic sequences of operations. I'm not sure what exactly it would look like, but something like:
jj begin transaction abc
jj squash --from X --into Y --label-step fred
jj restore --from X_before_fred --into X
jj end transaction abc
This would probably necessitate some sort of a scripting language, and I'm not sure what the end-user UI would be. OTOH, this would be a powerful new ability for jj to have with advantages beyond --restore-descendants.
I don't think renaming will help because, in my mental model,
jj squash --restore-descendantsdoes two separate things.
I think this is a good point, though I'm not sure about the conclusion (this will be a theme in this comment; I found your comments very helpful, but they led me to different places).
In my mind, jj squash --restore-descendants can be understood in a few different ways. A lot of confusion happens when the name does not match the person's favorite point of view. They are:
- Do two things: the squash, and then
jj restoreits children to older versions from before the squash. This would correspond to naming the option--restore-children, I'm starting to regret not using that name.- 1.5: do the squash and then
jj restoreall descendants to older versions from before the squash. This corresponds to--restore-descendants
- 1.5: do the squash and then
- Do the squash, but instead of the normal rebase rules that preserve the patches of indirectly affected revisions, preserve their snapshots.
--preserve-descendant-contentor...-snapshots - Do the squash, and also rewrite the affected commits' children to preserve their snapshots from before. At the end of this operation, all the other commits get rebased normally, and it just so happens that the normal rules cause their snapshots to be preserved (since the other commit's parents' snapshots are preserved by induction, and their patches are preserved by the usual rules). This would be
--preserve-children-content, it is also closest to the implementation (I would guess, would need to double-check)
Ideally, people would jump between these with ease, but writing it out shows that it's not trivial.
BTW, I wonder if
--restore-descendantsshould be a global flag.
Thanks for this point! Apart from your intended meaning, this helps clarify to me why --restore-descendants is simplest to understand if it treats all children equally. I put it in the long Note in #6141.
For example, I can imagine this being useful for jj file track or something. (Though probably not useful enough to be worth implementing).
It would be unhelpful for some commands, e.g. I think we wouldn't recommend jj absorb --restore-descendants.
As I said before, any commands that do snapshot can rebase descendants.
In theory, this is a great point. In practice, I share Emily's concerns.
In fact, this is one reason for squash --restore-descendants ; I think the practical alternative to the inability of configuring snapshotting with --restore-descendants is to do jj new, edit the commit, and then jj squash --restore-descendants. In other words, it's unfortunate that it seems impractical to enable what people call the "edit" workflow here, but fortunately the "new" workflow can substitute for it.
To me, both "snapshot" and "rebase after abandon/squash" are implicit operations, so it's equally unintuitive that we can control the precise behavior of these operations.
Makes sense, but I'll note that if you take the point of view number 3 above, the operation is completely explicit.
I'll note that if you take the point of view number 3 above, the operation is completely explicit.
Correct. This is still unintuitive to me because it doesn't make sense (in patch-based thinking) that there is a command for "squash, and also rewrite the affected commits' children to preserve their snapshots from before."
(I know it's cheaper implementation-wise, and is useful. I'm just explaining why it's difficult to understand.)
jj begin transaction abc jj squash --from X --into Y --label-step fred jj restore --from X_before_fred --into X jj end transaction abc
For scripting use case, I think this can be achieved by printing operation ID and --no-commit-transaction. For CLI, I don't have any good idea about UX, either.
This is still unintuitive to me because it doesn't make sense (in patch-based thinking) that there is a command for "squash, and also rewrite the affected commits' children to preserve their snapshots from before."
I just want to add that using a command name in the option --restore-descendants makes it really confusing, especially on the restore command.
This is still unintuitive to me because it doesn't make sense (in patch-based thinking) that there is a command for "squash, and also rewrite the affected commits' children to preserve their snapshots from before."
I just want to add that using a command name in the option
--restore-descendantsmakes it really confusing, especially on therestorecommand.
All the alternatives I could come up with or remember are in https://github.com/jj-vcs/jj/pull/6124#issue-2941592719. People don't seem to agree on any of them as clearly better, at least as of today.
At this point, I've decided that I'm not going to try renaming this option at least until the next release in a week, but we can keep discussing it.
I didn't follow most of this discussion of options, probably because of use of the word
restore. I can't seem to get a mental model of what is happening, and the long explanations using different verbs seems to indicate that the name doesn't match what you are describing. I think of restore and preserve as conflicting, during a squash. Perhaps--keep-target-contentor--keep-diffor--no-rebaseor something else that I can't tell you are trying to convey.
Coming here from https://github.com/jj-vcs/jj/discussions/6224#discussioncomment-12713657 , this comment expresses my confusion at reading this thread.
I think --restore-descendants is a confusing name because the only way I can make sense of the word "descendant" in a commit graph is "commits that come after".
But as far as I can tell, those descendant commits would be abandoned by the operation being discussed, rather than being "restored" in any sense?
jj squash --restore-descendantswhich preserves the descendant contents, instead of their diff.
To me, that sounds like we want to restore ancestors while also keeping some changes in a working copy. Which, again, sounds like the opposite of what the name suggests.
I think specifically referring to "contents" or "snapshots" would be a lot less confusing than talking about "descendants".
jj squash --restore-descendantswhich preserves the descendant contents, instead of their diff.To me, that sounds like we want to restore ancestors while also keeping some changes in a working copy. Which, again, sounds like the opposite of what the name suggests.
I think specifically referring to "contents" or "snapshots" would be a lot less confusing than talking about "descendants".
I don't disagree that "restore descendants" is a confusing name, but 1) it's already a jj diffedit and jj restore flag with the same behavior and 2) I think we might not be on the same page about the behavior because it definitely affects descendants, not ancestors.
C
|
B
|
| @
| /
A
|
X
If we run jj squash, @ gets abandoned, A gets edited, and then B and C get rebased on the new A, changing their snapshot.
--restore-descendants would preserve B's and C's snapshots. B and C are descendants of A.
Maybe --preserve-descendant-snapshots would be a better name, but I think adding the flag to jj squash and renaming it across squash/diffedit/restore/abandon should be orthogonal.
I haven't really been following this, but will inflict my thoughts upon you anyway.
I never understood what --restore-descendants did, despite wanting to do exactly what it does multiple times. At least for squash, I think one issue is that you're doing a squash operation and then restoring the descendants to what they were before that operation. Which is weird -- usually when you do jj action --flags, the flags are in terms of that operation, not in terms of the state of things after that operation.
From that perspective, --preserve-descendants is better. Though I'm still going to whine about it, because (1) I'm not sure what's being preserved. Their [snapshot] content? (yes) Their "graph identity"? (no, not with any operation that modifies any graph edges in them or their ancestors, or arguably anywhere in the graph) The textual changes they make? (maybe, unless there are conflicts). Also (2) it's still talking about descendants, when the action is about nodes other than those descendants. It's sort of "transmogrify X and Y with flag --and-do-this-other-thing-to-some-other-nodes".
It probably wouldn't work for all operations, but --in-place best fits my understanding. "Squash these changes into rev X in place, without disrupting anything else." You could argue that it has the same problems as --preserve-descendants, but it best describes my usual intention when I want this behavior.
And at least in my (possibly flawed) understanding, --in-place applies equally well to squash, diffedit, and restore. (As for duplicate, I still don't understand what it's for, so I can't speak to it.)
If that doesn't work, something beginning with --no- would also make sense to me. As in, the normal thing to do is rebasing descendants. Many CLIs have a convention of using --no-whatever to counteract a default. So something like --no-rebase-descendants or just --no-rebase. Admittedly, jj's equivalents are --ignore-working-copy and ignore-immutable, so consistency would be a global option --ignore-descendants.
I will note that --ignore-working-copy also confused me for a long time, and I probably would've understood it better if it were a short form of global --no-snapshot --no-update. In which case a global --no-rebase or --no-rebase-descendants could make sense.
Sorry, I just spewed out a bunch of alternatives to --restore-descendants:
--preserve-descendants--preserve-children(using ilyagr's inductive logic)--in-place--no-rebase/--no-rebase-descendants--no-rebase-children(using ilyagr's inductive logic)- (global)
--ignore-descendants - (global)
--ignore-children(using ilyagr's inductive logic) - (global)
--no-rebase/--no-rebase-descendants - (global)
--no-rebase-children(using ilyagr's inductive logic)
Implicit in some of these options is that jj users are playing a MtG-like game where the turn phases are something like: snapshot, <action>, rebase, log, update. And this issue pertains to an option for suppressing one of those phases, in a context where many of the other phases are already suppressible.
And apologies: I skipped part of the comment thread, and before posting I thought I ought to read it, and I see most of this has been brought up already. In particular, the above describes how I shift between the different points of view depending on the naming. (And my global flag idea is not new either.)
I was rereading the FAQ about this, and this time I understood it better. The option name that came to mind was --isolate, but it starts with i, so maybe --only?
Rereading the last comment, it is similar to the proposed --in-place, along with good old --no-rewrite.
One question I have is: when this flag is used, does the algorithm actually rewrite and restore descendants, or just skip the rewrite? (hey, maybe --skip-rewrite?)
I'm not a fan of the flags --rewrite, --in-place and --isolate. I feel like if I read the name of them I wouldn't be able to work out what it does.
I personally think that --no-rebase is the clearest for squash, but that would mean that the equivalent for rebase is jj rebase --no-rebase (I love how silly it is, but in the end, it is very confusing and thus probably a bad idea).
So maybe I'm in favor of --preserve-children or --preserve-descendants (I do agree with the inductive logic).