jj icon indicating copy to clipboard operation
jj copied to clipboard

Include nonconflicted files in root tree of file-conflict Git commits

Open amiryal opened this issue 1 year ago • 12 comments

With the aim of better compatibility with tools that look for the state of working-copy files in the colocated Git’s HEAD, I propose to include all the would-be working copy files in the commit, in addition to the .jj-conflict-* trees that are already there for preventing GC.

This came up in a discussion about Nix flakes, and the way it breaks when HEAD@git has file conflicts. It was pointed out that:

Nix flakes actually use file contents from the working copy, but the index to determine which files to include.

Originally posted by @emilazy in https://github.com/martinvonz/jj/discussions/3970#discussioncomment-9878676

In other parts of the discussion, there has been reluctance towards updating Git’s index with the would-be working tree of HEAD@git (that is the parent of the jj working copy). When I suggested to instead change the way file-conflict commits are prepared in Git, it was pointed out that the barrier to do that is not high:

It used to be more friendly to colocated repos until commit 006c764694a2. I don't think there's much harm in changing it to better match the files in the working copy, as long as we also include the .jj-conflict-* subdirectories (they're needed in order to prevent GC).

Originally posted by @martinvonz in https://github.com/martinvonz/jj/discussions/3970#discussioncomment-9887040

amiryal avatar Jun 27 '24 06:06 amiryal

If this is the issue of Git index, maybe we can add materialized file blobs to the index?

git::reset_head() currently resets Git index to the contents of the raw Git commit of the first parent. This wouldn't make sense if the working-copy commit was a merge or the parent had conflicts.

Of course, git commit shouldn't be used in that situation, which is probably the same for .jj-conflict-* trees.

yuja avatar Jun 27 '24 11:06 yuja

Good point! I think you're right that the issue is only with the Git index, but maybe @amiryal can confirm.

I know we have a similar problem when the user uses git switch to update to a conflicted commit, but hopefully that doesn't confuse tooling in the same way.

martinvonz avatar Jun 27 '24 11:06 martinvonz

Are the Git special cases scattered in the code, or in one place?

joyously avatar Jun 27 '24 17:06 joyously

If this is the issue of Git index, maybe we can add materialized file blobs to the index?

I was thinking about this, and I'm conflicted about the idea.

On one hand, it would make some things a lot nicer, and I think it would solve the Nix issue. VS code might show a nicer diff for conflicted commits (I think, haven't tested it). But I'm worried about somebody hitting "Commit" in VS Code or another tool and really messing up their repo.

Currently, you need to add obviously messed up changes (removing all the .jj-conflict dirs) before you can do that. The mess being obvious and git diff looking scary makes the situation less dangerous.

I'm not sure just how bad this is: in theory, recovery is just a jj abandon away, but the user has to notice and understand the problem first.

Another idea: add the working copy to the staffing area, but don't remove the .jj-conflict dirs. This keeps the diff scary, which I think is good.

Also, any such change makes it harder to understand how jj affects the staging area now that the answer is not "never".

So, I wouldn't make this the default at least. I feel like there will be kinks.

ilyagr avatar Jun 27 '24 18:06 ilyagr

but don't remove the .jj-conflict dirs.

Hmm, it might be also good to leave a README (with scalier name), or some fake conflict marker if git commit has a sanity check for that.

Also, any such change makes it harder to understand how jj affects the staging area now that the answer is not "never".

Just to be clear, jj new resets Git index, and my idea is to make it be consistent with the working directory contents materialized by jj new.

yuja avatar Jun 28 '24 01:06 yuja

Just to be clear, jj new resets Git index, and my idea is to make it be consistent with the working directory contents materialized by jj new.

I agree with that, FWIW. I suspect it was just an oversight that we don't already do that.

martinvonz avatar Jun 28 '24 01:06 martinvonz

or some fake conflict marker if git commit has a sanity check for that.

This is a great idea! I think any placeholder conflict marker would solve my worries. We could make the "README" itself a conflicted file (I don't know if this is a good idea, but it would look clever 😸 ).

In principle, the staging area is the one place Git understands conflicts, so with some work we might be able to actually represent all the conflicts correctly and let people resolve them with Git tools.

ilyagr avatar Jun 28 '24 03:06 ilyagr

the issue is only with the Git index, but maybe @amiryal can confirm.

Yes, when evaluating in a Git repository, Nix only considers files that are tracked by Git, and being in the index is considered as being tracked [ref].

amiryal avatar Jun 29 '24 09:06 amiryal

In principle, the staging area is the one place Git understands conflicts, so with some work we might be able to actually represent all the conflicts correctly and let people resolve them with Git tools.

Genius! I will work this into the issue description when I get to it, or open a new one with just this proposal.

amiryal avatar Jun 29 '24 09:06 amiryal

In principle, the staging area is the one place Git understands conflicts, so with some work we might be able to actually represent all the conflicts correctly and let people resolve them with Git tools.

Genius! I will work this into the issue description when I get to it, or open a new one with just this proposal.

I thought about this a bit more, and I think this should definitely be step 2 of the plan. It would require at least one additional change to jj's Git integration, and there will be some decisions to make.

In Git, normally, the way merge conflicts are resolved is by modifying the staging area AFAIU. Currently, jj ignores any changes to the index. If we wanted jj to see the result of Git merge tools, this would have to change: jj would have to look out for changes to the index.

Now, the index is a diff on top of the parent commit AKA the Git HEAD, not the working copy commit. There's no obvious place in jj's model for jj to record any changes to the index that it detects. Two possibilities come to mind: jj could either turn changes to the index into a new commit in between the working copy and the parent, or it could squash the changes into the parent commit. Both behaviors would lead to some slight weirdness, I think.


Meanwhile, I still see no problem with the simpler "step 1" idea: adjust the index to jj's representation of the parent commit, and create a placeholder conflict to prevent the user from accidentally committing. The conflicted file could be named .jj-do-not-resolve-this-conflict or something. In this "step 1" scenario, jj ignores any changes to the index, just as it does now.

ilyagr avatar Jun 30 '24 21:06 ilyagr

Just a quick not that I don't think Git's index supports more than 4 sides, but I may easily misremember that

martinvonz avatar Jun 30 '24 21:06 martinvonz

I think it only supports what we call 2-sided conflicts with 1 base ("ours", "theirs", and "ancestor" in Git parlance):

https://docs.rs/git2/latest/git2/struct.IndexConflict.html

ilyagr avatar Jun 30 '24 21:06 ilyagr

Meanwhile, I still see no problem with the simpler "step 1" idea: adjust the index to jj's representation of the parent commit, and create a placeholder conflict to prevent the user from accidentally committing. The conflicted file could be named .jj-do-not-resolve-this-conflict or something. In this "step 1" scenario, jj ignores any changes to the index, just as it does now.

OK, I updated the OP with this proposal.

amiryal avatar Jul 07 '24 13:07 amiryal

I just ran into this after using jj to create a merge with conflicts and then trying to resolve the conflicts afterwards using jj new and jj describe. Is there a workflow I can use that avoids this?

kjeremy avatar Feb 04 '25 17:02 kjeremy

Some of the issues discussed here should be fixed in the upcoming release (I think tomorrow?). Here's a summary for colocated repos:

  • 2-sided conflicts coming from the parent(s) of the working copy commit will be added as conflicts in the Git index.

  • jj new on a conflicted commit will still produce an unreadable output with git status. The .jjconflict-* directory changes are staged now though (instead of unstaged), so git diff should produce more reasonable results.

  • When editing a merge commit in the working copy, the Git index should generally look the same as if you ran git merge (the changes from the parent commits will all be staged, and conflicts will be added to the index when possible).

  • For Nix flakes, jj new on a conflicted commit should work properly now (the contents of the parent commit will be staged in the Git index, so Nix can see the files now).

scott2000 avatar Feb 04 '25 20:02 scott2000

I've been thinking a bit more about conflict representation in the Git backend. Would it be possible to store the references to the conflict trees in a separate commit instead of storing them in the tree of the conflicted commit itself?

For instance, if the current representation of commit A is this:

commit A:
  tree <conflict-tree>
  parent B
  author <author>
  committer <committer>
  jj:trees <left> <base> <right>

  <description>

tree <conflict-tree>:
  040000 tree <base>   .jjconflict-base-0
  040000 tree <left>   .jjconflict-side-0
  040000 tree <right>  .jjconflict-side-1
  100644 blob <readme> README

We could instead represent it like this:

commit A:
  tree <left>
  parent B
  parent <conflict-tree-commit>
  author <author>
  committer <committer>
  jj:trees-v2 <left> <base> <right>

  <description>

commit <conflict-tree-commit>:
  tree <conflict-tree>
  author <author>
  committer <committer>

  jj: conflict metadata commit

  This commit contains metadata for a conflict created in `jj` (https://github.com/jj-vcs/jj).

tree <conflict-tree>:
  040000 tree <base>   .jjconflict-base-0
  040000 tree <left>   .jjconflict-side-0
  040000 tree <right>  .jjconflict-side-1

This would mean that the conflicted commit would look mostly like a normal commit in Git, so git status wouldn't show a long list of deleted .jjconflict-* files, and editors wouldn't show all of the files in the repo as being new. We could also still have a README file added under .jjconflict/README so that users would notice that it's a conflicted commit. If we choose to allow pushing conflicts to Git remotes in the future, this would also make it less confusing when viewing such a commit in GitHub for instance.

The main issues I can think of with this approach are that shallow checkouts might make it no longer be possible to read a conflicted commit (if the additional parent commit is beyond the depth, so the trees haven't been fetched yet), and that the additional parent commit would have to be signed whenever the conflicted commit is signed. It is also a more complex implementation, and it would require a new header (e.g. jj:trees-v2).

Edit: I made a proof of concept here: #5637.

scott2000 avatar Feb 09 '25 17:02 scott2000

That makes sense to me.

martinvonz avatar Feb 09 '25 19:02 martinvonz