gitoxide
octopus-merge (part 5: tree-merge-ORT three-way)
Three-way merging of trees.
Follow-up of #1612.
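As a rough illustration of what "three-way" means at the tree level, here is a minimal sketch. It is not the implementation in this PR. `Tree`, `Outcome`, `merge_entry` and `merge_trees` are hypothetical names, trees are flattened to a path-to-blob-id map, and renames, modes and content merges are ignored entirely. A missing merge-base is treated as an empty tree, as the research notes say merge-ORT does.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical flattened tree: path -> blob id (a stand-in for real tree objects).
type Tree = BTreeMap<String, String>;

#[derive(Debug, PartialEq)]
enum Outcome {
    /// The path resolves to this blob id, or is absent from the result if `None`.
    Resolved(Option<String>),
    /// Both sides changed the path in different ways.
    Conflict,
}

/// The classic three-way rule applied to a single path.
fn merge_entry(base: Option<&String>, ours: Option<&String>, theirs: Option<&String>) -> Outcome {
    if ours == theirs {
        // Both sides agree (same change, or no change at all).
        Outcome::Resolved(ours.cloned())
    } else if base == ours {
        // Only their side changed the path.
        Outcome::Resolved(theirs.cloned())
    } else if base == theirs {
        // Only our side changed the path.
        Outcome::Resolved(ours.cloned())
    } else {
        Outcome::Conflict
    }
}

/// Merge two trees against an optional base; `None` stands for "no merge-base",
/// in which case an empty tree is used as the base.
/// Returns the merged tree and the list of conflicting paths.
fn merge_trees(base: Option<&Tree>, ours: &Tree, theirs: &Tree) -> (Tree, Vec<String>) {
    let empty = Tree::new();
    let base = base.unwrap_or(&empty);
    // Visit every path that exists in any of the three trees, in sorted order.
    let paths: BTreeSet<&String> = base.keys().chain(ours.keys()).chain(theirs.keys()).collect();

    let mut merged = Tree::new();
    let mut conflicts = Vec::new();
    for path in paths {
        match merge_entry(base.get(path), ours.get(path), theirs.get(path)) {
            Outcome::Resolved(Some(id)) => {
                merged.insert(path.clone(), id);
            }
            Outcome::Resolved(None) => {} // deleted, or never present in the result
            Outcome::Conflict => conflicts.push(path.clone()),
        }
    }
    (merged, conflicts)
}
```

With an empty base, any path that differs between the two sides is reported as a conflict, because neither side can be identified as the unchanged one.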
Tasks
- [x] baseline tests for tree-merging
- [x] successful first tree-merge without conflicts
- [ ] make all tests pass (catch-all task that will be refined on the fly)
- [x] fully parse all information provided by Git in baseline
- [x] update the conflict-lookup table to reflect changes to the tree as well - avoid double conflicts or missed conflicts if something clashes with a newly added rename, for instance.
- [x] conflict-tree by index to allow getting 'next' of ours for lookahead
- [x] `tree::Editor::get()` to find unique path names
- ~~multi-tree traversal (without wildcard support!)~~ - for now, let's just do it 'the simple' way and perform two diffs
- optimization idea for when there are numbers: a 'brute force' implementation that uses threads would benefit from the ability to re-use object caches of a repo that has seen the base-tree already, but overall, who knows
- there are tests for merging merge-bases in `diff3-conflict-markers.sh`
- be sure to capture the 'empty tree' label, but also other special cases
- See also, all the merge options
- [x] fix reversed tests for blobs
- [x] fuzzing for blob merges
- [ ] assure that each case has a motivating test!
- [ ] remove TODOs
- [ ] `Repository::merge_trees`
- [ ] A trivial `gix merge tree` implementation, based on commits. Maybe create something to easily merge multiple commits while at it (in `gix`).
Next PR / Outscoped
- Submodule merges are also possible! Maybe outscope it though! `libgit2` also doesn't try it.
- `textconv` with context, see this gist for details.
  - There seem to be different 'tiers' of tools, some don't get `GIT_DIR` set, others do.
  - It also seems that diff-programs get too much context right now, but that depends on how much is passed to them by the caller as `gix-command::Context`.
- How to model virtual-merge-bases? Can be none or many, user should have control over how this is done.
- Actual tree-based merging
- ~~Make blob-merge based conflict handling possible in the tree merge from `gix` at least.~~ - not needed for now
Research
Everything is about MergeORT.
- merge-options passed with `-X ours` for instance don't affect tree-related auto-resolutions, just the ones related to content. This could be implemented when there is demand though.
- it uses an empty tree if there is no merge-base - we must allow the same.
- it allows for multiple merge-bases, creating a virtual one by merging all merge-bases together using the same algorithm, recursively.
- merges can have conflicts without any individual files being involved, for instance when directory renames clash
- Note that `merge-ORT` cannot properly handle renames within renamed directories, ending up with the source of the subdir-rename still present.
❯ git ls-tree -r $(git merge-tree main feat)
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 a
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 git-sec-renamed/2
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 git-sec-renamed/7
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 git-sec-renamed/subdir/6
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 git-sec/subdir-renamed/6
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 git-sequencer
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 gix/5
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 h
- Must make sure that possible types of conflicts are properly communicated, so that no information is lost
- It puts conflict-markers in the blobs of the result tree, with annotations to provide additional context
- Need resolution configuration, see `git2::MergeOptions`.
- data is stored by path, and paths are interned in the map to allow pointer-based comparisons
- merge-info with everything one needs to know, also related to renames
- or conflict information
- it uses a memory-pool/arena to get memory for many paths all at once (and also release it like that)
- paths start out as conflicted, and can later be changed to non-conflicting if a content-based merge succeeded.
- If conflicts remain, the meta-data is used to produce an 'as merged as possible' version with conflict markers that can be checked out to the working tree.
- hunks can partially overlap, but can also be resolved line by line to some extent.
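Two of the notes above, interned paths that can be compared by pointer and paths that start out as conflicted, can be sketched as follows. This is only an illustration under assumptions: `PathInterner`, `MergeInfo` and `State` are hypothetical names, and `Rc<str>` stands in for the memory-pool/arena that merge-ORT is described as using.

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// Hypothetical interner: every distinct path is allocated exactly once.
#[derive(Default)]
struct PathInterner {
    paths: HashSet<Rc<str>>,
}

impl PathInterner {
    /// Return the shared allocation for `path`, creating it on first use.
    fn intern(&mut self, path: &str) -> Rc<str> {
        if let Some(existing) = self.paths.get(path) {
            return Rc::clone(existing);
        }
        let interned: Rc<str> = Rc::from(path);
        self.paths.insert(Rc::clone(&interned));
        interned
    }
}

#[derive(Debug, PartialEq)]
enum State {
    /// Initial state of every path that needs merging.
    Conflicted,
    /// Set once a content-based merge succeeded.
    Clean,
}

/// Hypothetical per-path record, keyed by the interned path.
struct MergeInfo {
    path: Rc<str>,
    state: State,
}

impl MergeInfo {
    /// New records are pessimistic: they start out as conflicted.
    fn new(interner: &mut PathInterner, path: &str) -> Self {
        MergeInfo {
            path: interner.intern(path),
            state: State::Conflicted,
        }
    }

    /// Downgrade to non-conflicting after a successful content merge.
    fn mark_clean(&mut self) {
        self.state = State::Clean;
    }

    /// Pointer comparison instead of a byte-wise string comparison.
    /// Only meaningful if both paths come from the same interner.
    fn same_path(&self, other: &Rc<str>) -> bool {
        Rc::ptr_eq(&self.path, other)
    }
}
```

The pointer comparison is only valid because equal paths are guaranteed to share one allocation. Two paths from different interners could be equal as strings and still compare as different here.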
As detailed in #1623, which provides a fix, the failure observed here in the CI test job is actually not due to any of the changes in this PR, and also occurs if CI is re-run on the tip of main. It is instead due to the upgraded runner image not having the headers needed for building with -llzma, which is needed for the xz feature of gix-testtools. This is also entirely unrelated to #1622, which does ~~it~~ not yet affect CI.
Merging #1623 and then rebasing this onto main should fix the test failure here. The other failure here is in the lint job and unrelated.