
History of renamed files is not followed by split

Open gustaff-weldon opened this issue 6 years ago • 5 comments

We are restructuring our code into a multi-package repository, so we are moving a lot of files around, e.g. from

/app/templates/foo.hbs -> /packages/foo-package/app/templates/foo.hbs

We want to use downstream read-only repos for the subpackages. The problem we have encountered is that splitsh-lite does not follow renames, so the history in our downstream repos is limited to commits made after the rename.

Reproduction script:

#!/bin/sh

git init split-repo
cd split-repo
mkdir foo

echo "\n:: Creating history"
for i in 1 2 3
do
    echo "$i" > foo/counter.txt
    git add foo
    git commit -m "Updated counter to $i"
done

# split before moving
echo "\n:: Splitting before rename"
splitsh-lite  --prefix foo/ --target refs/heads/before-rename --progress --debug

# move to subfolder
echo "\n:: Moving foo to packages/foo"
mkdir -p packages/foo
git mv foo/counter.txt packages/foo/counter.txt
git commit -m "Moved foo to packages folder"

echo "4" > packages/foo/counter.txt
git add packages
git commit -m "Updated counter to 4"

# split after moving
echo "\n:: Splitting after rename"
splitsh-lite  --prefix packages/foo/ --target refs/heads/after-rename --progress --debug

Script output:

:: Creating history
[master (root-commit) 33bc3fe] Updated counter to 1
 1 file changed, 1 insertion(+)
 create mode 100644 foo/counter.txt
[master cc7a422] Updated counter to 2
 1 file changed, 1 insertion(+), 1 deletion(-)
[master 61d2845] Updated counter to 3
 1 file changed, 1 insertion(+), 1 deletion(-)

:: Splitting before rename
2017/09/08 12:09:18 Splitting refs/heads/master
2017/09/08 12:09:18   From "foo/" to "ROOT"
2017/09/08 12:09:18 Processing commit: 33bc3fee1b11369771b07a3cdfcb784936ee6466
2017/09/08 12:09:18   parents:
2017/09/08 12:09:18   newparents:
2017/09/08 12:09:18   tree is: 073c7953b8056de57b9849a18c5d5e0ddbff0315
2017/09/08 12:09:18   copy commit "33bc3fee1b11369771b07a3cdfcb784936ee6466" "073c7953b8056de57b9849a18c5d5e0ddbff0315" ""
2017/09/08 12:09:18   newrev is: 5772eae3dc9330aaeba9e4b698598eeee9aa210e
2017/09/08 12:09:18 Processing commit: cc7a422d6dc42902037613d7f41a00ff6da136a0
2017/09/08 12:09:18   parents: 33bc3fee1b11369771b07a3cdfcb784936ee6466
2017/09/08 12:09:18   newparents: 5772eae3dc9330aaeba9e4b698598eeee9aa210e
2017/09/08 12:09:18   tree is: ba4285a8b7086fc3c91d2b9208047ed37d94cf05
2017/09/08 12:09:18   copy commit "cc7a422d6dc42902037613d7f41a00ff6da136a0" "ba4285a8b7086fc3c91d2b9208047ed37d94cf05" "5772eae3dc9330aaeba9e4b698598eeee9aa210e"
2017/09/08 12:09:18   newrev is: f8fcd6f8d8f04c6ced25ddc67870f3752dc83ffe
2017/09/08 12:09:18 Processing commit: 61d28451244d86fc2af159b05584d6548fe62f0c
2017/09/08 12:09:18   parents: cc7a422d6dc42902037613d7f41a00ff6da136a0
2017/09/08 12:09:18   newparents: f8fcd6f8d8f04c6ced25ddc67870f3752dc83ffe
2017/09/08 12:09:18   tree is: 9e7921bfc371b1f1d2f8031c7cac0276a93329f2
2017/09/08 12:09:18   copy commit "61d28451244d86fc2af159b05584d6548fe62f0c" "9e7921bfc371b1f1d2f8031c7cac0276a93329f2" "f8fcd6f8d8f04c6ced25ddc67870f3752dc83ffe"
2017/09/08 12:09:18   newrev is: 6037894597c3ee5df0583afcfa634c51c2becb65
3 commits created, 3 commits traversed, in 6ms
6037894597c3ee5df0583afcfa634c51c2becb65

:: Moving foo to packages/foo
[master 61bc4ed] Moved foo to packages folder
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename {foo => packages/foo}/counter.txt (100%)
[master f742fae] Updated counter to 4
 1 file changed, 1 insertion(+), 1 deletion(-)

:: Splitting after rename
2017/09/08 12:09:18 Splitting refs/heads/master
2017/09/08 12:09:18   From "packages/foo/" to "ROOT"
2017/09/08 12:09:18 Processing commit: 33bc3fee1b11369771b07a3cdfcb784936ee6466
2017/09/08 12:09:18   parents:
2017/09/08 12:09:18   newparents:
2017/09/08 12:09:18 Processing commit: cc7a422d6dc42902037613d7f41a00ff6da136a0
2017/09/08 12:09:18   parents: 33bc3fee1b11369771b07a3cdfcb784936ee6466
2017/09/08 12:09:18   newparents:
2017/09/08 12:09:18 Processing commit: 61d28451244d86fc2af159b05584d6548fe62f0c
2017/09/08 12:09:18   parents: cc7a422d6dc42902037613d7f41a00ff6da136a0
2017/09/08 12:09:18   newparents:
2017/09/08 12:09:18 Processing commit: 61bc4ed159b49269d5ccbf66b5303f45ff5779c4
2017/09/08 12:09:18   parents: 61d28451244d86fc2af159b05584d6548fe62f0c
2017/09/08 12:09:18   newparents:
2017/09/08 12:09:18   tree is: 9e7921bfc371b1f1d2f8031c7cac0276a93329f2
2017/09/08 12:09:18   copy commit "61bc4ed159b49269d5ccbf66b5303f45ff5779c4" "9e7921bfc371b1f1d2f8031c7cac0276a93329f2" ""
2017/09/08 12:09:18   newrev is: 178368081122bcc9ef05febf3223431e25562901
2017/09/08 12:09:18 Processing commit: f742faef4b4a536768685b500d80a1426449058d
2017/09/08 12:09:18   parents: 61bc4ed159b49269d5ccbf66b5303f45ff5779c4
2017/09/08 12:09:18   newparents: 178368081122bcc9ef05febf3223431e25562901
2017/09/08 12:09:18   tree is: 2cf1bff97ea95b5774d3040e6386ee8cf51dcbfb
2017/09/08 12:09:18   copy commit "f742faef4b4a536768685b500d80a1426449058d" "2cf1bff97ea95b5774d3040e6386ee8cf51dcbfb" "178368081122bcc9ef05febf3223431e25562901"
2017/09/08 12:09:18   newrev is: b32d1051c7deadbb8fdb222c0c4bb9ee17ed7f46
2 commits created, 5 commits traversed, in 3ms
b32d1051c7deadbb8fdb222c0c4bb9ee17ed7f46

Inspecting the history of the created branches shows that history is not followed across the rename:

> git checkout before-rename
> git log --oneline

6037894 (HEAD -> before-rename) Updated counter to 3
f8fcd6f Updated counter to 2
5772eae Updated counter to 1

> git checkout after-rename
> git log --oneline

b32d105 (HEAD -> after-rename) Updated counter to 4
1783680 Moved foo to packages folder

@fabpot Is there a way to preserve history (i.e. follow renames)?
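For comparison, plain Git can follow the rename of a single file in the source repository when asked to. The sketch below rebuilds the repository from the reproduction script in a scratch directory (the `demo` identity and the variable names are placeholders, not part of the original report) and counts the commits reported with and without `--follow`. It says nothing about what splitsh-lite does internally; it only shows that the pre-rename history is still reachable in the monorepo.

```shell
#!/bin/sh
set -e

# Rebuild the repository from the reproduction script in a scratch directory.
cd "$(mktemp -d)"
git init -q .
git config user.name demo               # placeholder identity (assumption)
git config user.email demo@example.com
mkdir foo
for i in 1 2 3; do
    echo "$i" > foo/counter.txt
    git add foo
    git commit -q -m "Updated counter to $i"
done
mkdir -p packages/foo
git mv foo/counter.txt packages/foo/counter.txt
git commit -q -m "Moved foo to packages folder"
echo 4 > packages/foo/counter.txt
git add packages
git commit -q -m "Updated counter to 4"

# Path-limited log: only commits that touch the new path are listed.
plain=$(git log --format=%s -- packages/foo/counter.txt | wc -l)

# --follow (single file only): Git runs rename detection and keeps going
# through the old path foo/counter.txt.
followed=$(git log --follow --format=%s -- packages/foo/counter.txt | wc -l)

echo "without --follow: $plain commits, with --follow: $followed commits"
```

Note that `--follow` works for one file at a time, so it is not a drop-in answer for a whole prefix.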

gustaff-weldon avatar Sep 08 '17 10:09 gustaff-weldon

Today I ran into the same issue for the first time, adopting what used to be a monorepo.

Are there any recent thoughts on this matter?

lkraav avatar Sep 07 '18 15:09 lkraav

Well, the splitter keeps only the packages/foo folder in the subtree split, so it does exactly what it is asked to do.

Keeping files from outside the root of the subtree in order to show a full history would be a problem for multiple reasons:

  • Where should these files be placed in the tree? They are outside the root.
  • If we include external files in older commits because they were later moved inside the subtree (and we want their history), then any commit that moves a file into the subtree would require rewriting the full history of the subtree split, as older commits would now need to include more files. That is a big no-go for any continuous split of a monorepo into sub-repos.
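To illustrate the first point with plain Git (a sketch, not splitsh-lite's actual code): for each commit, the only input a prefix-based split has is the tree stored under that prefix. The loop below rebuilds the reproduction repository and asks each commit for its `packages/foo` tree; the `demo` identity and the counter variables are assumptions added for the example. Commits made before the move simply have no such tree, which matches the debug log above, where those commits show no "tree is" line.

```shell
#!/bin/sh
set -e

# Rebuild the repository from the reproduction script in a scratch directory.
cd "$(mktemp -d)"
git init -q .
git config user.name demo               # placeholder identity (assumption)
git config user.email demo@example.com
mkdir foo
for i in 1 2 3; do
    echo "$i" > foo/counter.txt
    git add foo
    git commit -q -m "Updated counter to $i"
done
mkdir -p packages/foo
git mv foo/counter.txt packages/foo/counter.txt
git commit -q -m "Moved foo to packages folder"
echo 4 > packages/foo/counter.txt
git add packages
git commit -q -m "Updated counter to 4"

# Walk history oldest-first and look up the tree under the prefix.
present=0
absent=0
for c in $(git rev-list --reverse HEAD); do
    subject=$(git log -1 --format=%s "$c")
    if tree=$(git rev-parse -q --verify "$c:packages/foo" 2>/dev/null); then
        present=$((present + 1))
        echo "$subject: prefix tree $tree"
    else
        absent=$((absent + 1))
        echo "$subject: no packages/foo/ in this commit, nothing to copy"
    fi
done
echo "$present commits contain the prefix, $absent do not"
```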

stof avatar Oct 02 '19 16:10 stof

When a move into a splitting prefix occurs, one could at that commit:

  • split the original file's history from the source into a subtree within a new orphan branch under, say, /.original-root/ directory (customisable via command-line option)
  • merge such split-history into the target branch (e.g. with a file move in the merge commit)

This would enable files to be moved into the split subtree (with their full history) after the split's initial creation without rewriting the split's history. It does however raise (at least) one problem: where source commits that affected such new file also affected files already within the target split, they will not be unified into a single commit but will instead exist as distinct commits with identical metadata (author, timestamp, commit message etc); this may be acceptable in some cases, and rewriting the history to avoid it may be acceptable in other cases.
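A much simplified sketch of the "merge the split-off history" idea, using only core Git. It grafts the whole pre-move prefix rather than individual files under a `/.original-root/` directory, and it uses the `ours` merge strategy instead of a file move in the merge commit, so it is a variation on the proposal, not the proposal itself. `split_prefix` is a hypothetical stand-in for the splitter that handles linear history only and does not preserve author or date metadata.

```shell
#!/bin/sh
set -e

# Rebuild the repository from the reproduction script in a scratch directory.
cd "$(mktemp -d)"
git init -q .
git config user.name demo               # placeholder identity (assumption)
git config user.email demo@example.com
mkdir foo
for i in 1 2 3; do
    echo "$i" > foo/counter.txt
    git add foo
    git commit -q -m "Updated counter to $i"
done
mkdir -p packages/foo
git mv foo/counter.txt packages/foo/counter.txt
git commit -q -m "Moved foo to packages folder"
echo 4 > packages/foo/counter.txt
git add packages
git commit -q -m "Updated counter to 4"

# Hypothetical mini-splitter: copy each commit's prefix tree onto a new chain.
split_prefix() {
    prefix=$1
    target=$2
    parent=
    for c in $(git rev-list --reverse HEAD); do
        tree=$(git rev-parse -q --verify "$c:$prefix" 2>/dev/null) || continue
        msg=$(git log -1 --format=%B "$c")
        parent=$(git commit-tree "$tree" ${parent:+-p $parent} -m "$msg")
    done
    git update-ref "refs/heads/$target" "$parent"
}

split_prefix foo before-rename
split_prefix packages/foo after-rename
old_tip=$(git rev-parse after-rename)   # existing split history, must stay intact

# Graft the pre-move history as a second parent; "ours" keeps the current tree.
git checkout -q after-rename
git merge -q -s ours --allow-unrelated-histories \
    -m "Graft pre-move history of foo/" before-rename

git log --oneline --graph
```

The existing commits on `after-rename` keep their hashes; only a new merge commit is added on top, which is the property the proposal is after.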

eggyal avatar Jan 10 '23 16:01 eggyal

Given that file moves don't exist in Git (they are detected based on the similarity of files when generating a diff), I think this would increase the cost of splitting a lot (the current splitter does not actually care about the diff AFAICT).
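This can be seen on the reproduction repository with plain Git (a sketch; the `demo` identity and variable names are assumptions). The directory tree object is identical before and after the move, which agrees with the debug log above, where both splits report the same tree `9e7921b...` for the last pre-move commit and for the move commit. The "rename" only appears when a diff is generated with rename detection (`-M`) switched on.

```shell
#!/bin/sh
set -e

# Rebuild the repository from the reproduction script in a scratch directory.
cd "$(mktemp -d)"
git init -q .
git config user.name demo               # placeholder identity (assumption)
git config user.email demo@example.com
mkdir foo
for i in 1 2 3; do
    echo "$i" > foo/counter.txt
    git add foo
    git commit -q -m "Updated counter to $i"
done
mkdir -p packages/foo
git mv foo/counter.txt packages/foo/counter.txt
git commit -q -m "Moved foo to packages folder"
echo 4 > packages/foo/counter.txt
git add packages
git commit -q -m "Updated counter to 4"

# HEAD~2 is "Updated counter to 3", HEAD~1 is the move commit.
before=$(git rev-parse "HEAD~2:foo")
after=$(git rev-parse "HEAD~1:packages/foo")
echo "tree before move: $before"
echo "tree after move:  $after"

# Without rename detection the move commit is a delete plus an add.
raw=$(git diff-tree -r --no-commit-id --no-renames --name-status HEAD~1)
# With -M, Git compares the deleted and added blobs and reports a rename.
detected=$(git diff-tree -r --no-commit-id -M --name-status HEAD~1)

echo "raw diff:"
echo "$raw"
echo "with -M:"
echo "$detected"
```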

stof avatar Jan 10 '23 16:01 stof

I can't speak to the cost, perhaps it's too great (albeit a feature that could be enabled only when desired, e.g. via command-line switch)... but surely only commits that involve the creation of a new file within the prefix need be inspected for file deletions of similar size outside of the prefix before any similarity detection need be performed?
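A rough sketch of that filtering idea with plain Git, under the assumption that a cheap path-limited query is acceptable as the first pass. It lists only the commits that add a file under the prefix, then runs rename detection on just those commits. The size pre-check on deleted files mentioned above is left out, paths containing tabs or other characters that Git quotes are not handled, and the variable names are made up for the example.

```shell
#!/bin/sh
set -e

# Rebuild the repository from the reproduction script in a scratch directory.
cd "$(mktemp -d)"
git init -q .
git config user.name demo               # placeholder identity (assumption)
git config user.email demo@example.com
mkdir foo
for i in 1 2 3; do
    echo "$i" > foo/counter.txt
    git add foo
    git commit -q -m "Updated counter to $i"
done
mkdir -p packages/foo
git mv foo/counter.txt packages/foo/counter.txt
git commit -q -m "Moved foo to packages folder"
echo 4 > packages/foo/counter.txt
git add packages
git commit -q -m "Updated counter to 4"

prefix=packages/foo
tab=$(printf '\t')

# Pass 1 (cheap): commits that add at least one file under the prefix.
candidates=$(git log --format=%H --no-renames --diff-filter=A -- "$prefix/")

# Pass 2 (costly): rename detection, but only on the candidates.
moves=$(
    for c in $candidates; do
        git diff-tree -r --no-commit-id -M --name-status --diff-filter=R "$c" |
        while IFS="$tab" read -r status old new; do
            case $old in "$prefix"/*) continue ;; esac   # source already inside
            case $new in "$prefix"/*) echo "$c moved $old -> $new" ;; esac
        done
    done
)
echo "$moves"
```

On this repository only the move commit is a candidate, so similarity detection runs once instead of on all five commits.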

eggyal avatar Jan 10 '23 17:01 eggyal