git_split
Doesn't maintain directory structure
Great script, but it doesn't maintain the directory structure in the new repo. All files in the target subdirectory end up in the root of the new repo.
Interesting. I have been unable to reproduce this issue. I added tests and a CI build, and it looks like complex directory structures are preserved. Can you provide more detailed reproduction steps?
Create a repo with a sample directory, src/thing, then use "src/thing" as the directory param. Boom.
I ran the following steps and could not reproduce the issue. The build tests also cover a more complicated setup, and they are not failing either.
> mkdir -p /tmp/myrepo/src/blah
> touch /tmp/myrepo/src/blah/file1.txt
> touch /tmp/myrepo/src/blah/file2.txt
> touch /tmp/myrepo/src/file2.txt
> touch /tmp/myrepo/src/file3.txt
> touch /tmp/myrepo/file4.txt
> cd /tmp/myrepo
> git init
> git add .
> git commit -m "initial commit"
> git_split.sh /tmp/myrepo master src /tmp/myrepo2.git
> git clone /tmp/myrepo2.git
The result of git_split (myrepo2.git) is a bare git repository. After cloning it, the contents of myrepo2 show all the files correctly.
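Since the output is a bare repository, its contents can also be listed without cloning it first. The following is a minimal sketch of that check, not part of git_split itself: the sandbox directory, the demo commit identity, and the use of `git clone --bare` as a stand-in for the script's output are all assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: list the files tracked in a bare repository without cloning it.
set -e

work=$(mktemp -d)                          # throwaway sandbox (hypothetical location)
mkdir -p "$work/myrepo/src/blah"
cd "$work/myrepo"
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
touch src/blah/file1.txt src/file3.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# Stand-in for the bare repository that git_split would produce.
git clone -q --bare "$work/myrepo" "$work/myrepo2.git"

# Read the tree of the master branch straight from the bare repository.
bare_listing=$(git --git-dir="$work/myrepo2.git" ls-tree -r --name-only master)
echo "$bare_listing"
```

`git ls-tree -r --name-only` prints one path per line relative to the repository root, so it shows directly whether a file ended up at the root or under a subdirectory.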
Is there a chance that the script you're using has unstaged/unmerged changes compared to what's on master here? I can consistently reproduce the issue here: myrepo/src/file3.txt will consistently end up at myrepo2/file3.txt
I noticed your tests used local repositories and not remote. Our use case is as follows:
git_split.sh ssh://base/repo some-branch-name src/thing ssh://base/new-repo
Using the snippet that you pasted, I modified it to fit what we're trying to do. You'll notice that the files at the root of myrepo2 are the files that were previously in myrepo/src/blah - that's where it's failing. The important bit there is passing src/blah. The expected result is that /myrepo2/src/blah/... exists. Here's the script and the output:
script
mkdir -p /tmp/myrepo/src/blah
touch /tmp/myrepo/src/blah/file1.txt
touch /tmp/myrepo/src/blah/file2.txt
touch /tmp/myrepo/src/file2.txt
touch /tmp/myrepo/src/file3.txt
touch /tmp/myrepo/file4.txt
cd /tmp/myrepo
git init
git add .
git commit -m "initial commit"
bash /github/git_split/git_split.sh /tmp/myrepo master src/blah /tmp/myrepo2.git
git clone /tmp/myrepo2.git
ls /tmp/myrepo/myrepo2
output
→ mkdir -p /tmp/myrepo/src/blah
touch /tmp/myrepo/src/blah/file1.txt
touch /tmp/myrepo/src/blah/file2.txt
touch /tmp/myrepo/src/file2.txt
touch /tmp/myrepo/src/file3.txt
touch /tmp/myrepo/file4.txt
cd /tmp/myrepo
git init
git add .
git commit -m "initial commit"
bash /github/git_split/git_split.sh /tmp/myrepo master src/blah /tmp/myrepo2.git
git clone /tmp/myrepo2.git
ls /tmp/myrepo/myrepo2
Initialized empty Git repository in /private/tmp/myrepo/.git/
[master (root-commit) 8cd3cdd] initial commit
5 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 file4.txt
create mode 100644 src/blah/file1.txt
create mode 100644 src/blah/file2.txt
create mode 100644 src/file2.txt
create mode 100644 src/file3.txt
Cloning into '/tmp/git_split.WCER8e/repo_base'...
done.
Creating Repo from /tmp/myrepo src/blah for /tmp/myrepo2.git
Initialized empty shared Git repository in /private/tmp/myrepo2.git/
Already on 'master'
Your branch is up-to-date with 'origin/master'.
Rewrite 8cd3cdd236dfe119b1f664e3afa08cd688300aae (1/1)
Ref 'refs/heads/master' was rewritten
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 220 bytes | 0 bytes/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To /tmp/myrepo2.git
* [new branch] master -> master
Cloning into 'myrepo2'...
done.
file1.txt file2.txt
The fundamental use case for git_split is that it turns an arbitrary directory of one repo into the root of a new repo. So the directory you pass (src/blah) is expected to become the root of the new repo. You are requesting new functionality.
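For anyone who wants to see that behavior in isolation, here is a minimal sketch using the same `git filter-branch --subdirectory-filter` call that appears in the script posted in this thread. The sandbox directory and the demo commit identity are assumptions for illustration, and `FILTER_BRANCH_SQUELCH_WARNING` is only there to silence the deprecation notice that newer git versions print; older versions ignore it.

```shell
#!/bin/bash
# Sketch: --subdirectory-filter promotes the chosen directory to the repo root.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1     # skip filter-branch's warning on newer git

work=$(mktemp -d)                          # throwaway sandbox (hypothetical location)
mkdir -p "$work/myrepo/src/blah"
cd "$work/myrepo"
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
touch src/blah/file1.txt src/blah/file2.txt src/file3.txt file4.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# Rewrite history so that only src/blah survives, with its contents at the root.
git filter-branch --prune-empty --subdirectory-filter src/blah master >/dev/null 2>&1

# List what the rewritten branch tracks.
split_files=$(git ls-tree -r --name-only master)
echo "$split_files"
```

The listing contains file1.txt and file2.txt with no src/blah prefix, which matches the output reported above.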
I looked into what it would take to support your use case. If we add --tree-filter to the git filter-branch command of the script with the proper parameters, the path can be preserved in the destination repo. I think this could be achieved pretty easily by adding a command-line switch to the script, "-p, --preserve", to preserve the path.
Did a bit of experimenting with this. Turns out git filter-branch doesn't really support this use case. Closing.
I must disagree. Here's the modified bash script we ended up using:
#!/bin/bash
# Author: Robbie Van Gorkom
# Created: 2012-05-17
#
# This script will convert a directory in a git repository to a repository
# of its very own.
#
# set the variables.
TARGET_GROUP=$1
SRC_REPO="ssh://gerrit/ui-commons"
SRC_DIR="src/$TARGET_GROUP"
SRC_BRANCH=`git rev-parse --abbrev-ref HEAD`
OUTPUT_REPO="ssh://gerrit/ui-$TARGET_GROUP"
TMP_DIR=$(mktemp -d -t git_split)
REPO_BASE=$TMP_DIR/repo_base;
REPO_TMP=$TMP_DIR/repo_tmp;
echo $TMP_DIR
echo ""
if [ ! -e "/web/ui-$TARGET_GROUP" ]
then
    gg-gerrit-create-repo -n "ui-$TARGET_GROUP" -d "Contains UI modules for the $TARGET_GROUP module group."
    cd /web
    git clone $OUTPUT_REPO
fi
# function to cleanup with a message.
function cleanup() {
    rm -rf $TMP_DIR
}
# cleans up when ctrl-c is pressed
function control_c {
    cleanup
}
cleanup
# handle kill signals
trap control_c SIGINT
# check if help was requested
if [ $(echo " $*" | grep -ciE " [-]+(h|help)") -gt 0 ]
then
    cleanup
    exit
fi
if [[ -z "$SRC_REPO" ]] || [[ -z "$SRC_DIR" ]] || [[ -z "$OUTPUT_REPO" ]]; then
    exit
fi
# clone the repo
git clone $SRC_REPO $REPO_BASE;
# if the clone was not successful, then exit.
if [ $? -ne 0 ]
then
    cleanup
    echo "Clone failed to run."
    exit 1
fi
# if the source dir doesn't exist then exit
if [ ! -e "$REPO_BASE/$SRC_DIR" -o ! -d "$REPO_BASE/$SRC_DIR" ]
then
    cleanup
    echo "$REPO_BASE/$SRC_DIR doesn't exist or is not a directory."
    exit 1
fi
cd $REPO_BASE
git fetch --all
git checkout $SRC_BRANCH
git branch
# turn this repo into just the changes for the oldPath
git filter-branch --prune-empty --subdirectory-filter $SRC_DIR $SRC_BRANCH
# push those changes to the new repo
git push $OUTPUT_REPO $SRC_BRANCH
# move to the target repo and move things around
cd "/web/ui-$TARGET_GROUP"
git reset --hard origin && git pull
git checkout $SRC_BRANCH
mkdir -p $SRC_DIR
git mv -k * $SRC_DIR
git add .
git commit -m "DATA-1276: moved $TARGET_GROUP module group to repo and into $SRC_DIR."
git push origin $SRC_BRANCH
git checkout master
git merge $SRC_BRANCH --no-edit
git push origin master
cp /web/ui-tracking/.gitignore "/web/ui-$TARGET_GROUP"
cp -r /web/ui-tracking/config "/web/ui-$TARGET_GROUP"
echo "# ui-$TARGET_GROUP
The ui-$TARGET_GROUP repo contains ui modules for the \"$TARGET_GROUP\" group.
This group formerly resided in ui-commons.
## Tooling
This repository makes use of [swig](https://github.com/gilt/gilt-swig) for building,
publishing, linting, etc. Swig must be initialized in this directory, as it is not
part of the repository. To initialize and get started using swig, open a terminal
and run the following commands:
\`\`\`
npm install @gilt-tech/swig -g
swig init
\`\`\`
To see a list of swig commands and usage info:
\`\`\`
swig
\`\`\`
## Owner
The code ownership is shared; the point of a separate repo is so that people
who are not front-end engineers can review code easily and see changes when
they are made.
" >> test.md
git add .
git commit -m "DATA-1276: adding metadata to ui-$TARGET_GROUP"
git push origin master
# cleanup temp files before exit
cleanup
That makes sense. I hadn't considered moving files after the filter was run. Are you submitting this code for inclusion or working on a pull request?
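Stripped of the environment-specific parts, the approach in the script above comes down to two steps: split with `--subdirectory-filter`, then move the files back under the original path in a follow-up commit. A minimal local sketch of those two steps follows; the sandbox directory, the demo commit identity, and the commit messages are assumptions for illustration, and no remote is involved.

```shell
#!/bin/bash
# Sketch: split a subdirectory, then restore its original path in a new commit.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1     # skip filter-branch's warning on newer git

src_dir="src/blah"                         # directory to split out and then recreate
work=$(mktemp -d)                          # throwaway sandbox (hypothetical location)
mkdir -p "$work/myrepo/$src_dir"
cd "$work/myrepo"
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
git config user.name demo                  # demo identity, local to the sandbox repo
git config user.email demo@example.com
touch "$src_dir/file1.txt" "$src_dir/file2.txt" src/file3.txt file4.txt
git add .
git commit -q -m "initial commit"

# Step 1: rewrite history so the contents of $src_dir sit at the root.
git filter-branch --prune-empty --subdirectory-filter "$src_dir" master >/dev/null 2>&1

# Step 2: recreate the path and move everything into it.
# -k makes git mv skip entries it cannot move, such as the new directory itself.
mkdir -p "$src_dir"
git mv -k -- * "$src_dir"
git commit -q -m "move files back under $src_dir"

preserved_files=$(git ls-tree -r --name-only master)
commit_count=$(git rev-list --count master)
echo "$preserved_files"
```

Note that the path is only restored from the move commit onward. The earlier, rewritten commits still have the files at the root, so history-following commands such as `git log --follow` are needed to trace a file across the move.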
I wasn't planning on developing a PR, just dropping that code here so anyone else with the same needs could use or analyze it. Our needs for this have passed (we had to split a single repo into about 20) and the likelihood of that being a need anytime soon is very small. If you'd like to derive code from it for git_split, please do!
Great! Thanks. I'll leave this issue open so that anybody who stumbles on it can +1 if this is a feature they'd like to have added.
Would like to see this feature. OTOH, it might get complicated across multiple runs when there are multiple branches.
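To make the multi-branch concern concrete, here is a rough sketch of what repeating the move step on every branch could look like. It is an assumption about one possible approach, not something git_split does: the repository below is a stand-in for an already split repo, and the branch names, demo identity, and commit messages are made up for illustration.

```shell
#!/bin/bash
# Sketch: apply the "move back under the original path" commit to each branch.
set -e

src_dir="src/blah"                         # path to recreate on every branch
work=$(mktemp -d)                          # throwaway sandbox (hypothetical location)
mkdir "$work/split"
cd "$work/split"
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
git config user.name demo                  # demo identity, local to the sandbox repo
git config user.email demo@example.com

# Stand-in for an already split repo: files sit at the root on two branches.
touch file1.txt
git add .
git commit -q -m "split history on master"
git checkout -q -b feature
touch file2.txt
git add .
git commit -q -m "split history on feature"

# Each branch needs its own move commit.
for branch in $(git for-each-ref --format='%(refname:short)' refs/heads); do
    git checkout -q "$branch"
    mkdir -p "$src_dir"
    git mv -k -- * "$src_dir"              # -k skips entries that cannot be moved
    git commit -q -m "move files into $src_dir on $branch"
done

master_files=$(git ls-tree -r --name-only master)
feature_files=$(git ls-tree -r --name-only feature)
echo "$master_files"
echo "$feature_files"
```

Every branch ends up with a separate commit performing the same move, so the branches no longer share that part of their history. Later merges between them then depend on git's rename detection, which is likely part of what makes repeated runs awkward.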