git-archive-all.sh
'--tree-ish' is broken
I don't understand what the following line (currently at https://github.com/fabacab/git-archive-all.sh/blob/master/git-archive-all.sh#L242 ) attempted to achieve:
TREEISH=$(git submodule | grep "^ .*${path%/} " | cut -d ' ' -f 2)
This is completely
First of all, git submodule
only lists the "direct" submodules, not the "transitive" ones. This may be related to #2. Consider using something like git submodule foreach --recursive pwd.
The grep
part assumes that the current state of the submodule is clean (the first char is a space for "clean", + for "changes made", etc.). That's not guaranteed. Indeed, git-archive-all.sh --tree-ish
only really makes sense when the given tree-ish is different from HEAD.
The cut
part tries to finish the regex matching that should have been done in grep; see grep -o.
It never takes the original --tree-ish argument into account at all.
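The status-prefix problem described above can be avoided by parsing each line without assuming what the first character is. Below is a minimal sketch, assuming each line of git submodule status consists of one status character, the commit hash, a space, the path, and an optional description in parentheses. The function name submodule_sha_for_path is hypothetical and is not part of git-archive-all.sh.

```shell
#!/bin/sh
# Hypothetical helper: print the commit hash listed for a submodule path,
# reading `git submodule status` output from stdin.
# Assumed line format: <status char><sha> <path> [(<describe>)]
# where <status char> is ' ', '+', '-' or 'U'.
submodule_sha_for_path() {
    wanted=${1%/}               # git submodule does not list trailing slashes
    while IFS= read -r line; do
        rest=${line#?}          # drop the one-character status prefix, whatever it is
        sha=${rest%% *}         # the hash is everything up to the first space
        path=${rest#* }         # remainder: "<path>" or "<path> (<describe>)"
        path=${path%" ("*}      # strip the optional " (<describe>)" suffix
        if [ "$path" = "$wanted" ]; then
            printf '%s\n' "$sha"
            return 0
        fi
    done
    return 1                    # no matching submodule path found
}
```

It could be used as git submodule status --cached | submodule_sha_for_path "$path". The path is compared as an exact string rather than as a regular expression, so one submodule path cannot accidentally match part of another.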
Consider using something like git submodule foreach --recursive pwd.
Git's foreach
command did not exist when this code was written. IIRC, it wasn't available for about two years after this script was published.
I think this line passed the current submodule commit head to the submodule's git archive
command. It's been years; this could probably use some updating.
In short, is there no way to know the exact submodule commit at the parent repo's target commit?
I tried to illustrate it.
Consider the case below:
git archive-all -t a01 archive.tar
[repo A] [repo B] (submodule)
<uncommitted change>
commit a02 (HEAD) ----------> commit b02 (HEAD)
commit a01 (TARGET) ----------> commit b01
git submodule
returns something like:
+__HASH_FOR_THE_UNCOMMITTED_CHANGE__ b (heads/master)
git submodule --cached
fixes the problem of the result pointing to the <uncommitted change>.
However, it only returns the submodule commit associated with the parent repo's current commit.
Repo A's HEAD is now a02, and b02 of repo B is bound to it.
So, now git submodule --cached
returns
+b02xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx b (b02xxxx)
To archive with -t a01, you need to know the commit b01 of repo B that is bound to it;
but there is no way to find that out unless you check out the target commit a01, right?
Of course, you shouldn't change the state of the working directory, so you can't know the exact status of a submodule in the past.
The git submodule --cached change is below.
But, as said above, it does not solve everything.
--- a/git-archive-all.sh
+++ b/git-archive-all.sh
@@ -249,9 +249,9 @@ fi
 if [ $VERBOSE -eq 1 ]; then
 echo -n "archiving submodules..."
 fi
-git submodule >>"$TMPLIST"
+git submodule --cached >>"$TMPLIST"
 while read path; do
- TREEISH=$(grep "^ .*${path%/} " "$TMPLIST" | cut -d ' ' -f 2) # git submodule does not list trailing slashes in $path
+ TREEISH=$(grep "^.* ${path%/} " "$TMPLIST" | sed -e 's/^.//' | cut -d ' ' -f 1) # git submodule does not list trailing slashes in $path
 cd "$path"
 rm -f "$TMPDIR"/"$(echo "$path" | sed -e 's/\//./g')"$FORMAT
 git archive --format=$FORMAT --prefix="${PREFIX}$path" $ARCHIVE_OPTS ${TREEISH:-HEAD} > "$TMPDIR"/"$(echo "$path" | sed -e 's/\//./g')"$FORMAT
Also, git submodule status
lists submodules that are not committed yet, which is another problem.
That is, a submodule that has been added with git submodule add
but whose addition has not been committed yet.
[repo C] [repo A] [repo B] (submodule)
commit c03 (HEAD) <------------ <uncommitted change> <uncommitted change>
commit a02 (HEAD) ---------> commit b02 (HEAD)
commit a01 (TARGET) ---------> commit b01
In this case, git submodule status
returns the submodule that is being added, C, regardless of the --cached option.
% git submodule
+__HASH_FOR_THE_UNCOMMITTED_CHANGE__ b (heads/master)
c03xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx c (c03xxxx)
% git submodule --cached
+b02xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx b (b02xxxx)
c03xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx c (c03xxxx)
In the end, some command is needed to find out the submodules' status at a specific commit.
Without such a command, --tree-ish
cannot be fixed.
For now, sad to say, you'd better not rely on --tree-ish,
but check out the target commit yourself and archive HEAD.
How about calling git ls-tree
for each submodule path obtained by git submodule status?
(see #42)
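A sketch of this idea, assuming the submodule is recorded in the parent repo's tree as a gitlink entry (mode 160000, type commit), which is how git ls-tree reports submodules. The function names gitlink_sha and submodule_commit_at are hypothetical, and this only covers submodules recorded directly in the given tree.

```shell
#!/bin/sh
# Hypothetical helper: extract the hash from `git ls-tree` output on stdin,
# but only for a gitlink entry (mode 160000, type "commit"), i.e. a submodule.
# ls-tree lines look like: <mode> SP <type> SP <sha> TAB <path>
gitlink_sha() {
    while read -r mode type sha _path; do
        if [ "$mode" = 160000 ] && [ "$type" = commit ]; then
            printf '%s\n' "$sha"
            return 0
        fi
    done
    return 1    # the entry is missing or is not a submodule
}

# Hypothetical wrapper: print the submodule commit that <tree-ish> ($1)
# records for <path> ($2), without touching the working tree or the index.
submodule_commit_at() {
    git ls-tree "$1" "${2%/}" | gitlink_sha
}
```

In the scenario illustrated earlier, submodule_commit_at a01 b would be expected to print the hash of b01, regardless of what is currently checked out.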
git ls-tree
did not work well for sub-submodules (recursively contained submodules).
Instead, there is now git submodule status --recursive --cached.
Perhaps this may work?
Unfortunately, that was not a complete solution, either.
It only reports the commits of the submodules bound to the top repo's HEAD.
Fixed the PR to use:
- git ls-tree, if available (for the top repo's direct submodules, and also for indirect ones as far as it can)
- otherwise, git submodule status --recursive --cached
- if neither succeeds, the submodule's HEAD
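The fallback order above can be sketched with nested default expansions, in the same style as the ${TREEISH:-HEAD} already used in the script. This is only an illustration under assumptions, not the code in the PR: the function name pick_submodule_treeish and its two arguments are hypothetical, and each argument is expected to be empty when the corresponding lookup found nothing.

```shell
#!/bin/sh
# Hypothetical helper: choose the tree-ish to archive for one submodule.
#   $1: hash found via git ls-tree (empty if that lookup failed)
#   $2: hash found via git submodule status (empty if that lookup failed)
# Falls back to HEAD when both are empty.
pick_submodule_treeish() {
    from_ls_tree=$1
    from_status=$2
    printf '%s\n' "${from_ls_tree:-${from_status:-HEAD}}"
}
```

For example, pick_submodule_treeish "" "b02" prints b02, and pick_submodule_treeish "" "" prints HEAD.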