fast-export
Closed hg branches should be closed in resulting git repo
The export process results in all named branches in mercurial being created as open branches in git, whether they are currently open or closed in mercurial.
Ideally the closedness of branches would be preserved, such that hg branches and git branch would output the same list.
Git does not have the concept of a closed branch, so it's not something that can be preserved while converting. Either remove the branch after conversion or remap the branch name to something like "attic/foo" using a branch mapping file.
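A minimal sketch of how the "attic/foo" remapping could be prepared. It assumes the branch mapping file takes one `"old"="new"` pair per line; check that against the fast-export README before relying on it. The function name `branch_map_lines` and the sample branch names are made up for illustration.

```python
def branch_map_lines(closed_branches, prefix="attic/"):
    """Return mapping-file lines that move each closed branch under `prefix`.

    Assumption: the mapping file accepts lines of the form "old"="new".
    """
    lines = []
    for name in sorted(set(closed_branches)):
        lines.append(f'"{name}"="{prefix}{name}"')
    return lines


if __name__ == "__main__":
    # Hypothetical input: branch names collected from hg beforehand.
    print("\n".join(branch_map_lines(["foo", "feature/bar"])))
```

The output can be written to a file and passed as the branch mapping file during conversion.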
@frej Sure, but couldn't you at least add an option for this? We've used hg branches for feature branches, 99% of which get closed and merged back into the default branch. There could be an option to delete the branches in git if they are closed in hg.
I'm looking into converting hundreds of hg repos to git, each of which have hundreds or thousands of closed branches. The solution can't be any sort of manual process, that just isn't feasible.
I won't oppose a patch adding support for it, but for your needs (which sounds like a one-time conversion), you could let hg-fast-export convert all the branches. Then write a script which extracts the closed branches from hg and just removes them at the git side.
That solution is not practical for incremental conversions, but if that's what you need, patches are welcome.
I have the same issue. Not sure how @murrayju worked around it. Anyway, I think a good option would be to archive the closed hg branch as a tag in git and then delete the branch from git, as suggested here
what you can do is: run
hg log -r "closed()" -T "{branch}\n" > closed.txt
and then in the git repo
git branch -D `cat closed.txt`
Be careful with this... if a branch was closed and some of its commits were never merged, those commits become unreachable (dangling) in git, which might not be what you want. You might want to delete only branches that are fully merged and reopen unmerged branches during your export. It depends on your use case.
Be careful with branches whose names have been sanitized or mapped during conversion.
Also be careful if you have a branch that was closed and re-opened: it will show up in closed.txt even if the branch is open in mercurial.
Just to reiterate @mryan43's comment: if you want to delete only branches that are fully merged, use git branch -d `cat closed.txt` instead.
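A rough sketch of the same caution in script form: split the names from closed.txt into branches that are safe to delete (fully merged) and branches to keep (unmerged commits would become unreachable). The helper names `plan_deletions` and `git_branches` are made up for illustration, and `--merged` is evaluated against whatever is checked out in the git repo.

```python
import subprocess


def plan_deletions(closed, existing, merged):
    """Split closed hg branch names into (delete, keep) lists.

    delete: exists in git and is fully merged.
    keep:   exists in git but still has unmerged commits.
    Names that do not exist in git (e.g. sanitized or mapped during
    conversion) are ignored rather than guessed at.
    """
    present = set(closed) & set(existing)
    delete = sorted(present & set(merged))
    keep = sorted(present - set(merged))
    return delete, keep


def git_branches(repo, *extra):
    """List local branch names in `repo`; pass "--merged" to filter."""
    out = subprocess.check_output(
        ["git", "-C", repo, "branch", "--format", "%(refname:short)", *extra]
    )
    return out.decode().splitlines()


# Example wiring (not run here):
#   closed = open("closed.txt").read().splitlines()
#   delete, keep = plan_deletions(
#       closed, git_branches(repo), git_branches(repo, "--merged"))
#   for name in delete:
#       subprocess.check_call(["git", "-C", repo, "branch", "-d", name])
```

Re-opened branches still end up in closed.txt with the plain closed() query, so the list should be reviewed before anything is deleted.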
Small addition to @Kogs' suggestion:
hg log -r "closed()" -T "{branch}\n" | sed -r 's/ |\?/_/g' | sort | uniq > closed.txt
- sed -r 's/ |\?/_/g' => Replaces spaces and question marks with '_'. This sanitizes the Mercurial branch names to be compatible with Git. You can manually correct any issues that might still be left.
- sort | uniq => Deduplicates the branches. If you don't do this, Git may give an error.
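For anyone scripting this instead of using a shell pipeline, here is a small sketch of the same two steps in Python. The function name `sanitize_and_dedupe` is made up, and it only covers spaces and question marks, exactly like the sed expression above.

```python
import re


def sanitize_and_dedupe(branch_names):
    # Same idea as: sed -r 's/ |\?/_/g' | sort | uniq
    # Spaces and question marks become '_'; duplicates and empty lines are dropped.
    cleaned = {re.sub(r"[ ?]", "_", name) for name in branch_names if name}
    # Note: sorted() orders by code point, which can differ from a
    # locale-aware `sort`; the set of names is the same either way.
    return sorted(cleaned)


if __name__ == "__main__":
    print(sanitize_and_dedupe(["my branch", "what?", "my branch"]))
```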
This might be a better command to find closed branches that have been merged:
hg log -r 'head() and closed() and parents(merge())' -T '{branch}\n'
We were solving the same issue and eventually ended up with this query to find closed hg branches:
# closed branch is a branch that contains a closed commit and doesn't have any open head
hg log -r "branch(closed()) - branch(head() and not closed())" -T "{branch}\n" | sort -u
Here are some corner-cases that led us to this solution:
$ hg log --graph -T "{rev} [{branch}] {desc}\n"
@ 2 [default] C2
|
_ 1 [default] C1 (close)
|
o 0 [default] C0
$ hg log -r "closed()" -T "{branch}\n" # fails, 'default' branch is open (in rev 2)
default
$ hg log -r "branch(closed()) - branch(head() and not closed())" -T "{branch}\n" | sort -u
# no output, correct
$ hg log --graph -T "{rev} [{branch}] {desc}\n"
_ 1 [default] C1 (close)
|
@ 0 [default] C0
$ hg log -r 'head() and closed() and parents(merge())' -T '{branch}\n'
# no output (wrong, branch 'default' is closed in rev 1)
$ hg log -r "branch(closed()) - branch(head() and not closed())" -T "{branch}\n" | sort -u
default
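To make the rule easier to check, here is a toy model of the query in plain Python. It does not call Mercurial; the `Changeset` tuple and the `closed_branches` function are made up for illustration and only mimic "has a closing commit and no open head" on the two corner-cases above.

```python
from typing import NamedTuple


class Changeset(NamedTuple):
    rev: int
    branch: str
    is_head: bool   # no children on the same branch
    closes: bool    # this commit closes the branch


def closed_branches(changesets):
    # Branches containing at least one closing commit ...
    with_close = {c.branch for c in changesets if c.closes}
    # ... minus branches that still have a head that is not a closing commit.
    with_open_head = {c.branch for c in changesets if c.is_head and not c.closes}
    return sorted(with_close - with_open_head)


# Corner-case 1: closed in rev 1, then continued in rev 2.
case1 = [
    Changeset(0, "default", False, False),
    Changeset(1, "default", False, True),
    Changeset(2, "default", True, False),
]
# Corner-case 2: closed in rev 1, never merged anywhere.
case2 = [
    Changeset(0, "default", False, False),
    Changeset(1, "default", True, True),
]

print(closed_branches(case1))  # []
print(closed_branches(case2))  # ['default']
```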
However, it is important to say that we are detecting closed branches to move them to the hg_closed/ namespace. Had we deleted these, we would lose our history, which is something we prefer to avoid.
If your objective is to eventually delete closed branches, the query hg log -r 'head() and closed() and parents(merge())' -T '{branch}\n' works better for you (see the second corner-case).
Finally, here is the snippet of code that moves closed branches to the proper namespace:
echo "Moving closed branches to hg_closed/"
hg --repository "$HG_REPO" log -r "branch(closed()) - branch(head() and not closed())" -T "{branch}\n" | sort -u \
| sed -e 's/ \|\.$/_/g' | tr '\n' '\0' | xargs -0 -n1 -I{} git -C "$GIT_REPO" branch -m {} hg_closed/{}
- tr ... is needed to correctly rename branches containing " (quotation mark).
- sed ... is an attempt to replicate the mangling that fast-export does to hg branches so that they are compliant with git. The transformation sed does here is in no way complete nor correct; refer to man git-check-ref-format for more details.
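If the quoting gets awkward in the shell, one alternative is to build the rename commands as argument lists, which sidesteps shell quoting entirely. This is only a sketch: `rename_commands` is a made-up helper, and its substitution copies the sed expression from the snippet (spaces and a trailing dot become '_'), so it is just as incomplete as that expression.

```python
import re


def rename_commands(git_repo, hg_branches, prefix="hg_closed/"):
    """Build one `git branch -m` argv list per closed hg branch."""
    commands = []
    for name in sorted(set(hg_branches)):
        # Same transformation as: sed -e 's/ \|\.$/_/g'
        git_name = re.sub(r" |\.$", "_", name)
        commands.append(
            ["git", "-C", git_repo, "branch", "-m", git_name, prefix + git_name]
        )
    return commands


# Example wiring (not run here):
#   import subprocess
#   for argv in rename_commands(git_repo, closed_hg_branches):
#       subprocess.check_call(argv)
```

Because each name is passed as a single argv element, quotation marks in branch names need no special handling.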
FWIW here's what I did.
This deletes all fully-merged branches without any regard for their closed status in mercurial.
Note the hard-coded "master": if your long-lived branch has a different name, or you've got multiple long-lived branches, that would need to be modified.
delete_branches.py
from subprocess import check_output, check_call
import sys

USAGE = """Error: invalid arguments.
usage: python delete_branches.py local_repository [remote]
remote defaults to 'origin' if not given."""

try:
    REPO = sys.argv[1]
except IndexError:
    print(USAGE)
    sys.exit(1)

try:
    REMOTE = sys.argv[2]
except IndexError:
    REMOTE = 'origin'

if len(sys.argv) > 3:
    print(USAGE)
    sys.exit(1)

# Ensure we know about all remote branches:
check_call(['git', 'fetch', '--prune', REMOTE], cwd=REPO)

# Get a list of all remote branches that are merged into <REMOTE>/master
output = check_output(
    [
        'git',
        'branch',
        '-r',
        '--merged',
        f'{REMOTE}/master',
        '--format',
        '%(refname:short)',
    ],
    cwd=REPO,
)
branches = output.decode().splitlines()

# Only delete branches from the given remote, and not HEAD or master:
branches_to_delete = []
for branch_path in branches:
    remote, branch = branch_path.split('/', 1)
    if remote == REMOTE and branch not in ('HEAD', 'master'):
        branches_to_delete.append(branch)

# Delete 'em:
for i, branch in enumerate(branches_to_delete):
    print(f"({i+1}/{len(branches_to_delete)}) deleting {REMOTE}/{branch}...")
    check_call(['git', 'push', REMOTE, '--delete', branch], cwd=REPO)