
Closed hg branches should be closed in resulting git repo

Open murrayju opened this issue 7 years ago • 11 comments

The export process results in all named branches in mercurial being created as open branches in git, whether they are currently open or closed in mercurial.

Ideally the closedness of branches would be preserved, such that hg branches and git branch would output the same list.

murrayju avatar Dec 19 '16 20:12 murrayju

> Ideally the closedness of branches would be preserved, such that hg branches and git branch would output the same list.

Git does not have the concept of a closed branch, so it's not something that can be preserved while converting. Either remove the branch after conversion or remap the branch name to something like "attic/foo" using a branch mapping file.

frej avatar Dec 20 '16 17:12 frej

@frej Sure, but couldn't you at least add an option for this? We've used hg branches for feature branches, 99% of which get closed and merged back into the default branch. There could be an option to delete the branches in git if they are closed in hg.

I'm looking into converting hundreds of hg repos to git, each of which have hundreds or thousands of closed branches. The solution can't be any sort of manual process, that just isn't feasible.

murrayju avatar Dec 20 '16 19:12 murrayju

> I'm looking into converting hundreds of hg repos to git, each of which have hundreds or thousands of closed branches. The solution can't be any sort of manual process, that just isn't feasible.

I won't oppose a patch adding support for it, but for your needs (which sound like a one-time conversion), you could let hg-fast-export convert all the branches, then write a script which extracts the closed branches from hg and removes them on the git side.

That solution is not practical for incremental conversions, but if that's what you need, patches are welcome.

frej avatar Dec 22 '16 09:12 frej

I have the same issue. Not sure how @murrayju worked around it. Anyway, I think a good option would be to archive the closed hg branch as a tag in git and then delete the branch from git, as suggested here

gigi81 avatar Jun 22 '17 13:06 gigi81

what you can do is: run

hg log -r "closed()" -T "{branch}\n" > closed.txt

and then in the git repo

git branch -D `cat closed.txt`

Kogs avatar Dec 13 '17 13:12 Kogs

> what you can do is: run
>
> hg log -r "closed()" -T "{branch}\n" > closed.txt
>
> and then in the git repo
>
> git branch -D `cat closed.txt`

Be careful with this... if a branch was closed and some of its commits were never merged, those commits become unreachable (dangling) in git, which might not be what you want. You might want to delete only branches that are fully merged, and reopen unmerged branches during your export. It depends on your use case.

Be careful with branches whose names have been sanitized or mapped during conversion.

Also be careful if you have a branch that was closed and then re-opened: it will show up in closed.txt even though the branch is open in mercurial.
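
The three caveats above can be folded into one filter that runs before anything is deleted. What follows is a minimal Python sketch, not part of fast-export. The function name and its four inputs are hypothetical, and it assumes the lists were collected beforehand (for example closed names from the hg log query above, open names from hg branches, merged names from git branch --merged) and that hg names have already been translated to their git form.

```python
def safe_to_delete(closed_in_hg, open_in_hg, merged_in_git, existing_in_git):
    """Return the git branches that can be deleted without losing commits.

    A branch qualifies only if it:
      * has a closing commit in hg (closed_in_hg),
      * was not re-opened afterwards (absent from open_in_hg),
      * exists in git under that exact name (existing_in_git),
      * is fully merged in git (merged_in_git).
    All four arguments are iterables of branch names (hypothetical inputs).
    """
    open_names = set(open_in_hg)
    merged = set(merged_in_git)
    existing = set(existing_in_git)

    # Drop re-opened branches first, then check the git side.
    candidates = set(closed_in_hg) - open_names
    return sorted(
        name for name in candidates
        if name in existing and name in merged
    )
```

Branches that are filtered out are worth printing and reviewing by hand, since they are either unmerged, re-opened, or renamed during conversion.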

mryan43 avatar Sep 19 '18 16:09 mryan43

Just to reiterate @mryan43's comment: if you want to delete only branches that are fully merged, use git branch -d `cat closed.txt` instead.

pavlovmilen avatar Dec 27 '18 11:12 pavlovmilen

Small addition to @Kogs's suggestion:

hg log -r "closed()" -T "{branch}\n" | sed -r 's/ |\?/_/g' | sort | uniq > closed.txt
  • sed -r 's/ |\?/_/g' => Replace spaces and question marks with '_'. This sanitizes the Mercurial branch names to be compatible with Git. You can manually correct any issues that might still be left.
  • sort | uniq => Deduplicates the branches. If you don't do this, Git may give an error.
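
For anyone scripting this step outside the shell, the same two operations can be sketched in Python. This mirrors only the sed expression and the sort | uniq shown above; it is not fast-export's actual sanitization, and the function name is made up for illustration.

```python
import re


def sanitize_branch_names(names):
    """Replace spaces and question marks with '_' and deduplicate.

    Mirrors `sed -r 's/ |\\?/_/g' | sort | uniq` from the comment above.
    Sorting uses Python's default string order, which can differ from a
    locale-aware `sort`. Empty lines are dropped.
    """
    cleaned = {re.sub(r"[ ?]", "_", name) for name in names if name}
    return sorted(cleaned)
```

Other characters that git rejects in ref names are left untouched here, so the result may still need manual correction.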

marklagendijk avatar May 24 '19 11:05 marklagendijk

This might be a better command to find closed branches that have been merged

hg log -r 'head() and closed() and parents(merge())' -T '{branch}\n'

b-dean avatar Oct 24 '19 18:10 b-dean

We were solving the same issue and eventually ended up with this query to find closed hg branches:

# closed branch is a branch that contains a closed commit and doesn't have any open head
hg log -r "branch(closed()) - branch(head() and not closed())" -T "{branch}\n" | sort -u

Here are some corner-cases that led us to this solution:

$ hg log --graph -T "{rev} [{branch}] {desc}\n"
@  2 [default] C2
|
_  1 [default] C1 (close)
|
o  0 [default] C0

$ hg log -r "closed()" -T "{branch}\n" # wrong: reports 'default', which is still open (its head is rev 2)
default
$ hg log -r "branch(closed()) - branch(head() and not closed())" -T "{branch}\n" | sort -u
# no output, correct

$ hg log --graph -T "{rev} [{branch}] {desc}\n"
_  1 [default] C1 (close)
|
@  0 [default] C0

$ hg log -r 'head() and closed() and parents(merge())' -T '{branch}\n'
# no output (wrong, branch 'default' is closed in rev 1)
$ hg log -r "branch(closed()) - branch(head() and not closed())" -T "{branch}\n" | sort -u
default

Note, however, that we detect closed branches in order to move them to the hg_closed/ namespace. Had we deleted them, we would have lost history, which is something we prefer to avoid. If your objective is to eventually delete closed branches, the query hg log -r 'head() and closed() and parents(merge())' -T '{branch}\n' is the better fit, because it skips branches that were closed without being merged (see the second corner-case).
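
To make the set logic of the first query easier to follow, here is a small Python model of it. It is only an illustration under simplifying assumptions: each changeset is reduced to a hypothetical record with a branch name, a flag for "is a branch head", and a flag for "closes the branch". It does not call Mercurial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Changeset:
    rev: int
    branch: str
    is_head: bool        # stands in for the revset head()
    closes_branch: bool  # stands in for the revset closed()


def naive_closed_branches(changesets):
    """Model of -r "closed()": any branch that has a closing commit."""
    return sorted({c.branch for c in changesets if c.closes_branch})


def closed_branches(changesets):
    """Model of "branch(closed()) - branch(head() and not closed())"."""
    with_close = {c.branch for c in changesets if c.closes_branch}
    with_open_head = {
        c.branch for c in changesets if c.is_head and not c.closes_branch
    }
    return sorted(with_close - with_open_head)


# First corner-case: closed in rev 1, re-opened by rev 2.
reopened = [
    Changeset(0, "default", False, False),
    Changeset(1, "default", False, True),
    Changeset(2, "default", True, False),
]
# Second corner-case: closed in rev 1, never merged.
closed_unmerged = [
    Changeset(0, "default", False, False),
    Changeset(1, "default", True, True),
]
```

With these records, naive_closed_branches(reopened) returns ['default'] while closed_branches(reopened) returns an empty list, and closed_branches(closed_unmerged) returns ['default'], matching the two transcripts above.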

Finally, here is the snippet of code that moves closed branches to proper namespace:

echo "Moving closed branches to hg_closed/"
hg --repository "$HG_REPO" log -r "branch(closed()) - branch(head() and not closed())" -T "{branch}\n" | sort -u \
        | sed -e 's/ \|\.$/_/g' | tr '\n' '\0' | xargs -0 -n1 -I{} git -C "$GIT_REPO" branch -m {} hg_closed/{}

tr ... is needed to correctly rename branches containing " (quotation mark). sed ... is an attempt to replicate the mangling that fast-export applies to hg branch names so that they are compliant with git. The transformation sed does here is neither complete nor fully correct; refer to man git-check-ref-format for more details.
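
The quoting problem that tr and xargs -0 work around goes away if each rename is built as an argument list rather than a shell string. A minimal Python sketch of that idea follows; the function name and the default prefix argument are illustrative, and the branch names are assumed to be already in their git form.

```python
def rename_commands(git_repo, branches, prefix="hg_closed/"):
    """Build one `git branch -m <name> <prefix><name>` command per branch.

    Each command is a list of arguments, so names containing quotation
    marks or other shell metacharacters need no escaping when the list
    is later passed to subprocess.check_call().
    """
    return [
        ["git", "-C", git_repo, "branch", "-m", name, prefix + name]
        for name in branches
    ]
```

Printing the returned commands first gives a dry run; passing each list to subprocess.check_call then performs the renames.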

ondrej-stanek-ozobot avatar May 02 '20 18:05 ondrej-stanek-ozobot

FWIW here's what I did.

This deletes all fully-merged branches without any regard for their closed status in mercurial.

Note the hard-coded "master": if your long-lived branch has a different name, or you've got multiple long-lived branches, that'd need to be modified.

delete_branches.py

from subprocess import check_output, check_call
import sys

USAGE = """Error: invalid arguments.
    usage: python delete_branches.py local_repository [remote]
remote defaults to 'origin' if not given."""

try:
    REPO = sys.argv[1]
except IndexError:
    print(USAGE)
    sys.exit(1)
try:
    REMOTE = sys.argv[2]
except IndexError:
    REMOTE = 'origin'
if len(sys.argv) > 3:
    print(USAGE)
    sys.exit(1)


# Ensure we know about all remote branches:
check_call(['git', 'fetch', '--prune', REMOTE], cwd=REPO)

# Get a list of all remote branches that are merged into <REMOTE>/master
output = check_output(
    [
        'git',
        'branch',
        '-r',
        '--merged',
        f'{REMOTE}/master',
        '--format',
        '%(refname:short)',
    ],
    cwd=REPO,
)

branches = output.decode().splitlines()

# Only delete branches from the given remote, and not HEAD or master:
branches_to_delete = []
for branch_path in branches:
    remote, branch = branch_path.split('/', 1)
    if remote == REMOTE and branch not in ('HEAD', 'master'):
        branches_to_delete.append(branch)

# Delete 'em:
for i, branch in enumerate(branches_to_delete):
    print(f"({i+1}/{len(branches_to_delete)}) deleting {REMOTE}/{branch}...")
    check_call(['git', 'push', REMOTE, '--delete', branch], cwd=REPO)

chrisjbillington avatar Oct 24 '22 02:10 chrisjbillington