[Question] Keeping tags & branches when using repo filter

Open emirot opened this issue 2 years ago • 2 comments

Can I keep all tags when filtering to a subdirectory?

git filter-repo -f --prune-empty always --subdirectory-filter myfolder --refs  $(git tag -l)

git version 2.39.2 (Apple Git-143)

emirot • Jun 13 '23 19:06

Sorry, I don't understand your question(s). Your summary talks about keeping tags & branches, but your actual text only talks about tags. Also, your example command would keep branches as they are while updating just tags to record the newly rewritten history, which seems more than a little weird. So, can you please explain:

  • Do you want some refs to refer to the old history and some to refer to the new history, meaning you'll have two incompatible histories within the same repository? (I generally think this is a bad idea, but sometimes people want it and have reasons.)
  • If yes, then after filtering history, which refs (branches or tags) do you want to refer to the new history, and which should refer to the old history?
  • For the branches or tags that are meant to refer only to the filtered history, what if no commit in that branch's or tag's history contains any files within the relevant subdirectory? (I'd presume you'd want such a branch or tag to be deleted, as it has no useful history anymore. But is that what you want?)

If you're asking simply out of lack of familiarity with filter-repo, then I'd say get rid of the --refs $(git tag -l) part of your command. filter-repo will rewrite all branches and tags, and keep all of them that had some commits touching the subdirectory of interest, only dropping branches or tags that pre-dated the introduction of that subdirectory. That's what most folks would want, but if you have a special use case, you'll need to describe it.
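
To preview which branches and tags would keep some history, one option is to check each ref for commits that touch the subdirectory before filtering. This is a minimal sketch using plain git only; the function name refs_touching is made up for illustration, and the check only approximates what filter-repo does when it prunes refs.

```shell
#!/bin/sh
# Hypothetical helper (not part of git or filter-repo): print every branch
# and tag that has at least one commit touching the given subdirectory.
# Refs that are NOT printed have no history under that path.
refs_touching() {
    subdir=$1
    git for-each-ref --format='%(refname)' refs/heads refs/tags |
    while read -r ref; do
        # rev-list -1 prints the newest commit reachable from $ref that
        # modified $subdir, or nothing if no such commit exists.
        if [ -n "$(git rev-list -1 "$ref" -- "$subdir")" ]; then
            printf '%s\n' "$ref"
        fi
    done
}

# Usage, from inside a clone:
#   refs_touching myfolder
```

Any ref missing from the output is a candidate for being dropped by the subdirectory filter.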

newren • Jun 15 '23 16:06

Hey @newren,

Thanks for your answer; it helps me a lot. I thought I needed to specify refs for some reason 🤦‍♂️

Just to let you know this is my use case:

I have a big git repo (many GB) that unfortunately I cannot change. What I want to do is extract one folder from it and create another repo out of that. Why? ArgoCD at the moment does not support shallow clones or sparse checkouts, so pulling such a large repo is very inefficient and leads to other issues.

It just worked with those steps:

git clone my-big-repo
git filter-repo -f --prune-empty always --subdirectory-filter folder-i-want
git remote add origin my-newrepo/test-a.git
git push --all origin

My new repo only contains that folder :) with all the tags and branches.
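
A note on the push step: as far as I know, git push --all pushes only branches (refs under refs/heads/), and it cannot be combined with --tags in a single command, so tags may need a separate push. A minimal sketch, where the function name push_branches_and_tags is made up for illustration and the remote is assumed to be the origin added in the steps listed earlier:

```shell
#!/bin/sh
# Hypothetical helper: push all branches, then all tags, to one remote.
# $1 = remote name (defaults to "origin")
push_branches_and_tags() {
    remote=${1:-origin}
    # --all covers refs/heads/* only, so tags get their own push.
    git push --all "$remote" || return 1
    git push --tags "$remote"
}

# Usage, from inside the filtered clone:
#   push_branches_and_tags origin
#   git ls-remote --heads --tags origin   # list what the remote now holds
```

Listing the remote refs afterwards is a quick way to confirm that both branches and tags arrived.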

Extra:

I wish I could take a shallow clone limited by date (git clone --shallow-since=<date>) so the initial pull takes less time, but then I hit another issue: I cannot push from a shallow clone. I've seen a couple of solutions (like unshallowing the repo); I'll need to look into that more.
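
For reference, a rough sketch of the shallow-clone route using plain git. The helper names clone_since and unshallow_if_needed are hypothetical, and the remote is assumed to be called origin. Note that unshallowing downloads the rest of the history, so it only postpones the large transfer rather than avoiding it.

```shell
#!/bin/sh
# Hypothetical helper: clone only the history newer than a given date.
# $1 = repository URL, $2 = date (e.g. 2023-01-01), $3 = target directory
clone_since() {
    git clone --shallow-since="$2" "$1" "$3"
}

# Hypothetical helper: fetch the missing history if the clone is shallow.
# $1 = path to the clone; assumes a remote named "origin".
unshallow_if_needed() {
    if [ "$(git -C "$1" rev-parse --is-shallow-repository)" = "true" ]; then
        git -C "$1" fetch --unshallow origin
    fi
}

# Usage:
#   clone_since my-big-repo 2023-01-01 recent-clone
#   unshallow_if_needed recent-clone
```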

emirot • Jun 15 '23 22:06

Thanks for your answer, in fact your answer helps me a lot...It just worked with those steps: ...

Glad the filtering worked for you. I'll go ahead and close this one out.

newren • Jul 02 '24 16:07