bfg-repo-cleaner
Way to disable .gitattributes creation with git lfs conversion?
When using the lfs conversion feature, is there a way to just disable the .gitattributes creation altogether? I'd like to handle the .gitattributes creation myself, because I want to place one .gitattributes in the root of my project rather than having individual .gitattributes files scattered all over which is what bfg does. Thanks!
+1
+1 on this. Actually, technically, if I could just specify my own .gitattributes file that would be great, especially since the way bfg does it doesn't always seem to work correctly. For one, I want it in the root of my project, like the above. Also, when I do:
bfg --convert-to-git-lfs '*.{png,jpg}' --no-blob-protection lfstest
I want this:
*.png filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
in the .gitattributes file, and not:
*.{png,jpg} filter=lfs diff=lfs merge=lfs -text
Which apparently doesn't work with the git lfs hooks.
If you use git lfs track '*.{png,jpg}', it correctly inserts the former, and not the latter.
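A minimal sketch of the conversion being asked for here: turning a brace-style pattern into one .gitattributes line per extension. The function names `expand_braces` and `gitattributes_lines` are hypothetical helpers, not part of BFG or git-lfs. The sketch assumes flat (non-nested) brace groups and that whitespace after a comma should be dropped.

```python
import re

# Attributes that git lfs track writes for each tracked pattern.
LFS_ATTRS = "filter=lfs diff=lfs merge=lfs -text"


def expand_braces(pattern):
    """Expand flat {a,b,c} groups into a list of plain patterns.

    Nested brace groups are not handled; this is an illustrative helper only.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        # Assumption: spaces around alternatives are not meaningful.
        expanded.extend(expand_braces(head + alternative.strip() + tail))
    return expanded


def gitattributes_lines(pattern):
    """Return one .gitattributes line per expanded pattern."""
    return [f"{p} {LFS_ATTRS}" for p in expand_braces(pattern)]


if __name__ == "__main__":
    for line in gitattributes_lines("*.{png,jpg}"):
        print(line)
```

The output could be written to a root .gitattributes by hand after the BFG run, which is the manual step several people in this thread describe.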
I'm having a related problem: I'm not getting a .gitattributes file at all when I use --convert-to-git-lfs. Has a change been made since this issue was created that is undocumented?
I'm having a related problem: I'm not getting a .gitattributes file at all when I use --convert-to-git-lfs. Has a change been made since this issue was created that is undocumented?
No, no changes that would affect the way .gitattributes files are generated. Bear in mind that the BFG will generate .gitattributes files within the same folders as the affected files - so you won't necessarily see them at the level of the root project directory.
+1
I'm having the same problem. The problem is that the git-lfs client doesn't understand *.{fits, fits.fz, sqlite3} filter=lfs diff=lfs merge=lfs -text; instead, it needs the .gitattributes to look like this:
*.fits filter=lfs diff=lfs merge=lfs -text
*.fits.fz filter=lfs diff=lfs merge=lfs -text
*.sqlite3 filter=lfs diff=lfs merge=lfs -text
The workaround that I and others have used is something like this:
git reset --hard
git filter-branch --tree-filter "echo '*.fits filter=lfs diff=lfs merge=lfs -text
*.fits.fz filter=lfs diff=lfs merge=lfs -text
*.sqlite3 filter=lfs diff=lfs merge=lfs -text' > .gitattributes" HEAD
Now, even if the current working directory's .gitattributes file is bad, it'll still have the root .gitattributes.
All of that said, I agree the simplest fix is to just add an option to not add .gitattributes files. Another option is to actually fix the .gitattributes files so they work with the git-lfs client. Finally, could this be considered a bug in the git-lfs client? I haven't seen an issue there.
Example repos: https://github.com/lsst/testdata_cfht https://github.com/lsst/afwdata
I just wanted to echo that git-lfs does not understand the "*.{png,jpg}" style .gitattributes and seems to require separate lines like:
*.png filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
Again if there can be an option not to add .gitattributes at all, that's the simplest thing. I don't mind handling the addition of the .gitattributes file myself in a separate rebase pass after the bfg run.
Ideally, I could specify my own .gitattributes file on the command line, and that would be placed at the root of the project at a commit just prior to the first commit containing a file matched by the .gitattributes file. Having the scattered .gitattributes is not desirable for me, though.
Out of interest, could the people on this thread (@ajohnson23, etc), who are reporting git-lfs doesn't work for these globs, state their platform? (Windows, Mac, etc)
@technoweenie, I wonder if the glob parsing is platform-dependent - maybe on Windows "*.{png,jpg}" isn't a valid gitattributes pattern?
For the record, I was on Linux and the "*.{png,jpg}" glob format wasn't working.
Mac OSX 10.11.1 git-lfs/1.0.2 (GitHub; darwin amd64; go 1.5.1) git version 2.6.3
I just wanted to echo that git-lfs does not understand the "*.{png,jpg}" style .gitattributes and seems to require separate lines like
Git LFS simply writes the input of git lfs track {pattern} to .gitattributes. It's up to Git to interpret what files it runs through the filters.
From the gitattributes man page:
The rules how the pattern matches paths are the same as in .gitignore files; see gitignore(5)
The Pattern Format section doesn't mention *.{png,jpg} explicitly, but mentions:
Otherwise, Git treats the pattern as a shell glob suitable for consumption by fnmatch(3) with the FNM_PATHNAME flag: wildcards in the pattern will not match a / in the pathname. For example, "Documentation/*.html" matches "Documentation/git.html" but not "Documentation/ppc/ppc.html" or "tools/perf/Documentation/perf.html".
Nothing I'm seeing about fnmatch mentions anything like *.{png,jpg}. But with it being a C function, I'm also not seeing any user guides around it. The closest I've found is Python's fnmatch package.
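As a rough illustration of the point about fnmatch-style globs, here is how Python's fnmatch module (mentioned above) treats the two pattern styles. This is Python's implementation, not Git's own matcher, so it only shows by analogy that braces carry no special meaning in this family of glob syntax. The helper `matches_any` and the sample filenames are made up for the example.

```python
from fnmatch import fnmatchcase


def matches_any(filename, patterns):
    """Return True if the filename matches at least one glob pattern."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


brace_form = ["*.{png,jpg}"]
split_form = ["*.png", "*.jpg"]

# Braces are treated as literal characters, so a real .png file is not matched.
print(matches_any("logo.png", brace_form))        # False
print(matches_any("logo.png", split_form))        # True
# Only a file whose name literally ends in ".{png,jpg}" matches the brace form.
print(matches_any("logo.{png,jpg}", brace_form))  # True
```

For checking what Git itself decides for a given path, git check-attr (mentioned later in this thread) is the authoritative tool.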
@technoweenie I didn't intend to suggest that this was git-lfs's problem, though the way I wrote it, it sure does seem that way. Apologies.
No worries. It's a reasonable assumption. You're probably better off running bfg multiple times:
bfg --convert-to-git-lfs '*.png' --no-blob-protection lfstest
bfg --convert-to-git-lfs '*.jpg' --no-blob-protection lfstest
That will run through all of the local git objects multiple times, but it should create valid .gitattributes files.
Yep, I ended up doing something similar (used the curly brace form on the command line, then ran bfg to delete all the scattered .gitattributes files, then did a rebase afterwards with a split commit that added my own .gitattributes in the root directory).
The root issue does remain, though: For most people, it's undesirable to have .gitattributes files scattered all over the repo. It's much easier to manage a single .gitattributes file in the root. So it'd be great if bfg supported that, in addition to perhaps converting the *.{png,jpg} form specified on the command line to the separate line form for the gitattributes file, which git seems to more reliably understand (or understand at all?). Or as another commenter mentioned, simply providing your own gitattributes file when running the bfg command would be a nice option too...
I agree with @ajohnson23. When you move files with a certain extension to lfs for your repo, you most likely want to do that for any future commits too. So it makes sense to have the filters in the .gitattributes file in the root of your repository. @rtyley Do you have a timeline when/if you are going to address this?
Yeah I'd like to have the .gitattributes file formatted properly (git lfs seems to handle those '*.{foo,bar}' patterns properly, but git 2.7.0 does not, at least according to git check-attr) too. And an option for a single .gitattributes file in the project root.
Currently I'm working around this by first using bfg's conversion (we have quite a few extensions that we want handled by lfs, and a huge ass repo), then using bfg to remove all .gitattributes files and then using git filter-branch with an index filter that inserts the correct .gitattributes file in all commits.
@rtyley Do you have a timeline when/if you are going to address this?
Supporting a user-specified single .gitattributes file in the root folder is reasonable, I'm just short on free time to implement it - I'm not paid to work on this project! If you'd like to support the development of the BFG, you can donate here: https://www.bountysource.com/teams/bfg-repo-cleaner
See also #143 for a useful test on the limitations of glob expansion by Git itself in .gitignore and .gitattributes.
A quick and dirty patch to disable the .gitattributes generation is this:
From 8c070f4a4ce4928537b51bd7c854c3293834a2f2 Mon Sep 17 00:00:00 2001
From: Matteo Bertini <[email protected]>
Date: Tue, 5 Apr 2016 23:12:23 +0200
Subject: [PATCH] Quick and dirty patch to avoid the .gitattributes generation
---
 .../src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bfg-library/src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala b/bfg-library/src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala
index 50818e2..d5b25f4 100644
--- a/bfg-library/src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala
+++ b/bfg-library/src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala
@@ -56,7 +56,7 @@ class LfsBlobConverter(

   override def apply(dirtyBlobs: TreeBlobs) = {
     val cleanedBlobs = super.apply(dirtyBlobs)
-    if (cleanedBlobs == dirtyBlobs) cleanedBlobs else ensureGitAttributesSetFor(cleanedBlobs)
+    if (cleanedBlobs == dirtyBlobs) cleanedBlobs else cleanedBlobs
   }

   def ensureGitAttributesSetFor(cleanedBlobs: TreeBlobs): TreeBlobs = {
--
1.9.1
I have .gitattributes in every directory where there was a binary file. I corrected them to the right format manually (they are all the same). git-lfs works fine. I want to ask this: would it be safe to delete all of those .gitattributes files in the subdirectories if I leave the identical one in the top level? I mean, is this the same as a .gitignore file, with no need to copy it up and down?
@pauljohn32 You can, but you need to guarantee that top level file exists since the first commit, otherwise lfs won’t work when you go back to any point in the history. (Basically that’s how git-lfs-migrate works.)
@jjgod, not true. When you go back in history you'll have no top-level .gitattributes, but there will be a specific one left by the utility. That is, as long as one doesn't rewrite history and remove those files.
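For the question of which subdirectory .gitattributes files duplicate the top-level one, a small check along these lines may help. It is only a sketch: `redundant_gitattributes` is a hypothetical helper, it inspects the current working tree only (it says nothing about earlier commits, which is the concern raised above), and it assumes every rule is a slash-free pattern such as `*.png`, which matches at any depth below the file that declares it.

```python
from pathlib import Path


def _rules(path):
    """Return the set of non-empty, non-comment lines in a .gitattributes file."""
    return {
        line.strip()
        for line in path.read_text().splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }


def redundant_gitattributes(repo_root):
    """List nested .gitattributes files whose rules all appear in the root file.

    Assumption: rules are slash-free patterns, so moving them to the root
    does not change which files under that subdirectory they match.
    """
    root = Path(repo_root)
    root_file = root / ".gitattributes"
    root_rules = _rules(root_file) if root_file.exists() else set()
    redundant = []
    for path in sorted(root.rglob(".gitattributes")):
        if path == root_file:
            continue
        if ".git" in path.relative_to(root).parts:
            continue  # skip anything inside the repository metadata directory
        if _rules(path) <= root_rules:
            redundant.append(path)
    return redundant
```

Calling `redundant_gitattributes(".")` from the top of a checkout would list the nested files that add nothing beyond the root file in that one checkout.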
That's the second time I'm reading that BFG's handling of .gitattributes breaks going back in history after an LFS migration. The first one is in the LFS migration instructions themselves, specifically
.gitattributes are not generated through out history... So the best bet is to start tracking on the newest version only. This WILL go bad if you go to a previous version of history, branch off, commit, as LFS track files will not know to track at that point.
That doesn't add up. Is BFG adding .gitattributes files when the removed file first appears, or at the end of the history? In the former case, why would history break?
I am running into this issue as well. I just thought I'd mention that I appreciate the quick and dirty hack, but you can also accomplish the same thing by simply running the BFG --convert-to-git-lfs command as normal and then running --delete-files ".gitattributes" afterwards (this assumes you don't have any pre-existing .gitattributes files you care about). You can then run a git filter-branch command to copy a single .gitattributes file into the root throughout history. This is painfully slow, but it does work.
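One possible shape for that last step, sketched as a small Python wrapper. It uses an index filter rather than a tree filter, which is an assumption on my part and not necessarily what the commenter ran. The helper names are hypothetical; the git commands involved (git hash-object -w, git update-index --add --cacheinfo, git filter-branch --index-filter) are standard. Note that this adds the file to every commit, including commits that predate any LFS-tracked file.

```python
import subprocess


def write_blob(path):
    """Store a file as a blob in the current repository and return its object id."""
    result = subprocess.run(
        ["git", "hash-object", "-w", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_filter_branch_command(blob_sha):
    """Build a filter-branch invocation that places the blob at ./.gitattributes.

    The index filter edits each commit's index directly, so no files need to
    be checked out for each commit.
    """
    index_filter = (
        f"git update-index --add --cacheinfo 100644,{blob_sha},.gitattributes"
    )
    return ["git", "filter-branch", "--index-filter", index_filter, "--", "--all"]


def add_root_gitattributes(path):
    """Rewrite all refs so that every commit carries the given .gitattributes."""
    blob_sha = write_blob(path)
    subprocess.run(build_filter_branch_command(blob_sha), check=True)
```

As with any history rewrite, this is best tried on a fresh clone first.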
For the sake of people finding this faster: git lfs migrate import is the way to go - no need to fight with BFG.
See docs. It will rewrite history and handle .gitattributes correctly.
Documentation also reminds you to push all objects from the history with git lfs push --all <remote> branch.
Hi @Jmennius,
thanks for mentioning git lfs migrate import. I see it has much cleaner handling of the .gitattributes file.
But I have noticed a difference between BFG and lfs migrate. When I ran both on the same repository (on different local copies), I found that the objects in the BFG copy take up 1.6G, whereas those in the copy cleaned using lfs migrate take up 2.6G.
For both I used the same attribute patterns, and after the cleanup I also checked lfs migrate info on both copies; they look exactly the same (no LFS files left in the git history).
Has anyone observed this? Any idea what BFG does differently to reduce the object size?