
Way to disable .gitattributes creation with git lfs conversion?

Open AndrewJDR opened this issue 9 years ago • 25 comments

When using the lfs conversion feature, is there a way to just disable the .gitattributes creation altogether? I'd like to handle the .gitattributes creation myself, because I want to place one .gitattributes in the root of my project rather than having individual .gitattributes files scattered all over which is what bfg does. Thanks!

AndrewJDR avatar Oct 14 '15 04:10 AndrewJDR

+1

atyndall avatar Oct 21 '15 07:10 atyndall

+1 on this. Actually, technically, if I could just specify my own .gitattributes file that would be great, especially since the way bfg does it doesn't always seem to work correctly. For one, I want it in the root of my project, like the above. Also when I do:

bfg --convert-to-git-lfs '*.{png,jpg}' --no-blob-protection lfstest

I want this:

*.png filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text

in the .gitattributes file, and not:

*.{png,jpg} filter=lfs diff=lfs merge=lfs -text

Which apparently doesn't work with the git lfs hooks.

If you use git lfs track '*.{png,jpg}' it correctly inserts the former, and not the latter.
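A minimal sketch of the expansion asked for here, written as a POSIX shell function. The name expand_lfs_pattern is hypothetical (it is not part of BFG or git-lfs), and the sketch only handles a single, non-nested brace group:

```shell
#!/bin/sh
# expand_lfs_pattern: hypothetical helper, not part of BFG or git-lfs.
# Turns one glob of the form prefix{a,b,c}suffix into one .gitattributes
# line per alternative; a pattern without braces yields a single line.
expand_lfs_pattern() {
  pattern=$1
  attrs='filter=lfs diff=lfs merge=lfs -text'
  open='{'
  close='}'
  case $pattern in
    *"$open"*"$close"*)
      prefix=${pattern%%"$open"*}   # text before the first "{"
      rest=${pattern#*"$open"}      # text after the first "{"
      list=${rest%%"$close"*}       # comma-separated alternatives
      suffix=${rest#*"$close"}      # text after the first "}"
      # One alternative per line, with surrounding spaces trimmed.
      printf '%s\n' "$list" | tr ',' '\n' | sed 's/^ *//; s/ *$//' |
        while IFS= read -r alt; do
          printf '%s%s%s %s\n' "$prefix" "$alt" "$suffix" "$attrs"
        done
      ;;
    *)
      printf '%s %s\n' "$pattern" "$attrs"
      ;;
  esac
}
```

Called as expand_lfs_pattern '*.{png,jpg}', it prints the two separate lines shown above, which can then be written to a root .gitattributes by hand.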

jivadevoe avatar Nov 05 '15 21:11 jivadevoe

I'm having a related problem: I'm not getting a .gitattributes file at all when I use --convert-to-git-lfs. Has a change been made since this issue was created that is undocumented?

lkilcher avatar Nov 18 '15 19:11 lkilcher

I'm having a related problem: I'm not getting a .gitattributes file at all when I use --convert-to-git-lfs. Has a change been made since this issue was created that is undocumented?

No, no changes that would affect the way .gitattributes are generated. Bear in mind that the BFG will generate .gitattributes files within the same folders as the affected files - so you won't necessarily see them at the level of the root project directory.

rtyley avatar Nov 18 '15 22:11 rtyley

+1

I'm having the same problem. The problem is that the git-lfs client doesn't understand *.{fits, fits.fz, sqlite3} filter=lfs diff=lfs merge=lfs -text; instead, it needs the .gitattributes to look like this:

*.fits filter=lfs diff=lfs merge=lfs -text
*.fits.fz filter=lfs diff=lfs merge=lfs -text
*.sqlite3 filter=lfs diff=lfs merge=lfs -text

The work around that I used and others have used is something like this:

git reset --hard
git filter-branch --tree-filter "echo '*.fits filter=lfs diff=lfs merge=lfs -text
*.fits.fz filter=lfs diff=lfs merge=lfs -text
*.sqlite3 filter=lfs diff=lfs merge=lfs -text' > .gitattributes" HEAD

Now, even if the current working directory's .gitattributes file is bad it'll still have the root .gitattributes.

All of that said, I agree the simplest fix is to just add an option to not add .gitattributes files. Another option is to actually fix the .gitattributes files so they work with the git-lfs client. Finally, this could be considered a bug in the git-lfs client? But I haven't seen an issue there.

Example repos: https://github.com/lsst/testdata_cfht https://github.com/lsst/afwdata

jmatt avatar Nov 19 '15 23:11 jmatt

I just wanted to echo that git-lfs does not understand the "*.{png,jpg}" style .gitattributes and seems to require separate lines like:

*.png filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text

Again if there can be an option not to add .gitattributes at all, that's the simplest thing. I don't mind handling the addition of the .gitattributes file myself in a separate rebase pass after the bfg run.

Ideally, I could specify my own .gitattributes file on the command line, and that would be placed at the root of the project at a commit just prior to the first commit containing a file matched by the .gitattributes file. Having the scattered .gitattributes is not desirable for me, though.

AndrewJDR avatar Nov 20 '15 01:11 AndrewJDR

Out of interest, could the people on this thread (@ajohnson23, etc), who are reporting git-lfs doesn't work for these globs, state their platform? (Windows, Mac, etc)

@technoweenie, I wonder if the glob parsing is platform-dependent - maybe on Windows "*.{png,jpg}" isn't a valid gitattributes pattern?

rtyley avatar Nov 20 '15 07:11 rtyley

For the record, I was on linux and the "*.{png,jpg}" glob format wasn't working.

AndrewJDR avatar Nov 20 '15 07:11 AndrewJDR

Mac OSX 10.11.1 git-lfs/1.0.2 (GitHub; darwin amd64; go 1.5.1) git version 2.6.3

jmatt avatar Nov 20 '15 08:11 jmatt

I just wanted to echo that git-lfs does not understand the "*.{png,jpg}" style .gitattributes and seems to require separate lines like

Git LFS simply writes the input of git lfs track {pattern} to .gitattributes. It's up to Git to interpret what files it runs through the filters.

From the gitattributes man page:

The rules how the pattern matches paths are the same as in .gitignore files; see gitignore(5)

The Pattern Format section doesn't mention *.{png,jpg} explicitly, but mentions:

Otherwise, Git treats the pattern as a shell glob suitable for consumption by fnmatch(3) with the FNM_PATHNAME flag: wildcards in the pattern will not match a / in the pathname. For example, "Documentation/*.html" matches "Documentation/git.html" but not "Documentation/ppc/ppc.html" or "tools/perf/Documentation/perf.html".

Nothing I'm seeing about fnmatch mentions anything like *.{png,jpg}. But since it's a C function, I'm also not seeing any user guides for it. The closest I've found is Python's fnmatch package.

technoweenie avatar Nov 20 '15 13:11 technoweenie

@technoweenie I didn't intend to suggest that this was git-lfs's problem, though the way I wrote it, it sure does seem that way. Apologies.

AndrewJDR avatar Dec 01 '15 12:12 AndrewJDR

No worries. It's a reasonable assumption. You're probably better off running bfg multiple times:

bfg --convert-to-git-lfs '*.png' --no-blob-protection lfstest
bfg --convert-to-git-lfs '*.jpg' --no-blob-protection lfstest

That will run through all of the local git objects multiple times, but it should create valid .gitattributes files.

technoweenie avatar Dec 01 '15 12:12 technoweenie

Yep, I ended up doing something similar (used the curly brace form on the commandline, then ran bfg to delete all the scattered .gitattributes files, then did a rebase after and did a split commit with my own .gitattributes in the root directory).

The root issue does remain, though: For most people, it's undesirable to have .gitattributes files scattered all over the repo. It's much easier to manage a single .gitattributes file in the root. So it'd be great if bfg supported that, in addition to perhaps converting the *.{png,jpg} form specified on the command line to the separate line form for the gitattributes file, which git seems to more reliably understand (or understand at all?). Or as another commenter mentioned, simply providing your own gitattributes file when running the bfg command would be a nice option too...

AndrewJDR avatar Dec 01 '15 17:12 AndrewJDR

I agree with @ajohnson23. When you move files with a certain extension to lfs for your repo, it is most likely that you want to do that for any future commits too. So it makes sense to have the filters in the .gitattributes file in the root of your repository. @rtyley Do you have a timeline when/if you are going to address this?

Rudy-Tmc avatar Jan 22 '16 07:01 Rudy-Tmc

Yeah I'd like to have the .gitattributes file formatted properly (git lfs seems to handle those '*.{foo,bar}' patterns properly, but git 2.7.0 does not, at least according to git check-attr) too. And an option for a single .gitattributes file in the project root.
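The git check-attr test mentioned here can be wrapped in a small helper to see what Git itself resolves for a path. This is only a sketch: the function name lfs_filter_for is hypothetical, and it has to be run inside the repository being checked.

```shell
#!/bin/sh
# lfs_filter_for: hypothetical helper. Prints the value of the "filter"
# attribute that Git resolves for a path: "lfs" when a .gitattributes
# pattern with filter=lfs matches, "unspecified" when nothing matches.
lfs_filter_for() {
  # git check-attr prints "<path>: filter: <value>"; keep only the value.
  git check-attr filter -- "$1" | sed 's/^.*: filter: //'
}
```

For example, lfs_filter_for assets/logo.png (a made-up path) reports whether that file would actually pass through the LFS filter, regardless of which tool wrote the .gitattributes file.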

Currently I'm working around this by first using bfg's conversion (we have quite a few extensions that we want handled by lfs, and a huge ass repo), then using bfg to remove all .gitattributes files and then using git filter-branch with an index filter that inserts the correct .gitattributes file in all commits.
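The last step of that workaround might look roughly like the sketch below. The function name and its argument are assumptions, not the exact command used here, and it adds the file to every commit rather than starting at the first commit that contains an LFS-tracked file.

```shell
#!/bin/sh
# add_root_gitattributes: hypothetical helper. Run it from inside the
# repository, after the scattered .gitattributes files have been removed.
add_root_gitattributes() {
  attrs_file=$1   # path to the hand-written .gitattributes
  # Store the file as a blob once so that every commit can reference it.
  blob=$(git hash-object -w "$attrs_file") || return 1
  # An index filter only edits the index, so no tree is checked out
  # per commit (unlike --tree-filter).
  git filter-branch --index-filter \
    "git update-index --add --cacheinfo 100644,$blob,.gitattributes" \
    -- --all
}
```

Because the filter never touches a working tree, this tends to be much faster than a tree filter on a large repository.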

rlaakkol avatar Feb 10 '16 07:02 rlaakkol

@rtyley Do you have a timeline when/if you are going to address this?

Supporting a user-specified single .gitattributes file in the root folder is reasonable, I'm just short on free time to implement it - I'm not paid to work on this project! If you'd like to support the development of the BFG, you can donate here: https://www.bountysource.com/teams/bfg-repo-cleaner

rtyley avatar Feb 11 '16 16:02 rtyley

See also #143 for a useful test on the limitations of glob expansion by Git itself in .gitignore and .gitattributes.

javabrett avatar Apr 15 '16 12:04 javabrett

A quick and dirty patch to disable the .gitattributes generation is this:

From 8c070f4a4ce4928537b51bd7c854c3293834a2f2 Mon Sep 17 00:00:00 2001
From: Matteo Bertini <[email protected]>
Date: Tue, 5 Apr 2016 23:12:23 +0200
Subject: [PATCH] Quick and dirty patch to avoid the .gitattributes generation

---
 .../src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala    | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bfg-library/src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala b/bfg-library/src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala
index 50818e2..d5b25f4 100644
--- a/bfg-library/src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala
+++ b/bfg-library/src/main/scala/com/madgag/git/bfg/cleaner/LfsBlobConverter.scala
@@ -56,7 +56,7 @@ class LfsBlobConverter(

   override def apply(dirtyBlobs: TreeBlobs) = {
     val cleanedBlobs = super.apply(dirtyBlobs)
-    if (cleanedBlobs == dirtyBlobs) cleanedBlobs else ensureGitAttributesSetFor(cleanedBlobs)
+    if (cleanedBlobs == dirtyBlobs) cleanedBlobs else cleanedBlobs
   }

   def ensureGitAttributesSetFor(cleanedBlobs: TreeBlobs): TreeBlobs = {
-- 
1.9.1

naufraghi avatar Apr 21 '16 09:04 naufraghi

I have a .gitattributes in every directory where there was a binary file. I corrected them to the right format manually (they are all the same), and git-lfs works fine. What I want to ask is: would it be safe to delete all of those .gitattributes files in the subdirectories if I leave an identical one at the top level? That is, does it behave like a .gitignore file, with no need to copy it up and down the tree?

pauljohn32 avatar Nov 03 '16 03:11 pauljohn32

@pauljohn32 You can, but you need to guarantee that top level file exists since the first commit, otherwise lfs won’t work when you go back to any point in the history. (Basically that’s how git-lfs-migrate works.)

jjgod avatar Nov 03 '16 04:11 jjgod

@jjgod, not true. When you go back in history you'll have no top-level .gitattributes, but there will be a specific one left by the utility, provided one doesn't rewrite history and remove those files.

Vertigo093i avatar Nov 03 '16 12:11 Vertigo093i

That's the second time I'm reading that BFG's handling of .gitattributes breaks going back in history after an LFS migration. The first one is in the LFS migration instructions themselves, specifically

.gitattributes are not generated through out history... So the best bet is to start tracking on the newest version only. This WILL go bad if you go to a previous version of history, branch off, commit, as LFS track files will not know to track at that point.

That doesn't add up. Is BFG adding .gitattributes files when the removed file first appears, or at the end of the history? In the former case, why would history break?

emersonf avatar Feb 21 '17 07:02 emersonf

I am running into this issue as well. I just thought I'd mention I appreciate the quick and dirty hack but you can also accomplish the same thing by simply running the BFG --convert-to-git-lfs cmd as normal and then afterwards run --delete-files ".gitattributes" (this assumes you don't have any pre-existing .gitattributes files you care about). You can then run a git filter-branch command to copy in a single .gitattributes file to the root throughout history. This is painfully slow but it does work.
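The three steps described above can be sketched as a small script. Several things here are assumptions rather than details from this comment: the function names, the DRY_RUN switch, a bfg launcher on the PATH, and the use of --no-blob-protection (carried over from the commands earlier in the thread).

```shell
#!/bin/sh
# run: with DRY_RUN set, print the command instead of executing it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# convert_with_root_gitattributes: hypothetical wrapper for the workflow.
convert_with_root_gitattributes() {
  repo=$1      # repository to rewrite
  attrs=$2     # ABSOLUTE path to the hand-written .gitattributes
  pattern=$3   # glob for the files to move into LFS, e.g. '*.png'

  # 1. Convert matching blobs to LFS pointers (BFG scatters .gitattributes).
  run bfg --convert-to-git-lfs "$pattern" --no-blob-protection "$repo" || return 1
  # 2. Delete every .gitattributes file, including any pre-existing ones.
  run bfg --delete-files .gitattributes --no-blob-protection "$repo" || return 1
  # 3. Copy the hand-written file into the root of every commit (slow).
  run git -C "$repo" filter-branch \
    --tree-filter "cp '$attrs' .gitattributes" -- --all
}
```

Running it first with DRY_RUN=1 prints the three commands without touching the repository, which is a cheap way to review them before rewriting history.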

traskrogers avatar Feb 01 '18 17:02 traskrogers

For the sake of people finding this faster: git lfs migrate import is the way to go - no need to fight with BFG. See the docs. It will rewrite history and handle .gitattributes correctly. The documentation also reminds you to push all objects from the history with git lfs push --all <remote> branch.
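As a rough sketch, the helper below prints the two commands for review instead of running them. The function name and both arguments are hypothetical; --include and --everything are options of git lfs migrate import, but check them against the version you have installed.

```shell
#!/bin/sh
# lfs_migrate_plan: hypothetical helper that prints, but does not run,
# the migration commands for a comma-separated list of globs.
lfs_migrate_plan() {
  include=$1   # e.g. '*.png,*.jpg'
  remote=$2    # e.g. origin
  # Rewrite all local refs, converting matching files to LFS pointers.
  printf 'git lfs migrate import --everything --include="%s"\n' "$include"
  # Upload every LFS object referenced anywhere in the history.
  printf 'git lfs push --all %s\n' "$remote"
}
```

For instance, lfs_migrate_plan '*.png,*.jpg' origin prints both commands so they can be inspected and then run by hand inside the repository.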

Jmennius avatar May 12 '22 17:05 Jmennius

Hi @Jmennius, thanks for mentioning git lfs migrate import; I see it has much cleaner handling of the .gitattributes file. I have noticed a difference between BFG and lfs migrate, though. When I ran both on the same repository (on different local copies), the objects in the BFG copy took 1.6G, whereas the copy cleaned using lfs migrate took 2.6G. For both I used the same attribute patterns, and after the cleanup I also checked lfs migrate info on both copies: they look exactly the same (no LFS files left in the git history).

Has anyone observed this? Any idea what BFG does differently to reduce the object size?

rgaduput avatar Jun 29 '23 06:06 rgaduput