
Support for Git LFS in private repositories

Open FPtje opened this issue 4 years ago • 14 comments

Nixpkgs PRs https://github.com/NixOS/nixpkgs/pull/105998 and https://github.com/NixOS/nixpkgs/pull/113580 added Git LFS support to the Nixpkgs fetchgit function. The problem with fetchgit, however, is that it does not properly support private repositories. Nix's builtins.fetchGit does support private repositories, but it does not seem to support Git LFS.

Currently, when trying to builtins.fetchGit a repository with LFS, the following happens:

nix-repl> builtins.fetchGit {url = "[email protected]:my_company/private-lfs-repo.git"; rev = "some_rev";}
Downloading some/lfs/file (123 KB)
Error downloading object: some/lfs/file (a123456): Smudge error: Error downloading some/lfs/file (some_rev): batch request: missing protocol: ""

Errors logged to /home/my-user/nix/gitv2/xxx/lfs/logs/20210309T095658.11111111.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
error: program 'git' failed with exit code 128

Ideally, it should be possible to builtins.fetchGit the repo either with or without downloading the LFS files. In one use case, the LFS files are used for non-vital things, like tests or documentation. The nix derivations do not depend on those files. Not downloading the LFS files would save space. In another use case, the LFS files are needed to build the derivations, and should therefore be downloaded.

It is possible to export GIT_LFS_SKIP_SMUDGE=1 to accomplish the first use case (i.e. fetch a private LFS repository without actually downloading the LFS files), but it would be much nicer to have it as an option of the builtins.fetchGit function.
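A minimal sketch of that workaround, scoped to a single command rather than exported for the whole session. The helper name without_lfs_smudge is made up for illustration and is not part of Nix or git-lfs; GIT_LFS_SKIP_SMUDGE itself is the git-lfs environment variable mentioned above.

```shell
# Hypothetical helper (not part of Nix or git-lfs): run one command with
# GIT_LFS_SKIP_SMUDGE=1 so git-lfs leaves pointer files in place instead of
# downloading the LFS objects. The variable is set only for that command.
without_lfs_smudge() {
  env GIT_LFS_SKIP_SMUDGE=1 "$@"
}

# Example invocations (commented out; they need Nix and repository access):
# without_lfs_smudge nix repl
# without_lfs_smudge nix build
```

This only covers the case where the LFS files are not needed: the checkout will contain LFS pointer files, not the real contents.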

FPtje avatar Mar 09 '21 09:03 FPtje

#4635 has the potential to fix the first use case by default

> the LFS files are used for non-vital things, like tests or documentation.

Did you configure LFS globally in your git user config? I now realize git global user config may affect more places than what I've found with my testing.

roberth avatar Mar 13 '21 21:03 roberth

> Did you configure LFS globally in your git user config?

Yes, the following section is present in ~/.gitconfig:

[filter "lfs"]
        clean = git-lfs clean -- %f
        smudge = git-lfs smudge -- %f
        process = git-lfs filter-process
        required = true

FPtje avatar Mar 15 '21 08:03 FPtje

I marked this as stale due to inactivity. → More info

stale[bot] avatar Sep 14 '21 07:09 stale[bot]

For some projects I am working on, LFS is crucial. I hope this gets solved soon.

arximboldi avatar Oct 09 '21 11:10 arximboldi

I marked this as stale due to inactivity. → More info

stale[bot] avatar Apr 16 '22 01:04 stale[bot]

Still relevant.

arximboldi avatar Apr 19 '22 08:04 arximboldi

This error also pops up when using a git repository that uses LFS as a flake input, or seemingly even just by having a flake in a repository with LFS (cf. https://github.com/NixOS/nixpkgs/issues/137998). I didn't expect it, but export GIT_LFS_SKIP_SMUDGE=1 also seems to work around the problem with flakes, as long as you don't care about the LFS files.

reivilibre avatar Oct 06 '22 19:10 reivilibre

but .... what if you do care about the LFS files .... 🥲 🥲 🥲 🥲 🥲 🥲 🥲

silky avatar Jul 03 '23 12:07 silky

I think the plan for this would be

  • [ ] Merge the libfetchers changes from #6530
  • [ ] Implement LFS support in the fetcher. Could be smudge filter-based + whitelist of smudge filters, or something more hardcoded. (We don't want to allow general smudge support because that's impure, but could be a convenient implementation strategy - or not)
  • [ ] Add a parameter to the git fetcher. I think we'll eventually want three modes
    • Lazy LFS: fetch any LFS file when it is needed. This will tend to be sequential. Most versatile mode, and a sensible default.
    • Eager LFS: fetch all LFS files simultaneously. This will be faster when you know you need all LFS files.
    • No LFS: quick, even if we're copying the whole flake, which we may have to do until the libexpr part of #6530 is figured out. Alternatively, this mode could be a filter of which files to ignore / fetch eagerly / fetch lazily.
  • [ ] Implement the double fetching protocol where we fetch and load flake.nix once to figure out the fetch parameters, and then fetch and load again if needed

roberth avatar Jul 03 '23 16:07 roberth

GitLab now forces free users to use LFS in many cases, so I guess this will become a lot more relevant.

janvogt avatar Nov 01 '23 00:11 janvogt

AFAIU, this is not specific to private repos:

builtins.fetchGit {
  url = "https://huggingface.co/openlm-research/open_llama_3b";
  rev = "141067009124b9c0aea62c76b3eb952174864057";
};

...fails in the same way:

...
Downloading pytorch_model.bin (6.9 GB)
Error downloading object: pytorch_model.bin (9ffd42d): Smudge error: Error downloading pytorch_model.bin (9ffd42dc58c4f49154e98bc7796306fde40febef278e99636a240a731d626a4a): batch request: missing protocol: ""

Errors logged to '/home/.../.cache/nix/gitv3/14avjqj1kcsaj6025lqgbr5r4yz680zmj1xzppc13cgxx12i8dj3/lfs/logs/20231227T021723.995860432.log'.
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: pytorch_model.bin: smudge filter lfs failed
error:
       … while calling the 'fetchGit' builtin
...

SomeoneSerge avatar Dec 27 '23 02:12 SomeoneSerge

@SomeoneSerge for huggingface this worked for me:

fetchgit {  # from `pkgs`, not `builtins`, may not matter?
  url = "https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2";
  rev = "b70aa86578567ba3301b21c8a27bea4e8f6d6d61";
  hash = "sha256-IAe/tHFB7yqFRF5aRojkNCD8TbKj8XQMt6eEyPmr4HU=";
  fetchLFS = true;
}

newAM avatar Jan 02 '24 02:01 newAM

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/flake-lfs-input/40184/2

nixos-discourse avatar Feb 23 '24 13:02 nixos-discourse

Is there currently a workaround for fetching Nix flake inputs with LFS?

bratorange avatar Aug 20 '24 14:08 bratorange