action-cached-lfs-checkout

"git lfs pull" step is silent?

Open jni opened this issue 3 years ago • 9 comments

Hello, and thank you for your incredibly useful action! 😊

As you know from Twitter, we are using your action and still bleeding about 20GB per day, and having trouble tracking down where it's happening. I was trying to understand how often we have cache misses and how much each cache miss costs, but even in a cache miss, it looks like git lfs pull doesn't have any output? (Example here.) Is that expected? I would have expected to see a line like:

Downloading LFS objects: 100% (1/1), 666 KB | 0 B/s

But I notice on my own terminal that lfs does not add a newline/carriage return after that line, and it gets overwritten. Is that what is going wrong with the action? Could it be fixed by appending && echo after the command?

Thank you! 🙏

jni, Feb 25 '22 01:02

Is that expected?

If all LFS is already there, that is expected I think. Seems to work correctly.

nschloe, Feb 25 '22 08:02

If all LFS is already there

But doesn't the cache miss imply that the lfs is not there? I can't find a single build where git lfs pull shows any output, including the first build ever using this action:

https://github.com/napari/napari/runs/5211449270?check_suite_focus=true#step:2:484

Do you have an example build using this action where git lfs pull actually displays output?

jni, Feb 25 '22 09:02

Perhaps something is amiss then. If you have a good idea for a fix, I'll be happy to review a PR.

nschloe, Feb 25 '22 10:02

Not sure if related, but I seem to be missing LFS files in the checked out repository. Looking through the logs I couldn't find any listings of downloaded files either: https://github.com/ModischFabrications/CutSolverFrontend/actions/runs/3340768601/jobs/5531240857

ModischFabrications, Oct 27 '22 22:10

I'm noticing the same behavior as @jni. My quotas keep rising even though the git lfs pull line shows no output. I can see that the caches are created as they should be, and the build proceeds fine as well. My CI pipeline is reasonably simple, so if somebody would be nice enough to have a look, here's the job in question.

@jni, I can see that you're no longer using this GitHub action. Do you mind sharing which workaround you ended up using?

Ryp, Dec 13 '23 13:12

We ended up nuking lfs because we concluded it's a scam by GitHub to get us to spend money. 😂

Less flippantly:

  • github penalises lfs bandwidth at absurd rates, while cloning from a "big" repo is free. So we moved our docs build to a separate repo and have had no issues, other than the complexity of the two-repo setup.
  • worse, there's no way to introspect where the bandwidth is going, and you have no control over who spends the bandwidth. If your project has a surge in popularity, suddenly your work can grind to a halt, and there's nothing to do about it but pay up.
  • in Python, pip-installing from a repo, as in pip install git+https://github.com/napari/napari, checks out the lfs files, which means that people installing our project from main suddenly incurred lfs bills, including in other people's CI. This is crazy.
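On the last point, git-lfs documents a `GIT_LFS_SKIP_SMUDGE` environment variable that makes a checkout leave LFS pointer files in place instead of downloading the objects. The sketch below is an illustration in Python under stated assumptions, not something proposed in this thread: the helper names are hypothetical, it only helps when the person installing opts in, and the installed package will then lack the real LFS-tracked files.

```python
import os
import subprocess
import sys


def env_without_lfs_download(base=None):
    """Copy an environment and ask git-lfs to skip downloading objects."""
    env = dict(os.environ if base is None else base)
    env["GIT_LFS_SKIP_SMUDGE"] = "1"  # LFS files stay as small pointer files
    return env


def pip_install_from_git(url):
    """Hypothetical helper: pip-install a git URL without pulling LFS objects."""
    cmd = [sys.executable, "-m", "pip", "install", f"git+{url}"]
    return subprocess.run(cmd, env=env_without_lfs_download(), check=True)


if __name__ == "__main__":
    # Shows the one variable that differs; nothing is installed here.
    print(env_without_lfs_download({})["GIT_LFS_SKIP_SMUDGE"])
```

Calling `pip_install_from_git("https://github.com/napari/napari")` would correspond to the pip command quoted above. It does not address installs run by third parties who do not set the variable.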

In the interim, I've discovered git-annex, which is similar to lfs but allows you to configure where your files are stored, and one of the options is Cloudflare R2 which has no egress fees. So I would recommend git-annex + cloudflare as the large-file solution for anyone starting on the problem today. From napari/napari#6049:

Anyway, as I investigated this I came across git-annex, which is like lfs only non-scammy. One of the features of git-annex is that you can set your remote storage backend among a huge array of options, including Cloudflare R2 (through rclone), which has no egress costs, and 10GB upload per month free. So I think it would definitely meet our storage requirements on the free tier.

So that is what I would recommend — I suspect changing the setup will be easier than figuring out where your bandwidth is going and plugging leaks, which happen all the time and are often beyond your control.

jni, Dec 14 '23 00:12

Thank you very much for the detailed write up! I ended up ripping LFS support out as well and put everything on a separate repo linked with a submodule.

See https://github.com/Ryp/reaper/commit/6dd51167c831f09729d42276f21ddba37e6c2181

Ryp, Dec 14 '23 12:12