
Too long cache filenames

Open potiuk opened this issue 11 months ago • 17 comments

Sorry - pressed enter too fast.

Some combinations of platform / packages might generate cache file names that are too long. Tested with 0.1.18 on my Linux Mint machine (reproducible with airflow main).

[jarek:~/code/airflow]└2 [apache-airflow] main(+4/-4)+ 8s ± uv pip install --upgrade --editable '.[devel-ci]'
   Built file:///home/jarek/code/airflow
Built 1 editable in 5.20s
error: Failed to download: sqlalchemy==1.4.52
  Caused by: Failed to write to the client cache
  Caused by: Failed to persist temporary file to /home/jarek/.cache/uv/wheels-v0/pypi/sqlalchemy/sqlalchemy-1.4.52-cp38-cp38-manylinux1_x86_64.manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_5_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.msgpack:  File name too long (os error 36)

potiuk avatar Mar 13 '24 13:03 potiuk

Oh wow. But that actually is the filename as on PyPI, right?

charliermarsh avatar Mar 13 '24 14:03 charliermarsh

> Oh wow. But that actually is the filename as on PyPI, right?

Certainly: https://pypi.org/project/SQLAlchemy/1.4.52/#files

I think it's best to trim and add a hash if the file name exceeds a certain length. That would probably be the best solution. It would be a pity to lose the name from the cache and replace it with a meaningless hash, though many similar solutions do it exactly this way.

potiuk avatar Mar 14 '24 01:03 potiuk

Yeah, we just need some kind of deterministic encoding since it's content-addressed. (I.e., when we have that wheel name from PyPI, we need to know where to look in the cache.) Definitely doable!
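
The trim-plus-hash idea discussed above could look roughly like the sketch below. It is only an illustration, not uv's implementation: `cache_file_name`, `MAX_NAME_BYTES`, and the 16-character digest length are hypothetical choices, and the 143-byte limit is the eCryptfs figure mentioned in this thread. Short names are left untouched; long names keep a readable prefix and get a digest of the full wheel name, so the same wheel name always maps to the same cache file.

```python
import hashlib

# Assumption: 143 bytes is the per-name limit reported for eCryptfs in this thread.
MAX_NAME_BYTES = 143


def cache_file_name(wheel_name: str, suffix: str = ".msgpack",
                    limit: int = MAX_NAME_BYTES) -> str:
    """Return a deterministic cache file name of at most `limit` bytes."""
    full = wheel_name + suffix
    if len(full.encode("utf-8")) <= limit:
        return full  # short names stay fully readable

    # Hash the complete wheel name so distinct wheels sharing a prefix differ.
    digest = hashlib.sha256(wheel_name.encode("utf-8")).hexdigest()[:16]

    # Reserve room for "-", the digest, and the suffix.
    keep = limit - len(digest) - len(suffix.encode("utf-8")) - 1
    if keep < 0:
        raise ValueError("limit is too small for the digest and suffix")

    # Truncate on bytes; drop any multi-byte character cut in half.
    prefix = wheel_name.encode("utf-8")[:keep].decode("utf-8", errors="ignore")
    return f"{prefix}-{digest}{suffix}"
```

Because the result depends only on the wheel name, a later lookup can recompute the same file name without scanning the cache directory.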

charliermarsh avatar Mar 14 '24 01:03 charliermarsh

It seems like 255 is the common limit on Linux, but that it can be as low as 144 if you have encryption enabled: https://stackoverflow.com/questions/6114301/git-checkout-index-unable-to-create-file-file-name-too-long/6114588#6114588

charliermarsh avatar Mar 14 '24 20:03 charliermarsh

I'm curious what pip does if you try to download that wheel :)

charliermarsh avatar Mar 14 '24 21:03 charliermarsh

Yep. My home directory where I hit it (on my Linux workstation with Debian Mint) is indeed encrypted. And the reason the limit is (apparently) 143 is explained here: https://wiki.archlinux.org/title/ECryptfs#Deficiencies
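
For reference, a tool can ask the operating system for the per-directory name limit instead of assuming 255. The sketch below uses Python's `os.pathconf` with `PC_NAME_MAX`; the helper name `max_name_bytes` and the fallback of 255 are assumptions made for illustration, and whether a given eCryptfs mount actually reports the reduced limit through this call is not confirmed in this thread.

```python
import os


def max_name_bytes(directory: str, default: int = 255) -> int:
    """Best-effort lookup of the longest file name `directory` accepts."""
    try:
        value = os.pathconf(directory, "PC_NAME_MAX")
    except (AttributeError, OSError, ValueError):
        # os.pathconf is Unix-only, and the path may be missing or the
        # filesystem may not support the query.
        return default
    # A non-positive value means "no limit reported"; fall back to the default.
    return value if value > 0 else default
```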

Newer versions of pip (v23.3+) store hashed names in the cache, avoiding the problem altogether:

[Image: Screenshot 2024-03-14 at 22 17 27]

potiuk avatar Mar 14 '24 21:03 potiuk

Yeah, there's no inherent reason that we need to use full-fidelity filenames here. It's just helpful for debugging.

charliermarsh avatar Mar 14 '24 22:03 charliermarsh

Suggestion: only use a hash if you've hit the too-long-name issue.

potiuk avatar Mar 14 '24 22:03 potiuk

@charliermarsh have we addressed this with recent changes to the cache?

zanieb avatar Apr 12 '24 19:04 zanieb

No, hasn’t changed yet

charliermarsh avatar Apr 12 '24 20:04 charliermarsh

Not trivial right now because we read the tags from the wheel name in the built wheel cache

charliermarsh avatar Apr 12 '24 21:04 charliermarsh

This can be a real pain. On my machine the uv default cache dir is:

C:\Users\DMpassy\AppData\Local\uv\cache\

That is already 40 chars. If we take, for instance, DjangoRestFramework, a super popular Django lib:

C:\Users\DMpassy\AppData\Local\uv\cache\wheels-v0\index\d92ad663158e5f0f\djangorestframework\djangorestframework-3.10.3-py3-none-any\rest_framework\templates\rest_framework\vertical\checkbox_multiple.html

That is 206 chars, not that far off from the Windows 260-character upper limit.

I'm currently struggling to use uv on CI with Jenkins due to that:

C:\Users\jenkins\AppData\Local\uv\cache\.tmp60o0NX\.venv\lib\site-packages\setuptools\command\build_py.py:207: _Warning: Package 'django_saml2_auth.templates.django_saml2_auth' is absent from the `packages` configuration.
!!
# bunch of other stuff

error: could not create 'build\bdist.win-amd64\wheel\django_saml2_auth\templates\django_saml2_auth': The filename or extension is too long

I'm not sure how it got that long; I imagine it has to do with uv caching plus the setuptools build.

inoa-dmpassy avatar Apr 17 '24 19:04 inoa-dmpassy

For Windows, there might be another solution. I just came across https://github.com/BurntSushi/ripgrep/blob/master/pkg/windows/README.md, which talks about using a manifest to declare that the application works with paths longer than 260 characters (I suppose the OS libraries change their behaviour then).

egerlach avatar Jun 05 '24 16:06 egerlach

> For Windows, there might be another solution. I just came across https://github.com/BurntSushi/ripgrep/blob/master/pkg/windows/README.md, which talks about using a manifest to declare that the application works with paths longer than 260 characters (I suppose the OS libraries change their behaviour then).

Yeah. Note though that, AIUI, the manifest is only 1 of 2 things that are required. The user also needs to apply a registry edit: https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry#enable-long-paths-in-windows-10-version-1607-and-later

BurntSushi avatar Jun 05 '24 16:06 BurntSushi

@potiuk Did you manage to use sqlalchemy with uv? I just tried uv with our codebase and it failed on the same package :-/

Edit: captain obvious: uv pip compile ... --no-cache works around this issue

adaamz avatar Aug 20 '24 20:08 adaamz

Hello @charliermarsh @zanieb - would it be possible to prioritise that one a bit now?

To add a bit of context: we are now discussing the Airflow 3 approach to managing our complex dependencies and providers, and one of the options considered is to use the uv workspace feature for that. As discussed before, I looked at it recently and I think it has all we need now, and I will make a POC to show other PMC members and the development community how uv workspaces might be useful to simplify and standardise our setup for providers.

Now - this one makes my life really hard... Either I have to remove the cache or move it out of my home directory to another filesystem (but then hard-links don't work), etc. Also, for me this is quite a blocker to proposing it: with our number of contributors (we just passed 3000!), surely there will be some who, like me, have their home dir encrypted, especially since it's an option you can select in multiple distros. (For me, that came out of the box with my Linux Mint installation, and as a security freak I obviously enabled it.)

Maybe a good solution will be to check if the filesystem is encrypted or simply react to "too long filename" and use a hash instead - there is no need to change it in bulk - simply handling the exception when it occurs should be good-enough and have very little effect on debuggability.
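
The "react to the error" approach suggested above can be sketched as follows. This is a minimal illustration under stated assumptions, not how uv works: `hashed_name`, `write_cache_entry`, and `read_cache_entry` are hypothetical helpers, and the sketch opens files directly rather than persisting a temporary file as uv does. It relies on POSIX systems reporting `ENAMETOOLONG` (os error 36 on Linux, as in the report at the top of this issue). The readable name is tried first, and the hash is used only when the filesystem rejects it.

```python
import errno
import hashlib
import os
from typing import Optional


def hashed_name(name: str) -> str:
    """Deterministic short stand-in for a name the filesystem rejected."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()[:32]


def write_cache_entry(directory: str, name: str, data: bytes) -> str:
    """Write under the readable name; fall back to a hash if it is too long."""
    path = os.path.join(directory, name)
    try:
        with open(path, "wb") as f:
            f.write(data)
    except OSError as err:
        if err.errno != errno.ENAMETOOLONG:
            raise  # unrelated failure: do not mask it
        path = os.path.join(directory, hashed_name(name))
        with open(path, "wb") as f:
            f.write(data)
    return path


def read_cache_entry(directory: str, name: str) -> Optional[bytes]:
    """Try the readable name first, then the hashed fallback."""
    for candidate in (name, hashed_name(name)):
        try:
            with open(os.path.join(directory, candidate), "rb") as f:
                return f.read()
        except FileNotFoundError:
            continue
        except OSError as err:
            if err.errno != errno.ENAMETOOLONG:
                raise
    return None
```

Existing cache entries keep their readable names, so nothing has to be migrated in bulk; only the entries that would have failed end up under a hash.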

Looking forward to having this one solved. (I'd do it myself, but I have far too many things on my plate right now to learn Rust and all the internals of uv.)

potiuk avatar Aug 21 '24 08:08 potiuk

Yeah we def need to fix this.

charliermarsh avatar Aug 21 '24 16:08 charliermarsh