esy-issues
[MID PRI] Relocate artifacts from cache to local project.
Now that @IwanKaramazow has an example of a fast C string replacement implementation, we can more quickly relocate artifacts that could include paths in them. Right now we use Python, and that has some issues:
- It's slower.
- It requires python.
Currently, we only rewrite artifacts to achieve atomicity: the build runs in a holding cell, _insttmp, and its output is moved to _install, but only on success. The rewrite makes sure that binaries moved into the global cache's _install directory don't still include paths from the global cache's _insttmp directory.
But we can also move artifacts from the global cache's _install directory to the local project's _install directory! This has many benefits:
- People can completely purge their cache without having to orphan global installs.
- It will enable support for BuckleScript, which expects its artifacts to be in a very specific location, and cannot yet be configured to look in another location.
Just posting some insights from @andreypopp:
"But we can also move artifacts from the global cache's _install directory to the local project's _install directory!"
> That may be fragile. You need to account for the null byte.
> Currently the only replacement happens from _insttmp to _install (same-length paths). Not sure how this is going to work with paths of different lengths (you don't want to corrupt binaries!)
> If len(src) > len(repl) you can just leave padding, but what if it's the other way around?
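To make the null-byte concern concrete, here is a minimal Python sketch of the "leave padding" case, where the replacement path is no longer than the original. It assumes paths are embedded in binaries as NUL-terminated C strings; the function name `replace_padded` is hypothetical and this is not the replacement tool mentioned above. The longer-replacement case is rejected rather than handled.

```python
def replace_padded(data: bytes, src: bytes, dst: bytes) -> bytes:
    """Replace src with dst while keeping len(data) unchanged.

    Assumption: each path sits inside a NUL-terminated string. The string
    containing a match is rewritten, and the bytes freed by the shorter
    path are filled with NULs at the end of that string, so every later
    offset in the file stays where it was.
    """
    if len(dst) > len(src):
        raise ValueError("replacement is longer than the original path")
    out = bytearray()
    pos = 0
    while True:
        hit = data.find(src, pos)
        if hit == -1:
            out += data[pos:]
            return bytes(out)
        # Find the terminator of the string that contains the match.
        # If there is none (e.g. a plain text file), pad at end of data.
        end = data.find(b"\0", hit)
        if end == -1:
            end = len(data)
        segment = data[hit:end]
        rewritten = segment.replace(src, dst)
        out += data[pos:hit]
        out += rewritten
        out += b"\0" * (len(segment) - len(rewritten))
        pos = end
```

A text file with no NUL terminator would end up with trailing NULs under this sketch, so a real tool would probably treat text and binary artifacts differently.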
I was thinking that we could ensure that the store was something like:
~/.esy/store/_install__________________________really_long -> ~/.esy/store/_install
~/.esy/store/_insttmp__________________________really_long -> ~/.esy/store/_insttmp
~/.esy/store/_install
~/.esy/store/_insttmp
Where we always build into the really long directories, and only provide short versions as symlinks for convenience. Then we know that we can relocate to any destination, as long as we create the padded length at the destination site. For example, perhaps you might end up relocating artifacts from the cache to:
└── node_modules/
    └── myDep/
        ├── _install/ -> ./_install_____just_enough_padding___/
        └── _install_____just_enough_padding___/
            └── bin/app.exe
Also, we only need to pad like this when we detect that a relocation was actually necessary. If no string replacement occurred, then no funny directory padding needs to take place.
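The padding idea can be sketched in a few lines of Python. The helper name `padded_install_dir`, the `_` padding character, and the fixed target length of 64 bytes are all assumptions for illustration; the point is only that once both the store path and the destination path are padded to the same byte length, a plain same-length replacement is enough.

```python
import posixpath

def padded_install_dir(parent: str, target_len: int, name: str = "_install") -> str:
    """Return parent/name, padded with '_' so the whole path is target_len bytes.

    Length is measured on the UTF-8 encoded path (an assumption), since
    that is what ends up embedded in the artifacts.
    """
    base = posixpath.join(parent, name)
    missing = target_len - len(base.encode("utf-8"))
    if missing < 0:
        raise ValueError(f"{base!r} is already longer than {target_len} bytes")
    return base + "_" * missing

# Both sides padded to the same (hypothetical) length of 64 bytes.
store_dir = padded_install_dir("/home/u/.esy/store", 64)
local_dir = padded_install_dir("/proj/node_modules/myDep", 64)

# Equal-length prefixes mean a plain replace cannot change the file size.
artifact = b"BIN:" + store_dir.encode("utf-8") + b"/bin/app.exe" + b"\0"
relocated = artifact.replace(store_dir.encode("utf-8"), local_dir.encode("utf-8"))
```

The short `_install` name would then be a symlink to the padded directory, as in the tree above.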
Another way to smooth this process out is to detect up front whether or not the paths will provide enough padding, and then not use the cache at all when:
- The paths do not allow relocating due to length.
- And some artifact needs to be relocated as part of the build process.
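The two conditions above amount to a small decision function. This is only a sketch: the name `plan_cache_use`, the returned labels, and the rule that "enough padding" means the destination prefix is no longer than the cache prefix are assumptions, not existing esy behavior.

```python
def plan_cache_use(cache_prefix: str, dest_prefix: str, needs_relocation: bool) -> str:
    """Decide up front whether a build may go through the global cache."""
    if not needs_relocation:
        # No artifact embeds the cache path, so path lengths are irrelevant.
        return "use-cache"
    # Compare encoded lengths, since bytes are what get rewritten.
    fits = len(dest_prefix.encode("utf-8")) <= len(cache_prefix.encode("utf-8"))
    return "use-cache-and-relocate" if fits else "build-without-cache"
```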
I imagine we can incrementally work our way to the ideal workflow, even if as a first step we just refuse to build or something.
Clarifying some requirements:
- Robust implementation as described, ideally using symlinks to create a "nice" version of every relocated artifact.
- Works with global installs, or local installs.
- Can be enabled or disabled (many would disable because who wants to pay the performance cost of copying things over when you don't have to).
- Works with various length destination paths, or cache origin paths.
- Works with paths with spaces in them, or other weird unicode things.
- Nothing fundamentally cross-platform limiting (although Windows support can come later, as long as we don't build ourselves into a corner that is fundamentally at odds with Windows).
- It's okay to only support paths up to x length, as long as there's a clear error message that says why a particular artifact can't be relocated, and as long as that error occurs only if some artifact even had to be relocated in the first place.
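As an illustration of the last requirement, the check below stays silent for artifacts that never mention the cache path and raises a descriptive error only when a rewrite is needed but the destination does not fit. `RelocationError` and `check_relocatable` are hypothetical names, and measuring paths as UTF-8 bytes (so spaces and non-ASCII characters are counted correctly) is an assumption.

```python
class RelocationError(Exception):
    """Raised when an artifact needs relocating but the destination is too long."""

def check_relocatable(artifact: str, data: bytes, src: str, dst: str) -> bool:
    """Return True if data must be rewritten, False if it has no src paths."""
    src_b = src.encode("utf-8")
    dst_b = dst.encode("utf-8")
    if src_b not in data:
        # Nothing to relocate, so the length limit does not apply.
        return False
    if len(dst_b) > len(src_b):
        raise RelocationError(
            f"cannot relocate {artifact}: destination prefix {dst!r} is "
            f"{len(dst_b)} bytes, but the artifact only leaves room for "
            f"{len(src_b)} bytes"
        )
    return True
```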
Downgrading to mid-pri, not because it's not important but because:
- Getting a fast x-plat relocator is more important than using it to relocate all things to node_modules after install.
- The primary use case is to support BuckleScript itself, but you could imagine BuckleScript development continuing on top of non-esy for a while, and only using esy for native/opam development.
Still important, but just slightly behind all the other stuff marked HIGH pri.