Enable hardlinking when storing/retrieving files to/from SCons cache

Open igorsol opened this issue 6 years ago • 3 comments

This is actually a feature request. I use the SCons cache functionality (--cache=nolinked --cache-dir= options). It works fine, but I would like an option to enable hardlinking of files stored in the cache, in both the store and retrieve cases. Of course this option should be ignored if the cachedir is on another filesystem, but it can be extremely useful when the cachedir is on the same filesystem (as in my case). Such a feature would give us two benefits:

  • reduce disk space usage
  • reduce the time required to copy files to/from the cachedir
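
A minimal sketch of the requested behavior, assuming a hypothetical helper named `link_or_copy` (it is not part of SCons): try a hard link first and fall back to a plain copy when linking fails, for example because the cachedir is on another filesystem.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Place src at dst, preferring a hard link and falling back to a copy.

    Hypothetical helper for illustration only. Returns "hardlink" or "copy".
    """
    try:
        os.link(src, dst)  # raises OSError (e.g. EXDEV) across filesystems
        return "hardlink"
    except OSError:
        shutil.copy2(src, dst)  # independent copy, metadata preserved
        return "copy"


def demo():
    """Store a fake cache entry and 'retrieve' it into a build path."""
    with tempfile.TemporaryDirectory() as tmp:
        cached = os.path.join(tmp, "cache_entry")
        target = os.path.join(tmp, "foo.o")
        with open(cached, "wb") as f:
            f.write(b"object code")
        method = link_or_copy(cached, target)
        # True only when both names refer to the same inode.
        shared = os.path.samefile(cached, target)
        return method, shared


if __name__ == "__main__":
    print(demo())
```

Note that this sketch also falls back to copying (and overwrites) when `dst` already exists, since `os.link` raises `FileExistsError` in that case; a real implementation would probably want to handle that separately.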

igorsol avatar Feb 25 '19 12:02 igorsol

Originated from Stack Overflow: https://stackoverflow.com/questions/54765660/does-scons-cache-supports-hard-links

dmoody256 avatar Feb 26 '19 03:02 dmoody256

This change would enable performance gains but requires special treatment of cases where a file is modified in-place during build. I don't think hard-linking of cached files can be implemented without quirky edge-case behavior, so it's not likely to be a viable default.

A legitimate in-place modification use case is defconfig <-> .config round-tripping in KConfig environments.

I can imagine NoCache becoming mandatory for such files to ensure correctness. An existing cache could be checked for consistency to provide hints about which files should be excluded.
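
To illustrate the hazard, here is a small sketch, assuming the cache records a content digest when an entry is stored (an assumption made for this example, not a description of how SCons names or verifies cache entries). An in-place edit of the hard-linked target changes the cached bytes as well, while writing a new file and renaming it over the target leaves the cache entry intact.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def cache_corrupted_after_edit(tmp, in_place):
    """Hard-link a cache entry into the build tree, edit it, report corruption."""
    cached = os.path.join(tmp, "cache_entry")
    target = os.path.join(tmp, ".config")
    with open(cached, "w") as f:
        f.write("CONFIG_A=y\n")
    recorded = sha256_of(cached)  # digest recorded at store time (assumed)
    os.link(cached, target)       # "retrieve" by hard link

    if in_place:
        # Appending writes through the shared inode, so the cache changes too.
        with open(target, "a") as f:
            f.write("CONFIG_B=y\n")
    else:
        # Write a new file and rename it over the target: the target name
        # now points at a new inode and the cache entry is untouched.
        new = target + ".new"
        with open(new, "w") as f:
            f.write("CONFIG_A=y\nCONFIG_B=y\n")
        os.replace(new, target)

    return sha256_of(cached) != recorded
```

A consistency check along these lines could flag entries whose digest no longer matches, which would hint at targets that are modified in place and are therefore candidates for NoCache.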

rico-chet avatar Dec 08 '24 13:12 rico-chet

It seems other caching solutions do support hardlinking (see ccache, for example) - with caveats. Compression is not allowed when hard links are used - SCons doesn't do compression of artifacts anyway so that's not a problem at the moment. Here's the snip from the manpage:

=======

hard_link (CCACHE_HARDLINK or CCACHE_NOHARDLINK, see Boolean values above)

If true, ccache will attempt to use hard links to store and fetch cached object files. The default is false.

Files stored via hard links cannot be compressed, so the cache size will likely be significantly larger if this option is enabled. However, performance may be improved depending on the use case.

Warning

Do not enable this option unless you are aware of these caveats:

  • If the resulting file is modified, the file in the cache will also be modified since they share content, which corrupts the cache entry. As of version 4.0, ccache makes stored and fetched object files read-only as a safety measure. Furthermore, a simple integrity check is made for cached object files by verifying that their sizes are correct. This means that mistakes like strip file.o or echo >file.o will be detected even if the object file is made writable, but a modification that doesn’t change the file size will not.

  • Programs that don’t expect that files from two different identical compilations are hard links to each other can fail.

  • Programs that rely on modification times (like make) can be confused if several users (or one user with several build trees) use the same cache directory. The reason for this is that the object files share i-nodes and therefore modification times. If file.o is in build tree A (hard-linked from the cache) and file.o then is produced by ccache in build tree B by hard-linking from the cache, the modification timestamp will be updated for file.o in build tree A as well. This can retrigger relinking in build tree A even though nothing really has changed.
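
The caveats above can be reproduced with a few lines of Python. This is only a sketch of the ideas described in the manpage (read-only entries, a size check, shared modification times), not ccache's actual implementation; all function names are made up for this example.

```python
import os
import stat
import tempfile


def make_read_only(path):
    """Clear all write bits; hard-linked names share the inode and its mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def size_matches(path, expected_size):
    """Simple integrity check: compare the current size with the stored one."""
    return os.path.getsize(path) == expected_size


def size_check_demo(tmp):
    """Return (same_size_edit_detected, truncation_detected)."""
    cached = os.path.join(tmp, "cached.o")
    fetched = os.path.join(tmp, "file.o")
    with open(cached, "wb") as f:
        f.write(b"ABCD")
    expected = os.path.getsize(cached)
    os.link(cached, fetched)
    make_read_only(cached)

    os.chmod(fetched, 0o644)           # someone makes the file writable again
    with open(fetched, "r+b") as f:
        f.write(b"WXYZ")               # overwrite with the same number of bytes
    same_size_detected = not size_matches(cached, expected)

    with open(fetched, "wb"):
        pass                           # truncate, like `echo >file.o`
    truncation_detected = not size_matches(cached, expected)
    return same_size_detected, truncation_detected


def shared_mtime_demo(tmp):
    """Touching one hard-linked name changes the mtime seen via the other."""
    a = os.path.join(tmp, "tree_a_file.o")
    b = os.path.join(tmp, "tree_b_file.o")
    with open(a, "wb") as f:
        f.write(b"obj")
    os.link(a, b)
    os.utime(b, (1_000_000_000, 1_000_000_000))
    return os.stat(a).st_mtime == os.stat(b).st_mtime
```

In this sketch the size check catches the truncation but not the same-size overwrite, which matches the limitation the manpage describes. The read-only bit only guards against accidental edits by ordinary users; it does not stop a privileged user or a tool that changes permissions first.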

mwichmann avatar Dec 08 '24 19:12 mwichmann